Does wave / subgroup need synchronization for shared variables?




























I am wondering whether, within a single wave / subgroup (warp?), we need to call memoryBarrierShared and barrier to synchronize shared variables. On NVIDIA I think it is not necessary, but I do not know about other IHVs.



EDIT : ballot



Since I am talking about waves / subgroups, I mean this in the context of the ARB_shader_ballot extension.



Let's say we have the following code (1):

    shared uint s_data[128];
    uint tid = gl_GlobalInvocationID.x;
    // initialization of some s_data
    memoryBarrierShared();
    barrier();
    if(tid < gl_SubGroupSizeARB) {
        for(uint i = gl_SubGroupSizeARB; i > 0; i >>= 1)
            s_data[tid] += s_data[tid + i];
    }


I believe this code is not correct. The correct version, according to the spec, would be (2):



    if(tid < gl_SubGroupSizeARB) {
        for(uint i = gl_SubGroupSizeARB; i > 0; i >>= 1) {
            s_data[tid] += s_data[tid + i];
            memoryBarrierShared();
            barrier();
        }
    }


However, since invocations within a wave / subgroup run in parallel, the barrier call seems useless: the following version (3) should be correct as well, and faster than (2):



    if(tid < gl_SubGroupSizeARB) {
        for(uint i = gl_SubGroupSizeARB; i > 0; i >>= 1) {
            s_data[tid] += s_data[tid + i];
            memoryBarrierShared();
        }
    }


Since we supposedly do not need the barrier function, I wonder whether (1) is correct, which seems unlikely to me, and if not, whether (3) is correct (which would mean that my understanding is right).



EDIT: changed int to uint, and = to +=






























  • 1





    "According to me, this code is not correct." Well, what exactly is it supposed to do? I don't understand what your code is intended to accomplish. I have no idea what s_data is, what values it has, or what it is intended to eventually store. And since all versions of your code exhibit UB, it's not clear what is supposed to be happening here.

    – Nicol Bolas
    Jan 3 at 15:36













  • The idea of my code is to accomplish a reduction. (I wanted to write += instead of =). s_data is only "values". What UB does my code have?

    – Antoine Morrier
    Jan 3 at 15:55






  • 1





    In every case, you have invocations reading from memory that some other invocation will write to with no barriers between them to provide ordering/visibility. Even in your case 2, an invocation where tid == 1 will write to a variable that the tid == 0 invocation reads from. That's undefined behavior, whether shader_ballot exists or not.

    – Nicol Bolas
    Jan 3 at 16:00






  • 1





    @AntoineMorrier ARB_shader_ballot must define a group size, but that is not its purpose. shader_ballot makes no guarantees about the underlying architecture beyond the fact that ballotARB works if the vendor has implemented the extension. Unrolling the last warp works because all other warps are free to do other work within a Streaming Multiprocessor (NV specific), but it also relies on undefined behavior EVEN ON NVIDIA GPUS to carry out adding values simultaneously accumulated from each warp. (cont.)

    – opa
    Jan 3 at 16:34






  • 1





    @AntoineMorrier 1) Yes, the code is UB on NVIDIA GPUs. 2) Yes, I was talking about the code in the article. Either a __syncwarp extension should be provided for GLSL, or the synchronization should be built into other primitives provided by extension. For example, ballotARB may internally just be the __ballot_sync CUDA function on NVIDIA GPUs, which performs the ballot and syncs the warp, ensuring a safe result.

    – opa
    Jan 3 at 17:57


















opengl glsl vulkan














edited Jan 3 at 15:51







Antoine Morrier

















asked Jan 3 at 9:01









Antoine Morrier






















1 Answer














The execution model shared by OpenGL and Vulkan with regard to compute shaders does not really recognize the concept of a "wave". It has the concept of a work group, but that is not the same thing. A work group can be much bigger than a GPU "wave", and for small work groups, multiple work groups could be executing on the same GPU "wave".



As such, these specifications make no statements about how any of their functions behave with regard to a "wave" (with the exception of the shader ballot functions). So if you want synchronization that the standard says will work on all conforming implementations, you must call both memoryBarrierShared and barrier, as dictated by the standard.



Even with ARB_shader_ballot, the execution model of shaders is not modified. The extension only allows communication between the invocations of a subgroup, and only via the explicit mechanisms that it provides.



The execution model and memory model specify that shader invocations are unordered with respect to each other unless you explicitly order them with barriers.
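To make the barrier requirement concrete, here is a minimal sketch of a work-group-wide sum reduction that orders every step explicitly. It is not the asker's code: the buffer names in_data and out_data, their bindings, and the work group size of 128 are assumptions made for illustration, and the input buffer is assumed to hold at least one element per invocation. The loop bounds are the same for every invocation in the work group, so barrier() is reached in uniform control flow.

```glsl
#version 430

layout(local_size_x = 128) in;

// Hypothetical buffers, for illustration only.
layout(std430, binding = 0) readonly buffer Input { uint in_data[]; };
layout(std430, binding = 1) writeonly buffer Output { uint out_data[]; };

shared uint s_data[128];

void main() {
    // Shared memory is per work group, so index it with the local index.
    uint tid = gl_LocalInvocationIndex;

    s_data[tid] = in_data[gl_GlobalInvocationID.x];
    memoryBarrierShared();
    barrier();

    // Tree reduction. Every invocation executes every iteration, so all of
    // them reach each barrier(); only the addition itself is conditional.
    for (uint stride = gl_WorkGroupSize.x / 2u; stride > 0u; stride >>= 1u) {
        if (tid < stride) {
            // Reads come from [stride, 2 * stride), writes go to [0, stride),
            // so no invocation reads a slot another one writes in this step.
            s_data[tid] += s_data[tid + stride];
        }
        memoryBarrierShared();
        barrier();
    }

    // One partial sum per work group.
    if (tid == 0u) {
        out_data[gl_WorkGroupID.x] = s_data[0];
    }
}
```

Unlike versions (2) and (3) in the question, the barriers sit outside the `if`, and within one step the slots being read never overlap the slots being written.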
































  • I am talking about the shader ballot functions :). I am talking about it because I want to optimize my code by using this extension :)

    – Antoine Morrier
    Jan 3 at 14:25






  • 3





    @AntoineMorrier: No, you aren't. You mentioned shared variables, barrier and memoryBarrierShared. Nowhere did you bring up shader ballot stuff. So you should fix your question to ask about what you wanted to know about, preferably with some source code.

    – Nicol Bolas
    Jan 3 at 14:26











  • I edited the question and added some source code :)

    – Antoine Morrier
    Jan 3 at 14:52



















edited Jan 3 at 16:02

























answered Jan 3 at 14:21









Nicol Bolas












