
concurrent execution of the wavefronts of 1 workgroup

Question asked by foomanchoo on Mar 7, 2013
Latest reply on Mar 7, 2013 by LeeHowes

good morning.


just to clarify: is executing 4 workgroups of 1 wavefront each per CU going to be as fast as executing 1 workgroup of 256 work items (ignoring the minor overhead associated with workgroup scheduling)?

i.e. is it ok to run 1 workgroup of size 256 per CU?
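
to make the comparison concrete, here is a minimal sketch of the arithmetic behind the question. it assumes a wavefront is 64 work items (implied by 256 work items being compared to 4 wavefronts); the function name `wavefronts_per_cu` is made up for illustration and is not part of any API. it only counts wavefronts and says nothing about how the hardware actually schedules them.

```python
import math

# assumption: 64 work items per wavefront, as implied by the 256 / 4 comparison
WAVEFRONT_SIZE = 64


def wavefronts_per_cu(workgroups_per_cu, workgroup_size, wavefront_size=WAVEFRONT_SIZE):
    """Count the wavefronts resident on one CU for a given launch shape."""
    # a partially filled wavefront still occupies a whole wavefront slot
    wavefronts_per_workgroup = math.ceil(workgroup_size / wavefront_size)
    return workgroups_per_cu * wavefronts_per_workgroup


# 4 workgroups of 1 wavefront each vs. 1 workgroup of 256 work items
four_small = wavefronts_per_cu(workgroups_per_cu=4, workgroup_size=64)
one_large = wavefronts_per_cu(workgroups_per_cu=1, workgroup_size=256)
print(four_small, one_large)
```

both launch shapes put the same number of wavefronts on the CU, which is why i would expect them to run equally fast apart from scheduling overhead.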


and if yes: if 4 wavefronts work on the same 64 data paths while sharing intermediate results via LDS, will the work on those 64 data paths take 1/4 of the time compared to working on 256 data paths with one thread doing all the work per data path, instead of 4 threads coordinating per data path via LDS? (ignoring the overhead that would result from wavefronts not reaching barriers at exactly the same time).
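
the 1/4 figure comes from an idealized model, sketched below. it assumes the work on one data path splits evenly across the cooperating threads and that LDS traffic and barriers cost nothing, which is exactly the overhead the question sets aside. `WORK_PER_PATH` is an arbitrary number and `ideal_steps_per_thread` is a hypothetical helper, not a measurement of real hardware.

```python
def ideal_steps_per_thread(work_per_path, threads_per_path):
    """Steps each thread executes if a path's work splits evenly with no sync cost."""
    return work_per_path / threads_per_path


THREADS = 256         # 4 wavefronts of 64 work items (assumed wavefront size)
WORK_PER_PATH = 1000  # arbitrary amount of work per data path

# layout A: 64 data paths, 4 threads cooperating on each path via LDS
paths_a = THREADS // 4
time_a = ideal_steps_per_thread(WORK_PER_PATH, threads_per_path=4)

# layout B: 256 data paths, 1 thread doing all the work on its path
paths_b = THREADS // 1
time_b = ideal_steps_per_thread(WORK_PER_PATH, threads_per_path=1)

# ratio of time to finish one batch of paths in layout A vs. layout B
print(time_a / time_b)
```

under these assumptions layout A finishes its 64 data paths in a quarter of the time layout B needs for its 256, so the total work done per unit of time is the same in both; only the latency per batch changes.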


also, the wavefronts would be executing barrier instructions in synchronization with other threads, but the wavefronts would diverge, so the barriers would not be at the same address in the instruction stream. is that legal?


thanks for helping.