How do branches affect the work-items in one wavefront?

Question asked by acekiller on Mar 27, 2014
Latest reply on Mar 28, 2014 by realhet

In OpenCL, a wavefront containing 64 work-items is scheduled at a time. Since all work-items execute in lock-step, if even one work-item is delayed (by a cache miss or something else), all the other work-items have to wait for it. What confuses me is this: in the actual scheduling process, a quarter of the wavefront (i.e. 16 work-items) is issued to the GPU cores in one cycle, so the whole wavefront is executed over 4 consecutive cycles.

1) If one work-item in the first quarter is delayed, will all three other quarters be delayed too?

2) If only one work-item in the second quarter is delayed, will the first quarter be unaffected while the 3rd and 4th quarters are delayed?



Is that true on AMD GPUs?
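
To make the numbers concrete, here is a minimal sketch of the issue model as I described it above (64 work-items, 16 issued per cycle, 4 consecutive cycles). It only encodes my description, not confirmed hardware behaviour, and the names WAVEFRONT_SIZE, LANES_PER_CYCLE, quarter_of and issue_schedule are made up for illustration:

```python
# Toy model of the scheduling described in the question.
# Assumption: the sizes below come from the question text, not from a hardware spec.
WAVEFRONT_SIZE = 64   # work-items per wavefront
LANES_PER_CYCLE = 16  # work-items issued in one cycle (one quarter)


def quarter_of(lane):
    """Return which quarter (0..3) of the wavefront a work-item lane belongs to."""
    if not 0 <= lane < WAVEFRONT_SIZE:
        raise ValueError("lane out of range")
    return lane // LANES_PER_CYCLE


def issue_schedule():
    """Return, for each consecutive cycle, the list of lanes issued in that cycle."""
    return [
        list(range(start, start + LANES_PER_CYCLE))
        for start in range(0, WAVEFRONT_SIZE, LANES_PER_CYCLE)
    ]


if __name__ == "__main__":
    for cycle, lanes in enumerate(issue_schedule()):
        print(f"cycle {cycle}: lanes {lanes[0]}..{lanes[-1]}")
```

So a delayed work-item at lane 17 would sit in the second quarter (quarter_of(17) is 1), which is the case I am asking about in 2).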