
How does a branch affect the work-items in one wavefront?

Question asked by acekiller on Mar 27, 2014
Latest reply on Mar 28, 2014 by realhet

In OpenCL, a wavefront of 64 work-items is scheduled as a unit. Because all work-items execute in lock-step, if even one work-item is delayed (by a cache miss or anything else), all the other work-items have to wait for it. What confuses me is this: in the actual scheduling process, a quarter of the wavefront (i.e. 16 work-items) is issued to the GPU cores in one cycle, so the whole wavefront is executed over 4 consecutive cycles.

1) If one work-item in the first quarter is delayed, will all three other quarters be delayed as well?

2) If only one work-item in the second quarter is delayed, does that mean the first quarter is not delayed, but the 3rd and 4th quarters are?

Is that true on AMD GPUs?
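
To make the model I am describing concrete, here is a minimal Python sketch of it. It only encodes the assumptions stated in this question (64 work-items per wavefront, 16 issued per cycle, and the lock-step premise that the wavefront waits for its slowest work-item); it is not a description of how any real AMD hardware behaves. The names `quarter_of`, `issue_schedule`, and `wavefront_can_advance` are hypothetical helpers made up for illustration.

```python
WAVEFRONT_SIZE = 64  # work-items per wavefront, as stated in the question
QUARTER_SIZE = 16    # work-items issued per cycle, as stated in the question
NUM_QUARTERS = WAVEFRONT_SIZE // QUARTER_SIZE  # 4 cycles per instruction


def quarter_of(lane):
    """Return the quarter (0..3) that work-item `lane` (0..63) belongs to."""
    if not 0 <= lane < WAVEFRONT_SIZE:
        raise ValueError("lane must be in the range 0..63")
    return lane // QUARTER_SIZE


def issue_schedule():
    """Map each issue cycle (0..3) to the list of lanes issued in that cycle."""
    return {
        cycle: list(range(cycle * QUARTER_SIZE, (cycle + 1) * QUARTER_SIZE))
        for cycle in range(NUM_QUARTERS)
    }


def wavefront_can_advance(lane_ready):
    """Lock-step premise from the question (an assumption, not a hardware fact):
    the wavefront moves to its next instruction only when every lane is ready."""
    if len(lane_ready) != WAVEFRONT_SIZE:
        raise ValueError("expected one ready flag per work-item")
    return all(lane_ready)
```

Under this model, a delayed work-item 20 sits in quarter `quarter_of(20) == 1`, and `wavefront_can_advance` returns `False` for the whole wavefront as long as that one flag is `False`. My two questions are whether the stall really applies at whole-wavefront granularity like this, or per quarter.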
