
ravikeshri
Journeyman III

Number of Work-items to get best performance

Hi,

Please let me know how to decide the total number of global and local work-items to get the best performance from an OpenCL kernel. Does it depend on the total number of processing elements in the GPU?

Thanks,

Ravi

genaganna
Journeyman III


Make sure the local work-group size is a multiple of the wavefront size, i.e. localWorkGroupSize = n * WavefrontSize, with n of 3 or more.

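Make sure the global work size is a multiple of the local work-group size and large enough to give every compute unit at least two work-groups, i.e. globalWorkSize = localWorkGroupSize * m * (number of compute units of the device), with m of 2 or more.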
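As a rough illustration of the two rules above, here is a small host-side sketch in plain C. The helper names (`round_up`, `pick_local_size`, `pick_global_size`) and their parameters are made up for this example and are not part of OpenCL or the SDK. In a real program the inputs would come from the runtime, for instance `CL_DEVICE_MAX_COMPUTE_UNITS` and `CL_DEVICE_MAX_WORK_GROUP_SIZE` via `clGetDeviceInfo`, and `CL_KERNEL_PREFERRED_WORK_GROUP_SIZE_MULTIPLE` via `clGetKernelWorkGroupInfo` as a stand-in for the wavefront size. Capping the local size at the device maximum is an assumption added here; the rules above do not mention it.

```c
#include <assert.h>
#include <stddef.h>

/* Round n up to the next multiple of m. */
static size_t round_up(size_t n, size_t m) {
    assert(m > 0);
    return ((n + m - 1) / m) * m;
}

/* Local size = waves_per_group * wavefront, reduced one wavefront at a
 * time until it fits the device limit (assumed cap, see the lead-in). */
static size_t pick_local_size(size_t wavefront, size_t waves_per_group,
                              size_t max_wg_size) {
    assert(wavefront > 0 && waves_per_group > 0);
    size_t local = wavefront * waves_per_group;
    while (local > max_wg_size && local > wavefront) {
        local -= wavefront;
    }
    return local;
}

/* Global size = a multiple of (local * compute_units) that covers all
 * work items and gives each compute unit at least min_groups_per_cu
 * work-groups. The result may exceed work_items, so the kernel needs a
 * bounds check such as: if (get_global_id(0) >= n) return; */
static size_t pick_global_size(size_t work_items, size_t local,
                               size_t compute_units,
                               size_t min_groups_per_cu) {
    size_t unit = local * compute_units;
    size_t global = round_up(work_items, unit);
    size_t minimum = unit * min_groups_per_cu;
    return global < minimum ? minimum : global;
}
```

For example, with a wavefront of 64, 4 wavefronts per group, a device limit of 256, and a hypothetical device with 20 compute units, `pick_local_size(64, 4, 256)` gives 256 and `pick_global_size(1000000, 256, 20, 2)` gives 1003520. These values are only a starting point; the best sizes still depend on the kernel's register and local memory usage, so it is worth timing a few candidates.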

 

For more details, read chapter 4 of ATI_Stream_SDK_Programming_Guide.pdf.

 

 


Thank you very much for the answer and for the reference, genaganna. I will go through the guide now; it looks very informative.

Thanks again!

Ravi
