2 Replies Latest reply on Jul 23, 2010 4:33 PM by ravikeshri

    Number of Work-items to get best performance

    ravikeshri

      Hi,

      How do we decide the total number of global and local work-items to get the best performance from an OpenCL kernel? Does it depend on the total number of processing elements in the GPU?

      Thanks,

      Ravi

        • Number of Work-items to get best performance
          genaganna


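                      Make sure the local work-group size is a multiple of the wavefront size, i.e. localWorkGroupSize = n * WavefrontSize, with n being 3 or more.

                      Make sure the global work size is a multiple of the local work-group size times the number of compute units, i.e. globalWorkSize = localWorkGroupSize * k * (number of compute units on the device), with k being 2 or more.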
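
          The two rules above can be sketched as plain arithmetic in C. This is only an illustration, not code from the SDK: the helper names (`round_up`, `pick_local_size`, `pick_global_size`) are hypothetical, and the wavefront size, compute-unit count, and maximum work-group size are passed in as parameters. In a real host program they would presumably be queried from the device, for example with `clGetDeviceInfo` (`CL_DEVICE_MAX_COMPUTE_UNITS`, `CL_DEVICE_MAX_WORK_GROUP_SIZE`).

```c
#include <assert.h>
#include <stddef.h>

/* Round n up to the next multiple of m. */
size_t round_up(size_t n, size_t m) {
    assert(m > 0);
    return ((n + m - 1) / m) * m;
}

/* Hypothetical helper: local size = waves_per_group * wavefront_size,
 * reduced to the largest wavefront multiple that fits the device limit.
 * Assumes max_work_group_size >= wavefront_size. */
size_t pick_local_size(size_t wavefront_size, size_t waves_per_group,
                       size_t max_work_group_size) {
    size_t local = wavefront_size * waves_per_group;
    if (local > max_work_group_size) {
        local = (max_work_group_size / wavefront_size) * wavefront_size;
    }
    return local;
}

/* Hypothetical helper: pad the problem size up to a multiple of
 * local_size * groups_per_cu * compute_units. */
size_t pick_global_size(size_t problem_size, size_t local_size,
                        size_t groups_per_cu, size_t compute_units) {
    size_t granule = local_size * groups_per_cu * compute_units;
    return round_up(problem_size, granule);
}
```

          For example, assuming a wavefront size of 64, a maximum work-group size of 256, and 20 compute units, `pick_local_size(64, 4, 256)` gives 256, and `pick_global_size(1000000, 256, 2, 20)` pads one million items up to 1003520. Because the global size is padded, the kernel needs a bounds check (compare `get_global_id(0)` against the real item count) so that the extra work-items do nothing.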

           

          For more details, read Chapter 4 of ATI_Stream_SDK_Programming_Guide.pdf.