
sajis997 · 03-29-2013

Hi forum,

I believe the OpenCL equivalent of a CUDA block is the work-group. In CUDA we have to define the block size explicitly, and I just heard in a lecture that in OpenCL we do not need to define the work-group size because the OpenCL runtime picks the optimal one itself. Is that really true?

I believe OpenCL still lets you define it explicitly.

How about some more discussion over this issue?

Regards

Sajjad

LeeHowes · 03-29-2013

Yes, that's roughly right. If you don't specify it yourself, you don't know what size the runtime will select, and so you may not know how much local memory to allocate for each work-group.
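To make the local memory point concrete, here is a minimal sketch in plain C. It assumes the kernel needs a fixed number of bytes of `__local` scratch per work-item; the helper name `local_scratch_bytes` and the argument index in the comment are hypothetical, not from the thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: bytes of __local scratch needed by ONE work-group,
 * assuming the kernel uses a fixed number of bytes per work-item. */
size_t local_scratch_bytes(size_t work_group_size, size_t bytes_per_item)
{
    assert(work_group_size > 0);
    return work_group_size * bytes_per_item;
}

/* Typical use on the host (argument index 1 is an assumption):
 *
 *   size_t bytes = local_scratch_bytes(64, sizeof(float));
 *   clSetKernelArg(kernel, 1, bytes, NULL);  // NULL value => __local buffer
 *
 * This only works when the work-group size (64 here) is chosen by you.
 * If local_work_size is passed as NULL at enqueue time, the runtime picks
 * the size, so the first argument above is not known up front.
 */
```

With a work-group of 64 and one 4-byte float per work-item, this comes to 256 bytes per work-group.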

On the other hand, in my experience 64 is almost always the right answer on AMD hardware. When the work-group is the same size as the hardware thread (the wavefront), you remove synchronization overhead and gain performance overall.
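One practical consequence of fixing the work-group size at 64: in OpenCL 1.x the global size has to be evenly divisible by the local size when you specify one, so the host usually pads it. A minimal sketch, with the hypothetical helper name `padded_global_size` and an assumed element count `n`:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: round n up to the next multiple of local_size,
 * so the global size is evenly divisible by the work-group size. */
size_t padded_global_size(size_t n, size_t local_size)
{
    assert(local_size > 0);
    return ((n + local_size - 1) / local_size) * local_size;
}

/* Typical use on the host (queue, kernel and n are assumed to exist):
 *
 *   size_t local  = 64;                          // one wavefront
 *   size_t global = padded_global_size(n, local);
 *   clEnqueueNDRangeKernel(queue, kernel, 1, NULL,
 *                          &global, &local, 0, NULL, NULL);
 *
 * Because global may now exceed n, the kernel should guard its body,
 * e.g. with: if (get_global_id(0) < n) { ... }
 */
```

For example, 1000 elements with a work-group size of 64 gives a padded global size of 1024, that is, 16 work-groups.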

himanshu_gautam · 03-30-2013

For all practical purposes, OpenCL programmers specify the work-group size when enqueuing a kernel, just as you do in CUDA.