I am using OpenCL on my Apple MacBook Pro, which has a GPGPU-capable graphics card and a 2.66 GHz Intel Core 2 Duo, and I want to run OpenCL on both the CPU and the GPU. It works fine on the GPU, and also on the CPU except for one problem:
The work-group size returned by the OpenCL device query for the CPU is 1, which means there can be only one work-item per work-group (that is, one thread per thread block). So how can I implement, for example, a reduction, or the many other kernels that need more than one work-item per work-group, with the CPU OpenCL implementation? Please advise, as I could not find any help on this.
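To make the question concrete, here is a rough sketch of one reduction scheme that does not depend on the work-group size: each work-item sums a strided slice of the input on its own, and the partial sums are combined in a second pass. It is plain C that emulates the idea on the host, not actual OpenCL code, and the names `partial_sum` and `reduce_two_stage` are hypothetical, chosen only for illustration. Whether this is a sensible approach for the CPU device is an assumption on my part.

```c
#include <stddef.h>

/* Emulates the body of one work-item in a hypothetical partial-sum kernel:
   work-item `gid` out of `num_items` adds up elements
   gid, gid + num_items, gid + 2 * num_items, ... of the input. */
float partial_sum(const float *in, size_t n, size_t gid, size_t num_items)
{
    float acc = 0.0f;
    for (size_t i = gid; i < n; i += num_items)
        acc += in[i];
    return acc;
}

/* Stage 1: every work-item produces one partial sum on its own, so no
   local memory or barrier is needed and a work-group size of 1 is enough.
   Stage 2: the partial sums are added up serially.
   Here both stages run in one host-side loop for illustration. */
float reduce_two_stage(const float *in, size_t n, size_t num_items)
{
    float total = 0.0f;
    for (size_t gid = 0; gid < num_items; ++gid)
        total += partial_sum(in, n, gid, num_items);
    return total;
}
```

In a real kernel, `gid` would presumably come from `get_global_id(0)` and `num_items` from `get_global_size(0)`, with each partial sum written to an output buffer that the host (or a second kernel launch) adds up.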
Thanks in advance!