I usually use OpenCL on a GPU. Now I want to run it on a CPU, but I have a question:
If the CPU has n compute units (CUs), what global size and local size are typically used?
In other words, how many work-items usually go in one work-group, and how many work-groups are there? For example, suppose the CPU has 4 CUs.