We know how OpenCL concepts map to GPU hardware, but what about CPUs?
On the A10-7850K, for example, the "max work group size" is 1024. Does this number have any hardware meaning?
When one work-group stalls, is another work-group switched in? If so, how is that implemented?
Thanks in advance.