Concurrent kernel execution and OpenCL device partition

Question asked by acekiller on May 15, 2013
I need to run some experiments that launch multiple different kernels on AMD hardware. Before I start coding I have a few questions, and I would really appreciate your help.


First, I am not sure whether AMD hardware supports concurrent kernel execution on a single device. The OpenCL spec says a command queue can be created as in-order or out-of-order, but I am not sure that "out-of-order" means "concurrent execution". Does anyone have information about this? My hardware is an AMD A8-3870K APU. If this processor does not support it, do any other AMD products?


Second, I know there is a "device fission" extension that can be used to partition one device into multiple sub-devices. It currently works only on the CPU. But in the OpenCL spec I also saw clCreateSubDevices, which is likewise used to partition a device. Is there any difference between these two techniques? My understanding is that device fission can only be used on the CPU, while clCreateSubDevices can be used on both the CPU and the GPU. Is that correct?


Thanks in advance for any replies!