Kaveri has arrived, and as we all know, Kaveri and its predecessors share some resources within a module: two integer cores essentially share one floating-point pipeline. I am not sure if this was done to save die space, or if part of the thinking was that the GPU could take over some of those floating-point operations. If the latter was part of the thinking, I am wondering how one could instruct the GPU to do these floating-point operations, even if they are just a bunch of one-offs. I should say that I am aware of OpenCL 1.2. OpenCL 1.2 is not an option because:
- Much more overhead would be created by spawning off a task, copying the data to the GPU, collecting the data back, and then closing off the task.
Is there currently a low-overhead way to say "GPU, you take this task"? Kaveri, after all, now has the same access to memory as the CPU.
The other thing I am concerned about is how I can do such a thing and still keep the software running universally on non-HSA-enabled hardware.
Well, Kaveri is the first APU to have a unified memory space between the CPU and GPU. That means you don't need to copy data between the CPU and GPU, which is the main overhead in GPGPU computing. Starting the task is not that much overhead if the data is already in place. I think we will need to wait for the next version of the AMD APP SDK to get proper support for the HSA platform.