Is there any way to have write-shared (zero-copy) memory that is accessible from the CPU and the GPU at the same time, so that both can issue memory fences DURING GPU kernel execution?
In other words, can I write OpenCL code in which the CPU writes data and the GPU reads it consistently WITHOUT having to stop and restart the GPU kernel?
I didn't find any such functionality documented in OpenCL, but maybe on APUs it's possible with lower-level non-OpenCL APIs?
Similarly, is there any way to make sure that data transferred into GPU memory via OpenCL memory transfer calls can be read consistently by a GPU kernel, even if the transfer ran in parallel with the kernel's execution? Again, on APUs it might work if the GPU snoops the memory bus. But does it?
Any pointers or examples would be very much appreciated!
I think this is impossible at present, though it may become possible in the future; this is only speculation on my part. The attachment is an introduction to HSA (Heterogeneous System Architecture), and HSAIL may make this possible.
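To make concrete what kind of consistency is being asked for, here is a minimal CPU-only sketch of the handoff pattern, with two ordinary threads standing in for the host and the running kernel. It is not OpenCL code and says nothing about whether any OpenCL implementation provides this guarantee between a host and an in-flight kernel. The names `SharedRegion`, `produce`, `consume`, and `run_handoff` are hypothetical and chosen only for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical shared region: a payload plus a "ready" flag. It stands in
// for a zero-copy buffer visible to both the host and a running kernel.
struct SharedRegion {
    std::vector<int> payload;
    std::atomic<int> ready{0};
    explicit SharedRegion(std::size_t n) : payload(n, 0) {}
};

// Producer (the "CPU" side): write the data, then publish it with a
// release store so the payload writes become visible before the flag.
void produce(SharedRegion& r, int value) {
    for (auto& x : r.payload) x = value;
    r.ready.store(1, std::memory_order_release);
}

// Consumer (stand-in for the long-running "GPU kernel"): poll the flag with
// an acquire load, and read the payload only after the flag is observed.
long consume(SharedRegion& r) {
    while (r.ready.load(std::memory_order_acquire) == 0) {
        std::this_thread::yield();
    }
    long sum = 0;
    for (int x : r.payload) sum += x;
    return sum;
}

// Run both sides concurrently; the consumer is never stopped or restarted.
long run_handoff(std::size_t n, int value) {
    SharedRegion region(n);
    long sum = 0;
    std::thread consumer([&] { sum = consume(region); });
    std::thread producer([&] { produce(region, value); });
    producer.join();
    consumer.join();
    assert(region.ready.load() == 1);  // the flag was published
    return sum;
}
```

Calling `run_handoff(1024, 3)` returns 3072 on a conforming C++11 implementation, because the release/acquire pair orders the payload writes before the consumer's reads. The open question in this thread is whether an equivalent ordering can be obtained when the consumer is a GPU kernel rather than a CPU thread.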