Hi,
I was trying to overlap data transfer and kernel execution on my Radeon HD 5970 to hide the overhead of data transfer. I therefore created two separate queues (one for data transfer and one for kernel execution) and used events to synchronize both.
However, I couldn't see any overlap when I looked at the event profiling information.
According to the ATI Stream Programming Guide it should be possible to do data transfer and GPU computation in parallel. Has anyone ever managed to achieve this?
Cheers, Dominik
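For anyone wanting to run the same check: a minimal sketch of how overlap could be measured from event profiling data. It assumes both queues were created with CL_QUEUE_PROFILING_ENABLE and that the start/end timestamps (in nanoseconds) were already read with clGetEventProfilingInfo using CL_PROFILING_COMMAND_START and CL_PROFILING_COMMAND_END. The names cmd_interval and overlap_ns are made up for illustration and are not part of OpenCL or the SDK.

```c
#include <stdint.h>

/* Hypothetical helper type: start/end device timestamps of one command,
   in nanoseconds, as read from clGetEventProfilingInfo
   (CL_PROFILING_COMMAND_START / CL_PROFILING_COMMAND_END). */
typedef struct {
    uint64_t start_ns;
    uint64_t end_ns;
} cmd_interval;

/* Hypothetical helper: number of nanoseconds during which both commands
   were executing at the same time. Returns 0 if they ran back to back
   or were fully serialized. */
uint64_t overlap_ns(cmd_interval a, cmd_interval b)
{
    uint64_t latest_start = a.start_ns > b.start_ns ? a.start_ns : b.start_ns;
    uint64_t earliest_end = a.end_ns < b.end_ns ? a.end_ns : b.end_ns;
    return earliest_end > latest_start ? earliest_end - latest_start : 0;
}
```

Passing the interval of the transfer event and the interval of the kernel event gives the overlapped time; a result of 0 for every transfer/kernel pair means the two were serialized, which matches what is described above.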
*bump*
Any thoughts on this? Does anyone have experience with overlapping data transfer and computation on ATI GPUs?
For that to work, DMA transfers have to be supported, and I don't think they are in OpenCL with SDK 2.1.
I just thought that, because the AMD OpenCL Programming Guide talks about DMA transfers that can occur concurrently with kernel executions, it would be possible with the current SDK.
Can anyone confirm whether or not DMA transfers are possible with the 2.1 SDK?
DMA does not work with the 2.1 SDK.
Thanks for this information.
Are you planning to support DMA in future releases?
Yes.