I have read the post which said that this is a driver issue.
We have a scenario in which kernels need to process large amounts of data that reside in host memory.
How can we tell whether a memory transfer runs in parallel with kernel execution (via DMA), or whether it goes through the command queue and is therefore serialized?
If there is currently no DMA support, when can we expect this to be fixed?
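One way to check this yourself is to compare the profiling timestamps of the write and the kernel. Below is a minimal sketch of that comparison in plain C. It assumes you have already read the start and end times of both events with clGetEventProfilingInfo (CL_PROFILING_COMMAND_START and CL_PROFILING_COMMAND_END) on queues created with CL_QUEUE_PROFILING_ENABLE, and that both events come from the same device so the timestamps are comparable. The cmd_span type and overlap_ns function are hypothetical helpers for illustration, not part of OpenCL.

```c
#include <stdint.h>

/* Start/end timestamps of one command, in nanoseconds.
 * Hypothetical helper type: in a real host program the two values would be
 * filled from clGetEventProfilingInfo(CL_PROFILING_COMMAND_START / _END). */
typedef struct {
    uint64_t start_ns;
    uint64_t end_ns;
} cmd_span;

/* Returns how many nanoseconds the two commands were executing at the
 * same time. A result of 0 means they did not overlap at all, i.e. the
 * transfer and the kernel were serialized. */
uint64_t overlap_ns(cmd_span a, cmd_span b)
{
    uint64_t latest_start = a.start_ns > b.start_ns ? a.start_ns : b.start_ns;
    uint64_t earliest_end = a.end_ns < b.end_ns ? a.end_ns : b.end_ns;
    return earliest_end > latest_start ? earliest_end - latest_start : 0;
}
```

If overlap_ns stays at 0 for a write and a kernel that were enqueued on two different queues with no event dependency between them, the transfer was most likely not overlapped with execution.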
clEnqueueWriteBuffer is the API call that writes data into a buffer. However, it currently does not use DMA, so memory transfers and kernel execution are serialized.
With DMA it would be possible to run them in parallel.
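To get a feel for what that overlap would buy, here is a rough back-of-the-envelope model in C. It is only a sketch under simplifying assumptions that are not from this thread: the data is split into equal chunks, every chunk has the same copy time and kernel time, there is a single copy engine, and reading results back is ignored. The function names are made up for illustration.

```c
#include <stdint.h>

/* Total time when each chunk is copied and then processed, one after the
 * other, with no overlap (the serialized case). */
uint64_t serialized_time(uint64_t chunks, uint64_t copy, uint64_t kernel)
{
    return chunks * (copy + kernel);
}

/* Total time when the copy of chunk i+1 runs while the kernel for chunk i
 * executes (a two-stage pipeline). The first copy and the last kernel
 * cannot be hidden; in between, each step costs the slower of the two. */
uint64_t overlapped_time(uint64_t chunks, uint64_t copy, uint64_t kernel)
{
    if (chunks == 0)
        return 0;
    uint64_t slower = copy > kernel ? copy : kernel;
    return copy + (chunks - 1) * slower + kernel;
}
```

For example, with 4 chunks, a copy time of 10 and a kernel time of 30 (any consistent time unit), the serialized total is 160 and the overlapped total is 130. The gain is largest when copy time and kernel time are about equal.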
Do you mean that this feature is already available in the hardware and just not supported by the drivers? Can we expect to see DMA enabled for the HD 5870, etc., in a future release?
It seems that the new HD6900 cards have asynchronous kernel dispatch and dual bidirectional DMA engines. How much of an improvement is that over the pre-HD6900 cards?