Archives Discussions

probing · ‎07-26-2010

My current implementation creates two queues for a single GPU device.

One queue is only for memory transfer calls such as:

clEnqueueReadBuffer, clEnqueueWriteBuffer, or clEnqueueCopyBuffer.

The other queue is only for GPU compute calls such as clEnqueueNDRangeKernel.

I sync these two queues using shared cl_event objects when necessary.

But in my tests, this does not make transfer and compute run concurrently. Does that mean clEnqueueCopyBuffer and clEnqueueNDRangeKernel will execute serially even when they are on different queues?

omkaranathan · ‎07-28-2010

Concurrent memory transfer and kernel execution can also happen with a single queue. For this to happen, DMA must be enabled, which is not the case in the current implementation.

probing · ‎08-02-2010

Thanks. Is there any plan to enable DMA in a future version?

Originally posted by: omkaranathan Concurrent memory transfer and kernel execution can also happen with a single queue. For this to happen, DMA must be enabled, which is not the case in the current implementation.

omkaranathan · ‎08-02-2010

Yes, we are working on it.

ryta1203 · ‎08-04-2010

Originally posted by: omkaranathan Concurrent memory transfer and kernel execution can also happen with a single queue. For this to happen, DMA must be enabled, which is not the case in the current implementation.

So AMD's OpenCL compiler doesn't allow for async transfer? Odd.

MicahVillmow · ‎08-04-2010

ryta,
I don't think he is referring to the async_copy functions, but to EnqueueRead/WriteBuffer.

ryta1203 · ‎08-05-2010

Micah,

From the little OpenCL I've done, I thought that these routines would be async unless you waited on them? I assume this is not correct. Thanks, good to know if I want to try to pipeline my data/execution.
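To make concrete what pipelining would buy if copies and kernels could overlap, here is a rough timing model. It is only a sketch under stated assumptions: the work is split into equal chunks, every copy and every kernel takes a fixed time, and the copy of one chunk can fully overlap the kernel on the previous chunk. The function names are hypothetical and are not part of OpenCL; nothing here measures real hardware.

```c
#include <assert.h>

/* Hypothetical timing model; these names are illustrative, not OpenCL calls. */

/* Total time when each chunk is copied and then computed, with no overlap. */
double serial_time(int chunks, double copy_ms, double kernel_ms)
{
    assert(copy_ms >= 0.0 && kernel_ms >= 0.0);
    if (chunks <= 0)
        return 0.0;
    return chunks * (copy_ms + kernel_ms);
}

/* Total time when the copy of chunk i+1 overlaps the kernel on chunk i.
 * Assumption: a two-stage pipeline whose steady-state rate is set by the
 * slower of the two stages. */
double pipelined_time(int chunks, double copy_ms, double kernel_ms)
{
    assert(copy_ms >= 0.0 && kernel_ms >= 0.0);
    if (chunks <= 0)
        return 0.0;
    double slower = copy_ms > kernel_ms ? copy_ms : kernel_ms;
    /* First copy cannot be hidden, last kernel cannot be hidden. */
    return copy_ms + (chunks - 1) * slower + kernel_ms;
}
```

For example, with 4 chunks, a 2 ms copy, and a 3 ms kernel, the model gives 20 ms without overlap and 14 ms with overlap. With a single chunk there is nothing to overlap, so both functions return the same value.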

probing · ‎08-06-2010

Originally posted by: MicahVillmow ryta, I don't think he is referring to the async_copy functions, but to EnqueueRead/WriteBuffer.

Micah, do you mean that a clEnqueueCopyBuffer command can run concurrently with a clEnqueueNDRangeKernel command, while clEnqueueReadBuffer cannot?

Thanks

MicahVillmow · ‎08-05-2010

On architectures that support async data copies, they will be asynchronous, otherwise they will not be.
7XX and Evergreen hardware does not support async kernel copies.
Micah

ryta1203 · ‎08-06-2010

Originally posted by: MicahVillmow On architectures that support async data copies, they will be asynchronous, otherwise they will not be. 7XX and Evergreen hardware does not support async kernel copies. Micah

Interesting. So essentially, on all your "current" hardware there is no way to do an async data transfer, and no "pipelining" can occur?

I thought this was possible in CAL (though I've never tried it) through DMA?

I'm confused, sorry, lol.

MicahVillmow · ‎08-06-2010

ryta,
I'm talking about the kernel functions async_copy_*, not the OpenCL API calls for Enqueue commands. Doing those asynchronously is possible.

ryta1203 · ‎08-06-2010

Ok, thought so, thanks.

How can I make memory transfer and GPU computing concurrent