Can the GPU execute a kernel and read/write a buffer at the same time?

Suppose I call clEnqueueReadBuffer and clEnqueueNDRangeKernel, neither of them waits on any events, and the read/write is non-blocking.

My question is: will the AMD GPU hardware execute both commands at the same time, or sequentially, one after the other?

4 Replies

AMD hardware will perform the copy operation and the kernel execution simultaneously, provided the following conditions are met:

1.) OpenCL SDK 2.6 or above.

2.) Set the environment variable GPU_ASYNC_MEM_COPY=2. (The feature is currently in preview mode.)

3.) The copy and kernel execution commands must be enqueued on different OpenCL queues.
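Condition 2 is normally satisfied by exporting the variable in the shell before launching the program. As a rough sketch, it can also be set from inside the process. This assumes the runtime reads the variable when it initializes, so the call has to happen before the first OpenCL API call. The helper name enable_async_mem_copy is hypothetical, and setenv is POSIX (not available as-is on Windows).

```c
#define _POSIX_C_SOURCE 200112L  /* expose setenv() under strict C modes */
#include <stdlib.h>

/* Hypothetical helper: request the preview async-copy feature for this
 * process. Assumption: the OpenCL runtime reads GPU_ASYNC_MEM_COPY at
 * initialization, so call this before any OpenCL API call.
 * Returns 0 on success, -1 on failure (as setenv does). */
static int enable_async_mem_copy(void)
{
    /* Third argument 1: overwrite the value if it is already set. */
    return setenv("GPU_ASYNC_MEM_COPY", "2", 1);
}
```

Call enable_async_mem_copy() at the top of main, then create the context and the two command queues afterwards.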

Can events in different OpenCL queues be synchronized?

My problem is this:

a GPU kernel function A

a GPU kernel function B that must wait for A to complete

a read-buffer function C that reads the result from A

The code structure is something like this:

for (...; ...; ...) {
    kernel function A;
    kernel function B;         // B must wait for A
    C = read buffer from A;    // C must wait for A; I would like C and B to execute together
    if (C == ...) {            // the host decides whether to continue the loop or break
        break;
    }
    // in the next iteration, A must wait for the previous iteration to complete
}

How should this code be written using different OpenCL queues?

Another question: the execution time of A and B on the GPU is around 1 ms, but the time they spend waiting in the queue before executing is around 1 ms too. I flush the queue, but it does not help. How can I write my program so that it performs better?


Pass the event returned from enqueuing kernel A as a wait event when enqueuing B and C. All queues must also belong to the same context.


So B and C should be enqueued on different OpenCL queues, right?