monoton
Journeyman III

Overlapping computation and memory transfers with OpenACC and the CAPS compiler

Hi,

I am currently working on overlapping my memory transfers (each transfer about 1 GB in size) with computation.

However, even when I use async memory transfers in OpenACC, the profiler shows that the OpenCL command queue executes in-order, blocking all other commands until the transfer is done. As a result, I cannot run computation concurrently (on another set of data previously brought into memory).

Is there a way to switch the queue to out-of-order execution, and would that resolve the issue? If not, how can I resolve it? I cannot fetch the command queue, as that is an OpenACC 2.0 feature that is not yet implemented in the latest compiler. Even if I could, I am not sure whether out-of-order execution is supported.

Is there a way to make out-of-order execution the default (preferably via an environment variable or something similar)? Is it supported by the GPU/runtime/SDK?

Is there another way to overlap the compute and memory transfers, if the above is not possible?

Thanks!

Best regards,

Olav

2 Replies
pinform
Staff

I am not sure whether you can make out-of-order execution the default, but overlapping compute and memory transfers might be possible. Can you post a simple test case?


Hi Olav,

Overlapping compute and data transfer is possible. You can refer to the AsyncDataTransfer sample in APP SDK 2.9, which demonstrates this.
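
On the OpenACC side, the usual way to express this kind of overlap is to split the data into chunks and alternate between two async queues, so that the transfer of one chunk can proceed while another chunk is being computed. The following is only a minimal sketch under assumptions: the function name `scale_chunked`, the scaling kernel, and the chunking scheme are hypothetical, and whether the CAPS compiler maps different async ids to separate OpenCL command queues (which real overlap would require) is not confirmed in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example kernel: scales every element of data in place.
 * The array is processed in equally sized chunks, each on one of two
 * async queues, so transfers and compute of different chunks may overlap
 * if the runtime backs the queues with independent command queues. */
void scale_chunked(float *data, size_t n, size_t chunks, float factor)
{
    assert(chunks > 0 && n % chunks == 0);
    size_t len = n / chunks;

    /* Allocate device memory once; no implicit copies at region entry/exit. */
    #pragma acc data create(data[0:n])
    {
        for (size_t c = 0; c < chunks; ++c) {
            size_t off = c * len;
            int q = (int)(c % 2) + 1;   /* alternate between queues 1 and 2 */

            /* Host-to-device copy of this chunk, asynchronous on queue q. */
            #pragma acc update device(data[off:len]) async(q)

            /* Compute on the same queue, so it starts after the copy above. */
            #pragma acc parallel loop present(data[off:len]) async(q)
            for (size_t i = off; i < off + len; ++i)
                data[i] *= factor;

            /* Device-to-host copy of the result, again on queue q. */
            #pragma acc update host(data[off:len]) async(q)
        }
        /* Block until all queues have drained. */
        #pragma acc wait
    }
}
```

Calling `scale_chunked(buf, n, 4, 2.0f)` would process `buf` in four chunks. Operations on the same queue stay ordered, while the two queues are free to run concurrently. If the profiler still shows everything serialized on a single in-order OpenCL queue, the limitation is in how the runtime implements `async`, not in the directives. Without an OpenACC compiler the pragmas are ignored and the code runs sequentially on the host.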

Regards

Pradeep
