Latest reply on Apr 3, 2011 10:37 PM by tweenk

    Simultaneous data transfer and kernel execution


      I have a piece of OpenCL code where the data transfer (clEnqueueReadBuffer / clEnqueueWriteBuffer) takes about the same time as the computation. I would like to allocate two input and two output buffers and run the kernel on one pair of buffers while I read from and write to the other pair. Is this possible in AMD's OpenCL implementation?

      I tried using an out-of-order queue, but it was no faster than the synchronous version.
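      To make the expected gain concrete, here is a rough timing model of the double-buffering scheme described above. It is only a sketch in plain C, not OpenCL code and not a measurement: the function names and time units are made up for illustration, and it assumes the device has one transfer engine (shared by uploads and downloads) that can run concurrently with one compute engine. Whether a given OpenCL implementation actually overlaps the two is exactly the open question.

```c
#include <assert.h>

#define MAX_CHUNKS 64  /* arbitrary limit for this sketch */

static double max2(double a, double b) { return a > b ? a : b; }

/* Baseline: write, kernel, read, strictly one after another per chunk. */
double serial_makespan(int n, double tw, double tk, double tr) {
    return n * (tw + tk + tr);
}

/*
 * Double-buffered model. Chunk i uses buffer pair i % 2.
 * Assumed (hypothetical) hardware: one transfer engine and one compute
 * engine that can work at the same time.
 * Transfer order: W0, then per chunk i: W(i+1), R(i).
 * Dependencies:
 *   kernel i  waits for write i and for read i-2 (output buffer reuse)
 *   write i+1 waits for kernel i-1 (input buffer reuse)
 *   read i    waits for kernel i
 */
double overlapped_makespan(int n, double tw, double tk, double tr) {
    double w_end[MAX_CHUNKS], k_end[MAX_CHUNKS], r_end[MAX_CHUNKS];
    double transfer_free, compute_free = 0.0, start;
    int i;

    assert(n > 0 && n <= MAX_CHUNKS);

    w_end[0] = tw;          /* first upload has nothing to overlap with */
    transfer_free = tw;

    for (i = 0; i < n; ++i) {
        /* Kernel for chunk i. */
        start = max2(compute_free, w_end[i]);
        if (i >= 2) start = max2(start, r_end[i - 2]);
        k_end[i] = start + tk;
        compute_free = k_end[i];

        /* Upload chunk i+1 into the other buffer pair while kernel i runs. */
        if (i + 1 < n) {
            start = transfer_free;
            if (i >= 1) start = max2(start, k_end[i - 1]);
            w_end[i + 1] = start + tw;
            transfer_free = w_end[i + 1];
        }

        /* Download the result of chunk i once its kernel has finished. */
        start = max2(transfer_free, k_end[i]);
        r_end[i] = start + tr;
        transfer_free = r_end[i];
    }
    return r_end[n - 1];
}
```

      With 4 chunks, write = 1, kernel = 2 and read = 1 time units (so total transfer time equals compute time, as in my case), the serialized version takes 16 units and the overlapped model takes 10. For many chunks the ratio approaches 2x under these assumptions, which is the speedup I was hoping to see and did not get.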