0 Replies Latest reply on Jul 21, 2014 9:13 AM by whlucas

    Overhead on data transfer between 2 GPUs or several OpenCL devices


      Hello everyone,

      I'm testing some stencil code on heterogeneous architectures by using 2 GPUs.

      To update the data held on the different GPUs, I tried using clEnqueueWriteBufferRect and clEnqueueReadBufferRect to transfer 1000 bytes of data from table A on GPU_1 to table A' on GPU_2.

      I then observed the following: the overhead of the data transfer increases linearly with the size of table A, even though we always transfer only 1000 bytes from table A. Has anyone else noticed this? Is there a solution?
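
      To make the setup concrete, here is a minimal sketch of how the rect parameters address a region, as far as I understand them. The offset formula is the one given in the OpenCL specification for clEnqueueReadBufferRect and clEnqueueWriteBufferRect. The helper names (rect_offset, rect_bytes, rect_span) are hypothetical, and the strided halo column in the example below is only an assumed access pattern, not necessarily the one in my code. I don't know whether the span is what drives the overhead; the sketch only shows that the payload stays fixed while the range of addresses covered grows with the row pitch.

```c
#include <stddef.h>

/* Byte offset of the rectangle's first element inside the buffer.
   origin[0] is in bytes; origin[1] and origin[2] count rows and slices.
   Formula: origin[2] * slice_pitch + origin[1] * row_pitch + origin[0]. */
size_t rect_offset(const size_t origin[3], size_t row_pitch, size_t slice_pitch)
{
    return origin[2] * slice_pitch + origin[1] * row_pitch + origin[0];
}

/* Payload actually transferred.
   region[0] is the width in bytes, region[1] the number of rows,
   region[2] the number of slices. */
size_t rect_bytes(const size_t region[3])
{
    return region[0] * region[1] * region[2];
}

/* Distance in bytes from the first to one past the last byte of the
   rectangle inside the buffer. It grows with row_pitch (the table width)
   even when the payload stays constant. */
size_t rect_span(const size_t region[3], size_t row_pitch, size_t slice_pitch)
{
    return (region[2] - 1) * slice_pitch
         + (region[1] - 1) * row_pitch
         + region[0];
}
```

      For example, for a column of 250 floats (region = {4, 250, 1}) in a table with 1024 floats per row (row_pitch = 4096), rect_bytes gives 1000 while rect_span gives 1019908. Doubling the row width to 2048 floats leaves the payload at 1000 bytes but raises the span to 2039812.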