Hi all,

It seems to me that the current version of the AMD OpenCL runtime (AMD Radeon HD 6900 Series, driver 11.12, AMD-APP-SDK-v2.6 on Linux 64) does not support bidirectional PCIe transfers on Linux. The little test below gives me the results that follow it. I've created an out-of-order queue and I enqueue both a read and a write, from/to two different buffers. I would expect the second transfer to run in parallel with the first one if bidirectional PCIe transfer worked.

Any plan for supporting this feature in the near future? Or can you tell me if there is a different way to achieve this? I've tried with different queues, different devices and so on.

/////////////////////////////////////////////////////////////////////////////////////////
cl_command_queue ooo_queue = clCreateCommandQueue(context, device, CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE, &error);
size_t cb = (size_t)32*1024*1024*1024;  /* cast so the product is not evaluated in int */

clFinish(ooo_queue);
double start = get_time();

/* Both transfers are non-blocking, so the runtime is free to overlap them. */
clEnqueueReadBuffer(ooo_queue, buffer1, CL_FALSE, 0, cb, data1, 0, NULL, NULL);
if (write)
    clEnqueueWriteBuffer(ooo_queue, buffer2, CL_FALSE, 0, cb, data2, 0, NULL, NULL);

clFinish(ooo_queue);
double stop = get_time();

/* The rate counts cb bytes only, even when the write is also enqueued. */
fprintf(stdout, "%fGB/s\n", (cb/(stop-start))/1024/1024/1024);
/////////////////////////////////////////////////////////////////////////////////////////

Just the read buffer (write == false): 7.687507GB/s
Read + write buffer (write == true): 3.665632GB/s
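A quick way to read those two numbers: the test divides cb by the elapsed time in both cases, but with write == true it actually moves 2*cb bytes. The sketch below redoes that arithmetic. The helper names (gib_per_s, aggregate_rate, overlap_factor) are made up for illustration and are not part of OpenCL or the AMD SDK, and it assumes both transfers are the same size, as in the test above.

```c
#include <assert.h>

/* Throughput in GiB/s for `bytes` moved in `seconds`
   (same formula as the fprintf in the test). */
static double gib_per_s(double bytes, double seconds)
{
    assert(seconds > 0.0);
    return bytes / seconds / (1024.0 * 1024.0 * 1024.0);
}

/* The test reports cb/elapsed even when n_transfers buffers of cb bytes
   were moved, so the real aggregate rate is n_transfers times larger. */
static double aggregate_rate(double reported_rate, int n_transfers)
{
    return reported_rate * n_transfers;
}

/* Aggregate read+write rate relative to the read-only rate.
   Near 1.0: the two transfers shared the link one after the other.
   Near 2.0: the two directions ran fully in parallel. */
static double overlap_factor(double single_rate, double reported_pair_rate)
{
    return aggregate_rate(reported_pair_rate, 2) / single_rate;
}
```

With the figures from the post, overlap_factor(7.687507, 3.665632) comes out just under 1, which is what you would expect if the read and the write were serialized rather than overlapped.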
You do realise that 'out-of-order queues' are not actually out of order on current AMD GPUs? This has been well discussed in the various OpenCL forums.
Try using multiple queues instead (although I have no idea if it'll make any difference for this test). AFAICT from the docs, NVIDIA requires multiple queues for concurrent operations too (my reading of section 3.2.2 of their OpenCL guide 4.1).