I already use DirectGMA to transfer frames from a Matrox frame grabber directly into a W8100 GPU.
It is done as follows:
1. clCreateBuffer with CL_MEM_BUS_ADDRESSABLE_AMD, which returns the cl_mem buffer
2. clEnqueueMakeBuffersResidentAMD on that buffer, which fills in its bus addresses:
typedef struct { cl_ulong surface_bus_address; cl_ulong marker_bus_address; } cl_bus_address_amd;
The frame grabber is then given surface_bus_address and writes its output there directly (somehow...).
Now I want to use the same technique to copy from one W8100 GPU to another. I am considering the following options:
Option 1:
* Create a cl buffer and make it resident (steps 1 and 2 above) on BOTH the source and the target GPU.
* clEnqueueCopyBuffer from the source buffer to the target buffer.
(Would that copy use a command queue created on the target device or on the source device?)
Option 2:
* Create a cl buffer and make it resident (steps 1 and 2 above) ONLY on the target GPU.
* clEnqueueCopyBuffer from a non-resident source cl buffer to the target buffer.
Option 3:
* Same as option 1, but use memcpy(surface_bus_address_TARGET, surface_bus_address_SRC, size)
(instead of clEnqueueCopyBuffer).
Note:
The copy runs synchronously on a dedicated thread, so I do not need the "marker" or any other synchronization mechanism.
Your help would be appreciated.
Thanks.