

anu10111988
Journeyman III

GPU Direct equivalent in AMD OpenCL

Hi,

NVIDIA's CUDA-based GPUDirect lets the GPU and CPU, or two GPUs, communicate with lower latency. Is there anything equivalent in AMD OpenCL? As I understand it, zero copy (CL_MEM_ALLOC_HOST_PTR in clCreateBuffer) works between the CPU and a GPU. Can anyone tell me how to communicate between two GPUs without host interference in OpenCL?

Thanks

0 Likes
7 Replies
LeeHowes
Staff

You don't need host interference. When you move data into OpenCL you move it into the context, not to a device. Movement between devices is something the runtime handles automatically (you can use clEnqueueMigrateMemObjects, but that is a hint about when you want the data moved, not a requirement to move it). In theory, then, the runtime can move data directly between devices. It may or may not do so, depending on support in the driver stack.

0 Likes

The problem with this approach is that the runtime handles the movement automatically, and sometimes you need to move data to a specific device in a controlled manner for efficiency.

Consider any system that has strict latency requirements: in that case you need to know that performance will not change.

Hopefully OpenCL and/or AMD will respond to this amazing new feature with something equal or better.

0 Likes

OpenCL provides the ability to migrate memory objects between devices on demand (it's a request but there's no reason for a runtime to ignore it). What more would you want to see over that? Some sort of control over what latency the runtime is allowed to apply to movement?

0 Likes

Hi Phideas,

Can you describe your system a bit?

Are you streaming data from the network or something? Are you looking for shared memory between your network drivers and the GPU?

Actually, thinking about it, HSA could be the technology equivalent of GPUDirect, and probably much more than that.

When all devices are united by coherent memory spaces (that's what HSA is trying to achieve), it will be very easy to make the network device, GPU and CPU all work on the same buffer without needing to copy data around.

0 Likes
himanshu_gautam
Grandmaster

I think GPUDirect is a very InfiniBand-specific feature (involving some driver hacks in the QLogic and Mellanox drivers), especially in a "cluster environment".

There is no equivalent of this in OpenCL, as far as I know.

You just can't compare zero-copy with this.

0 Likes

I was thinking of a single machine. Yes, machine-to-machine transfers are a different problem entirely. The InfiniBand device could probably be exposed within the OpenCL runtime as an entity that can accept data but not run kernels, so that the runtime could still hide the transfer away, but that is not the case at this time.

0 Likes

Hi,

AFAIK, inter-machine communication is not possible using OpenCL as of now. Interestingly, though, many universities have been working on providing such a framework using MPI + OpenCL. Check out http://aces.snu.ac.kr/Center_for_Manycore_Programming/SnuCL.html for one such project.

0 Likes