1 Reply Latest reply on May 2, 2011 7:16 AM by nou

    Splitting huge dataset across multiple GPUs and communicating

    atlemann

      Hi!

      I have a simulation with a huge 3D dataset which I cannot fit into a single GPU. I have a machine with 4 GPUs which I want to work together. I split the dataset into 4 sub-cubes and want only a single sub-cube to be allocated on each device. For each simulation step I have to communicate ghost layers between the devices. What is the best way to do this?

      - Creating 1 context with 4 GPUs? Will all GPUs get all sub-cubes allocated, since buffer creation is per context and not per device?
      - Creating 4 contexts with 1 GPU each? Is it possible to synchronize between contexts?

      What commands should I use to transfer data directly from one GPU to another?

      Sincerely,
      Atle

        • Splitting huge dataset across multiple GPUs and communicating
          nou

          The AMD OpenCL implementation allocates a buffer on a device on first use, so you can create 8 GB of buffers without a problem. If you don't pass CL_MEM_COPY_HOST_PTR, they are not allocated at all at creation time; if you do, the data is copied to host memory. A buffer is allocated on a device only when you enqueue a kernel that uses it.

          And it stays there until you release it. If you enqueue kernels that use the same buffer on multiple devices, then the buffer is allocated on all of those devices.

          More in the AMD OpenCL Programming Guide, section 4.5.

          I am not sure, but clEnqueueWriteBuffer/clEnqueueReadBuffer also seem to allocate the buffer on the device.