

cuorematto
Journeyman III

Multi GPU Cluster stream computing


Hi all,
I have a question:
Is it better to have many GPUs on a single motherboard for parallel calculation, or to have distributed nodes connected over Gigabit LAN, each motherboard with a pciX GPU, using a message passing interface implementation such as MPICH2 or LAM/MPI?
I have seen a PDF document on the AMD Stream Computing web page
about the PGI compiler, which supports LAM/MPI, MPICH2, and the CAL/Brook+
GPU programming libraries. I need more information about this and about how to program in the Brook+ language with the CAL SDK.
Where can I find documentation on the Brook+ data types,
for example int and char, and on the supported function calls?
*beer*
1 Reply

Unfortunately, it isn't that easy to say which one is going to be better. It is somewhat application and dataflow dependent. Do you have a particular application/dataflow you are thinking about?

For example, communication between GPUs on separate nodes of a GigE-linked cluster will be slower than between GPUs in a single multi-GPU system. On the other hand, each cluster node has a CPU dedicated to processing data and feeding its GPU, whereas in a multi-GPU setup a single CPU has to coordinate and feed data to 2 GPUs (once again, how much effect this has depends on the application). Also, with a cluster you end up with multiple disk controllers and disks servicing your application. Of course, this only matters if your application needs to read from or write to disk often.
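To put rough numbers on the communication side of that tradeoff, here is a minimal back-of-the-envelope sketch. It uses the usual latency-plus-size-over-bandwidth estimate. The Gigabit Ethernet figure is the theoretical peak (1 Gbit/s = 125 MB/s); the local bus rate, the latencies, and the payload size are assumptions chosen only for illustration, so substitute values measured on your own hardware.

```python
def transfer_seconds(nbytes, bandwidth_bytes_per_s, latency_s=0.0):
    """Estimate the time to move nbytes over a link: latency + size / bandwidth."""
    if bandwidth_bytes_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    return latency_s + nbytes / bandwidth_bytes_per_s


# Theoretical peak of Gigabit Ethernet: 1 Gbit/s = 125 MB/s.
GIGE_BYTES_PER_S = 1e9 / 8
# Hypothetical host-to-GPU bus rate (an assumption, not a measured value).
LOCAL_BUS_BYTES_PER_S = 4e9

payload = 256 * 1024 * 1024  # 256 MiB exchanged per step (assumed)

# Latencies below are placeholders for illustration only.
gige = transfer_seconds(payload, GIGE_BYTES_PER_S, latency_s=100e-6)
local = transfer_seconds(payload, LOCAL_BUS_BYTES_PER_S, latency_s=10e-6)

print(f"GigE:      {gige:.3f} s per exchange")
print(f"Local bus: {local:.3f} s per exchange")
print(f"Ratio:     {gige / local:.1f}x")
```

If the exchange time over the network is small compared with the GPU compute time per step, the slower link may not matter much; if it is comparable or larger, the multi-GPU box is likely to win on communication.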

If what you need to do on the CPU isn't very much (e.g. you aren't reordering data or doing other preprocessing on the CPU), then a multi-GPU setup will probably let you finish your computation faster.

If your CPU is effectively maxed out feeding your GPUs and preprocessing data, for example, then a cluster might be better.
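The CPU-bound versus GPU-bound reasoning can be sketched as a simple bottleneck model: the pipeline runs at the rate of whichever stage is slower in aggregate. This is only a rough sketch under stated assumptions. The function name and all rates are hypothetical, and the model deliberately ignores network and disk costs, which would count against the cluster.

```python
def throughput(num_cpus, num_gpus, cpu_rate, gpu_rate):
    """Items/s of a two-stage pipeline: CPUs prepare data, GPUs compute.

    The slower aggregate stage is the bottleneck. Rates are per device.
    Network and disk costs are ignored in this simplified model.
    """
    return min(num_cpus * cpu_rate, num_gpus * gpu_rate)


GPU_RATE = 80.0  # items/s one GPU can process (assumed)

# Case 1: light CPU work, so one CPU easily feeds both GPUs.
light_cpu = 1000.0  # items/s one CPU can prepare (assumed)
print("light CPU work:")
print("  1 box, 2 GPUs  :", throughput(1, 2, light_cpu, GPU_RATE))
print("  2 nodes, 1 GPU :", throughput(2, 2, light_cpu, GPU_RATE))

# Case 2: heavy preprocessing, so the single CPU becomes the bottleneck.
heavy_cpu = 100.0  # items/s one CPU can prepare (assumed)
print("heavy CPU work:")
print("  1 box, 2 GPUs  :", throughput(1, 2, heavy_cpu, GPU_RATE))
print("  2 nodes, 1 GPU :", throughput(2, 2, heavy_cpu, GPU_RATE))
```

With light CPU work both layouts are limited by the GPUs, so the multi-GPU box gives up nothing and avoids the network. With heavy preprocessing the single CPU caps the multi-GPU box, and the second CPU in the cluster helps, provided the communication cost left out of this model does not cancel the gain.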

Once again, it depends on the application. If you can let me know what your general dataflow and computation are, I can give a more concrete recommendation.

Have you downloaded the SDK from the website? If so, there are doc directories underneath Brook+ and CAL. We are working on improving the documentation but take a look and let me know if you have any questions.

Michael.