What's the maximum number of ATI GPUs officially supported by Windows drivers? #2

Reposting because the first topic got corrupted.

I can no longer see the whole original topic I created:

http://forums.amd.com/devforum/messageview.cfm?catid=328&threadid=141888&enterthread=y

Some XML tags appear in the middle of the page, and Opera and IE9 refuse to render it properly, so I'm re-quoting everything here:

Originally posted by: empty_knapsack
A year ago it wasn't possible to use more than 4. I've seen some reports of working 3x5970 configs since then, but under Linux.
What about Windows? Has the 4x limit finally been increased?
One of our users is trying to set up a 3x5970 system, and even with dummy plugs/monitors attached to each of the 5970 DVI ports it fails to operate properly: the system simply locks up after several seconds of work. There is no problem with the PSU, so that isn't the cause.


Originally posted by: himanshu.gautam
I can confirm working with 8 GPUs in a system on both Linux and Windows platforms.
-------------------------
Himanshu


Originally posted by: empty_knapsack
Can you please provide more details about this system?
Which motherboard was used, and which GPUs? Which version of Catalyst was installed, and were any dummy plugs used to make all GPUs recognized by Windows? Which software did you use to make sure that all 8 GPUs in fact work and produce correct results?


Originally posted by: himanshu.gautam
empty_knapsack,
The Linux system had the following configuration:
OS: Red Hat 5.3 64-bit
ATI driver: Catalyst 10.4
CPU: Two Quad-Core AMD Opteron
Chipset: NVIDIA nForce Professional 3600 and 3050
RAM: 32 GB DDR2
-------------------------
Himanshu


Originally posted by: afo
Hi,
It's nice to hear that, but this raises a question: what performance do you have compared with 8 GPUs in 8 different computers? And as others asked: what boards were used?
best regards,
Alfonso


Originally posted by: himanshu.gautam
empty_knapsack,
We plugged 8 monitors into these GPUs. CAL's FindNumDevices as well as OpenCL showed 8 GPUs. We also ran an OpenCL reduction program that used all 8 GPUs and got the correct result.
-------------------------
Himanshu



Now, like Alfonso, I'm very interested to know which GPUs were used, and whether they were all working simultaneously for at least several seconds (unlike most CAL/OpenCL samples, which run for just milliseconds) and all producing correct results.

Also, my original post was about Windows (though Linux support is also interesting), and I'm quite curious which drivers you used for it, as Catalyst 10.4 was completely broken for Windows and the 5970: the 2nd core was detected and could even be used for calculations, but it ran much slower and produced garbage instead of real results.

Also, are you absolutely sure that the OpenCL test you used was correct? I ask because earlier you suggested using SimpleMultiDevice /e as a test:

http://forums.amd.com/devforum/messageview.cfm?catid=328&threadid=139608&enterthread=y

but it later turned out that this test reports OK for any result because of a programming error inside it.
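
To illustrate the kind of check I mean, here is a minimal sketch in plain Python. It does not touch CAL or OpenCL at all: the "devices" are hypothetical stand-in functions, and the names `split_evenly`, `multi_device_sum`, `check`, `good_device` and `garbage_device` are made up for this example. The only point is that a multi-GPU test should compare the combined result against a reference computed independently on the host, and fail when even one device returns wrong data.

```python
# Hypothetical sketch: the "devices" are plain Python callables, not real GPUs.

def split_evenly(data, parts):
    """Split data into `parts` contiguous chunks whose sizes differ by at most 1."""
    base, extra = divmod(len(data), parts)
    chunks, start = [], 0
    for i in range(parts):
        size = base + (1 if i < extra else 0)
        chunks.append(data[start:start + size])
        start += size
    return chunks

def multi_device_sum(data, devices):
    """Give each device one chunk and combine the partial sums."""
    chunks = split_evenly(data, len(devices))
    return sum(device(chunk) for device, chunk in zip(devices, chunks))

def check(data, devices):
    """Return True only if the combined result equals a host-side reference."""
    return multi_device_sum(data, devices) == sum(data)

def good_device(chunk):
    # Stands in for a GPU that computes its partial sum correctly.
    return sum(chunk)

def garbage_device(chunk):
    # Stands in for a GPU that runs but returns wrong data.
    return sum(chunk) + 1

data = list(range(1000))
all_good = check(data, [good_device] * 8)
one_bad = check(data, [good_device] * 7 + [garbage_device])
print(all_good, one_bad)
```

A test that reports OK in the second case as well tells you nothing about whether all 8 GPUs really work.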

 

0 Likes
17 Replies
afo
Adept I

Hi,

These are just my tests and guesses, so take them with care...

I compared the performance of SDK 2.1 and 2.2 using multi-GPU (2x HD 5970), and I see that the differences in performance are related to the way the OpenCL runtime manages the threads for different GPUs (kernel time is the same across SDKs). For example, the CPU load for SDK 2.1 is lower than for SDK 2.2, yet the performance of SDK 2.1 is better than that of SDK 2.2 for the same multi-GPU application (both giving correct results on the 4 GPUs).

Besides that, page 2-4 of the OpenCL programming guide states that the separate queues for each GPU merge into one queue for all GPUs, so the scheduling of that queue would have a very big impact on multi-GPU performance. For example, does that queue reorder items to group memory transfers or kernel calls for a specific GPU? That affects multi-GPU performance, so knowing or controlling the scheduling of the GPU queue would allow us to improve the performance of multi-GPU applications.

best regards,

Alfonso

0 Likes
mrbpix
Journeyman III

I hope to settle the question once and for all: at least 8 GPUs are supported with the 10.11 drivers and the 2.2 SDK on 64-bit Linux when developing on top of CAL, not OpenCL. In fact I have written and released whitepixel, a new password hash auditing tool for Linux, and have successfully run it on 4 x HD 5970. I provide ample details on my blog: http://blog.zorinaq.com/?e=42 . Performance definitely scales with the number of GPUs. A year or so ago, it seemed that even though 6 or 8 GPUs could be detected, they were unintentionally downclocked by the drivers as more were added to the system (running 6 or 8 of them gave exactly the same GFLOPS/MIPS performance as 4 GPUs).
0 Likes

Hi,

It's very nice to hear that. Just to verify that I understand you: when you work with CAL only (writing IL code by hand), you see that scaling works well (4x HD 5970 runs 8x faster than one GPU of an HD 5970). Is that correct?

thanks a lot for sharing this information

best regards,

Alfonso

0 Likes
mrbpix
Journeyman III

Correct. I wrote IL code by hand. I compile it at run time with calclCompile. And the perf of 4 x HD 5970 is exactly 8 times the perf of 1 of the 2 GPUs of an HD 5970.
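
As a rough sketch of the arithmetic behind that claim (plain Python, with throughput normalized so that one GPU equals 1.0; the function name `scaling_efficiency` and the normalized numbers are only illustrative, not measurements):

```python
def scaling_efficiency(throughput_n, throughput_1, n_gpus):
    """Speedup over a single GPU divided by the number of GPUs (1.0 = perfect scaling)."""
    speedup = throughput_n / throughput_1
    return speedup / n_gpus

# 8 GPUs delivering 8x the throughput of one GPU: perfect scaling.
ideal = scaling_efficiency(8.0, 1.0, 8)

# 8 GPUs performing like 4, as with the older downclocking behavior: half wasted.
throttled = scaling_efficiency(4.0, 1.0, 8)

print(ideal, throttled)
```

With the older drivers, adding GPUs past 4 would have shown up here as efficiency dropping toward 0.5 rather than staying at 1.0.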

0 Likes

Interesting information, Marc. I guess you're processing 5 MD5s per thread, so stream core utilization gets closer to 100% (95.7% probably). I experimented with 5x back in January, and at that time I only noticed slowdowns compared to 4x. Looks like the CAL compiler is really improving.

Anyway, I guess there's no chance that you'll be able to install Windows 7 on your system, so the original question stays open. I was really hoping to get an answer from ATI officials, but it looks like the ones who write the Windows drivers never visit these forums, and the one who does visit has no idea about Windows drivers.

0 Likes

Yeah the CAL compiler is very good at utilizing the 5 ALUs. In whitepixel ALU utilization is actually 99.1% if my math is right.


Actually I did boot this machine into Windows 7 64-bit. (I have a pretty neat network boot iSCSI setup which I describe in the previous blog entry: http://blog.zorinaq.com/?e=41 ) I did try running ighashgpu. I expected it to detect some number of GPUs, but it detected only 1. Supposedly Crossfire needs to be disabled, but I cannot find any Crossfire option in the Catalyst control panel or whatever it is called. Can you send me a screenshot of where this option is supposed to be? I googled around but quickly gave up as this was too frustrating, and went back to Linux to continue working on whitepixel.

0 Likes

AFAIK, it's impossible to disable Crossfire under Windows. However, that isn't a problem: it's possible to use a 5970 with Crossfire active and Catalyst 10.7-10.11 under Windows using CAL/IL only. At least with one 5970 in the system it works.

Once you have several GPUs, more tricks are required to get them recognized by the CAL layer/Windows drivers. First, you need to attach a dummy plug/monitor to each GPU. Then you need to extend the desktop to all GPUs you want to use (Control Panel\Appearance and Personalization\Display\Screen Resolution). After that you'll have a very large virtual desktop and, theoretically, all GPUs should be recognized by CAL. In reality, however, it is usually only possible to use 4 of them (even if more were recognized): anything above that doesn't work reliably, and the system usually locks up after several seconds of work. Extremely annoying.

I guess if this behavior doesn't change with the 6990 release, then there's zero chance it'll be fixed within the next year.

0 Likes

I don't think there is any theoretical limit on the number of GPUs supported. But after a certain point, resource scarcity will tend to limit the number of GPUs supported.

 

0 Likes

OK, I'll forward this to our customers:

 "ATI itself has no idea what the maximum number of ATI GPUs supported within a single system by the Windows drivers is".

0 Likes

ahu
Journeyman III

mrbpix, the results are really encouraging.

With the latest CCC on Windows, the Crossfire option is really easy to find: Performance -> AMD CrossFireX configuration. If you don't find it, it's probably disabled because it isn't supported for your configuration. Additionally, on Windows you should connect each of the 8 GPUs to a monitor or VGA dummy (at least that's a requirement on some of the driver and/or OS versions). Also, on earlier drivers you had to disable Catalyst AI, but I can't find that option anymore in the new CCC.

0 Likes


Wow, great that you could fit in 4 x HD 5970. What mainboard did you use for that?

Not to mention the person who got 5 of them in.

I have a 4-socket box here with a Tyan S4985 mainboard (there's a photo on my Facebook of what the 1000 euro system looks like), but it seems the 5970 is too long to fit in it, as it would touch an indispensable power plug that the mainboard needs from the PSU (a 5-pin Molex, actually, that sticks out).

So I probably need to order a new mainboard. Which one do you advise?

Regards,

Vincent Diepeveen

P.S. In HPC, scaling above 50% was considered good when I parallelized for an SGI supercomputer, so whether you scale 95% or 98% is all very well.

Efficient code that does something useful is much tougher.


0 Likes


mrphantuan,

Do you have a question to ask? Why are you just quoting other people's comments from the top of the thread?

diepchess,

I don't think I can help you in this matter. I hope people who have experimented with 4x 5970 can answer that better.

 

0 Likes

Alice_Sunny
Journeyman III

This is a problem worth studying thoroughly.
0 Likes