I can no longer see the whole original topic I created.
Some XML tags have appeared in the middle of it, and Opera and IE9 refuse to show the page properly, so I'm re-quoting everything here:
Originally posted by: empty_knapsack
A year ago it wasn't possible to use more than 4 GPUs. I've seen some reports of working 3x5970 configs since then, but under Linux.
What about Windows? Has the 4x limit finally been increased?
One of our users is trying to set up a 3x5970 system, and even with dummy plugs/monitors attached to each of the 5970 DVI ports it fails to operate properly -- the system simply locks up after several seconds of work. There is no problem with the PSU, so that isn't the cause.
Originally posted by: himanshu.gautam
I can confirm working with 8 GPUs in a system on both Linux and Windows platforms.
Originally posted by: empty_knapsack
Can you please provide more details about this system?
Which motherboards were used, and which GPUs? Which version of Catalyst was installed, and were any dummy plugs used to make all GPUs recognized by Windows? Which software did you use to make sure that all 8 GPUs in fact work and produce correct results?
Originally posted by: himanshu.gautam
empty_knapsack,
The Linux system had the following configuration:
OS: Red Hat 5.3 64-bit
ATI driver: Catalyst 10.4
CPU: Two Quad-Core AMD Opteron
Chipset: NVIDIA nForce Professional 3600 and 3050
RAM: 32 GB DDR2
Originally posted by: afo
Hi,
It's nice to hear that, but this raises a question: what performance do you have compared with 8 GPUs in 8 different computers? And as others asked: what boards were used?
Originally posted by: himanshu.gautam
empty_knapsack,
We plugged 8 monitors into these GPUs. CAL's FindNumDevices as well as OpenCL showed 8 GPUs. We also ran an OpenCL reduction program that used all 8 GPUs and got the correct result.
Now, like Alfonso, I'm very interested to know which GPUs were used, and whether they were working simultaneously for at least several seconds (unlike most CAL/OpenCL samples, which run for just milliseconds) and all producing correct results.
Also, my original post was about Windows (though Linux support is also interesting), and I'm quite curious which drivers you used for it, as Catalyst 10.4 was completely broken for Windows and the 5970: the 2nd core was detected and could even be used for calculations, but it ran much slower and produced just garbage instead of real results.
Also, are you absolutely sure that the OpenCL test you used was correct? I'm asking because earlier you suggested using SimpleMultiDevice /e as a test:
but it later turned out that this test reports OK for any results because of a programming error inside.
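To make the concern concrete, here is a minimal sketch of the kind of check I mean: each device's partial result is compared against an independently computed CPU reference, so a GPU that returns garbage cannot pass. It is plain Python with the devices simulated on the CPU; `run_on_device`, `split` and `verify` are hypothetical names invented for this illustration and are not part of CAL, OpenCL or the SDK samples.

```python
def cpu_reference(chunk):
    """Trusted result computed on the host."""
    return sum(chunk)

def split(data, n):
    """Split data into n nearly equal contiguous chunks, one per device."""
    k, r = divmod(len(data), n)
    chunks, start = [], 0
    for i in range(n):
        end = start + k + (1 if i < r else 0)
        chunks.append(data[start:end])
        start = end
    return chunks

def run_on_device(device_id, chunk, broken=()):
    """Hypothetical stand-in for a real GPU launch.

    A device listed in `broken` returns garbage, imitating the faulty
    second core described above.
    """
    if device_id in broken:
        return 0xDEADBEEF
    return sum(chunk)

def verify(data, num_devices, broken=()):
    """Return the ids of devices whose result differs from the CPU reference."""
    bad = []
    for dev, chunk in enumerate(split(data, num_devices)):
        if run_on_device(dev, chunk, broken) != cpu_reference(chunk):
            bad.append(dev)
    return bad
```

With `data = list(range(1000))`, `verify(data, 8)` returns an empty list, while `verify(data, 8, broken={3})` reports device 3. A real test would also need to keep all devices busy for several seconds rather than a few milliseconds.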
These are just my tests and guesses, so take them with care...
I compared the performance of SDK 2.1 and 2.2 using multi-GPU (2xHD5970), and I see that the differences in performance are related to the way the OpenCL runtime manages the threads for different GPUs (kernel time is the same for both SDKs). For example, the CPU load for SDK 2.1 is lower than for SDK 2.2, yet the performance of SDK 2.1 is better than that of SDK 2.2 for the same multi-GPU application (both giving correct results for the 4 GPUs).
Besides that, page 2-4 of the OpenCL programming guide states that the different queues for each GPU merge into one queue for all GPUs, so the scheduling of that queue would have a very big impact on multi-GPU performance. For example, does that queue reorder the items to group memory transfers or kernel calls for a specific GPU? That has an impact on multi-GPU performance, so knowing/controlling the scheduling of the GPU queue would allow us to improve the performance of multi-GPU applications.
It's very nice to hear that. Just to verify that I understand you: when you work with CAL only (generating IL code by hand), you see that scaling works well (4xHD5970 works 8x faster than one GPU of an HD5970). Is that correct?
Thanks a lot for sharing this information.
Correct. I wrote IL code by hand. I compile it at run time with calclCompile. And the perf of 4 x HD 5970 is exactly 8 times the perf of 1 of the 2 GPUs of an HD 5970.
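For anyone who wants to put a number on "scales well", a small sketch of the usual efficiency calculation follows. The function name and the example rates are made up for illustration; only the 8x figure comes from the post above.

```python
def scaling_efficiency(single_gpu_rate, multi_gpu_rate, num_gpus):
    """Ratio of measured multi-GPU throughput to ideal linear scaling.

    1.0 means perfect scaling, e.g. 8 GPUs running exactly 8x faster
    than one GPU. Rates can be in any unit (hashes/s, kernels/s, ...)
    as long as both use the same one.
    """
    if single_gpu_rate <= 0 or num_gpus <= 0:
        raise ValueError("rate and GPU count must be positive")
    return multi_gpu_rate / (single_gpu_rate * num_gpus)
```

For example, `scaling_efficiency(1.0, 8.0, 8)` gives 1.0, which is the case described here, whereas a hypothetical 8-GPU run reaching only 6x the single-GPU rate would give 0.75.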
Interesting information, Marc. I guess you're processing 5 MD5s per thread, so stream core utilization gets closer to 100% (95.7% probably). I experimented with 5x back in January, and at that time I only noticed slowdowns compared to 4x. Looks like the CAL compiler is really improving.
Anyway, I guess there's no chance that you'll be able to install Windows 7 on your system, so the original question stays open. I was really hoping to get some answer from ATI officials, but it looks like the ones who write the Windows drivers never visit these forums, and the one who does visit has no idea about Windows drivers.
Yeah the CAL compiler is very good at utilizing the 5 ALUs. In whitepixel ALU utilization is actually 99.1% if my math is right.
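For reference, utilization figures like these are normally computed as filled ALU slots divided by available slots, with 5 slots per VLIW bundle on this hardware. The sketch below shows only that arithmetic; the function name and the sample counts are assumptions for illustration and are not taken from whitepixel or from any compiler output.

```python
VLIW_WIDTH = 5  # ALU slots per VLIW bundle on HD 5000-series stream cores

def alu_utilization(alu_ops, bundles, width=VLIW_WIDTH):
    """Fraction of ALU slots that hold a real operation.

    alu_ops: total ALU operations in the compiled kernel
    bundles: number of VLIW bundles those operations were packed into
    """
    if bundles <= 0:
        raise ValueError("bundle count must be positive")
    return alu_ops / (bundles * width)
```

With made-up counts of 95 operations packed into 20 bundles, `alu_utilization(95, 20)` gives 0.95, i.e. 95% of the slots are in use.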
Actually I did boot this machine into Windows 7 64-bit. (I have a pretty neat network boot iSCSI setup which I describe in the previous blog entry: http://blog.zorinaq.com/?e=41) I did try running ighashgpu. I expected it to detect some number of GPUs, but it detected only 1. Supposedly Crossfire needs to be disabled, but I cannot find any Crossfire option in the Catalyst control panel or whatever it is called. Can you send me a screenshot of where this option is supposed to be? I googled around but quickly gave up as this was too frustrating and went back to Linux to continue working on whitepixel.
AFAIK, it's impossible to disable Crossfire under Windows. However, it isn't a problem there -- it's possible to use a 5970 with active Crossfire and Catalyst 10.7-10.11 under Windows using CAL/IL only. At least with one 5970 in the system it works.
Once you have several GPUs, more tricks are required to make them recognized by the CAL layer/Windows drivers. First, you need to attach a dummy plug/monitor to each GPU. Then you need to extend the desktop to all GPUs you want to use (Control Panel\Appearance and Personalization\Display\Screen Resolution). After that you'll have a very large virtual desktop, and theoretically all GPUs should be recognized by CAL. However, in reality it is usually only possible to use 4 of them (even if more were recognized) -- anything above that doesn't work reliably; the system usually locks up after several seconds of work. Extremely annoying.
I guess if this behavior doesn't change with the 6990 release, then there is zero chance it'll be fixed in the coming year.
I don't think there is any theoretical limit on the number of GPUs supported. But after a certain point, resource scarcity will tend to limit the number of GPUs supported.
OK, I'll forward this to our customers:
"ATI itself has no idea what the maximum number of ATI GPUs supported within a single system by the Windows drivers is."