
pwvdendr
Adept II

maximum number of GPUs?

Is there a maximum number of GPUs that can be used for OpenCL computing?

I heard there is a hard limit of 8 with the current Linux drivers, but I'm not sure about Windows. Technically it's not that difficult to build a compute node consisting of 7 dual-GPU cards on a single motherboard (say 7x HD6990, or soon even 7x HD7990). But I want to verify that the drivers will actually support this under Windows, since I heard they don't support it under Linux (only up to 8).
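One way to see what a driver actually exposes is to count the OpenCL GPU devices, e.g. from the output of the `clinfo` tool. Here is a minimal, self-contained sketch; the `SAMPLE` text is a hypothetical stand-in for real `clinfo` output (whose exact formatting varies by version), not a capture from an actual 8-GPU machine:

```python
# Count GPU devices in clinfo-style output. SAMPLE is hypothetical
# stand-in text; on a real machine you would feed in `clinfo` output.
SAMPLE = """\
  Platform Name: AMD Accelerated Parallel Processing
  Device Type: CL_DEVICE_TYPE_GPU
  Device Type: CL_DEVICE_TYPE_GPU
  Device Type: CL_DEVICE_TYPE_CPU
"""

def count_gpus(clinfo_output):
    """Count devices reported with a GPU device type."""
    return sum(
        1
        for line in clinfo_output.splitlines()
        if "Device Type" in line and "GPU" in line
    )

print(count_gpus(SAMPLE))  # 2 for the sample above
```

If the number printed is lower than the number of cards physically installed, the limit being discussed in this thread is likely the cause.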

54 Replies
Skysnake
Adept II

As far as I know, there is still the problem of too few PCI-E address bits.

But I am also very interested in a solution. As far as I know, active PCI-E riser cards could be one.

0 Likes

Could you give a reference or more details? I know you need PCI-e extender cables to connect them, for space reasons (is this the same as riser cards?) but I don't see how this would affect address bits in any way.

0 Likes
davibu
Journeyman III

Up to now, I have never seen anyone use more than 8 GPUs in a single system (8x single-GPU cards or 4x dual-GPU cards). You can check the LuxMark results database for a few examples: http://www.luxrender.net/luxmark/top/top20/Sala/GPU

I assume it is a hardware limit (not a software one).

0 Likes
jross
Adept I

I don't believe it's limited by PCI address bits (at least not within reason). I think the only hardware limit you have to worry about is BIOS memory space. I could be wrong, but I believe graphics devices grab a larger chunk of BIOS memory than, say, your USB controller.

It is a software issue. People have made several attempts at adding more than 4 dual-GPU or 8 single-GPU cards to a single machine without much success. Until you can demonstrate to AMD that they will make a lot of money in the 8+ GPU workstation business, they're probably not going to dedicate driver-development time to it.

Windows and Linux (X Server) could also have software issues. Hardly anybody ever tries 8+ GPUs.

However, I don't think we've ever received a straight answer from AMD on whether they're working on the support. I'm still very interested in hearing if it's possible.

0 Likes
Meteorhead
Challenger

It is true. GPU cards each take hold of 256 MB of BIOS address space, and if you've got 8 GPUs, that's 2048 MB just for GPUs. You also have SATA controllers, a sound card, an ethernet controller, a USB hub... and many other things that take up BIOS memory. In general it is almost impossible to free up enough memory for a 9th GPU, and definitely not for a 10th.

The only solution would be to create a 64-bit BIOS, or to solve the issue of GPUs taking up so much BIOS memory.

We would have built such a serious machine (16-32 GPUs per node), but it's impossible.
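The back-of-the-envelope arithmetic above can be sketched as follows; every number here is an illustrative assumption, not a measurement from any particular board:

```python
# Rough sketch of 32-bit MMIO address-space budgeting. All numbers are
# illustrative assumptions, not measurements from a specific board.
MMIO_WINDOW_MB = 2560        # space below 4 GB available for devices (assumed)
GPU_BAR_MB = 256             # large memory BAR per GPU of that era
OTHER_DEVICES_MB = 256       # SATA, USB, NIC, audio, ... (assumed total)

def max_gpus(window_mb=MMIO_WINDOW_MB, bar_mb=GPU_BAR_MB,
             other_mb=OTHER_DEVICES_MB):
    """How many GPUs fit before the 32-bit MMIO window is exhausted."""
    return (window_mb - other_mb) // bar_mb

print(max_gpus())  # with these assumed numbers: 9
```

The real window varies per chipset and BIOS; the point is only that eight 256 MB BARs plus everything else exhausts a sub-4 GB window very quickly, which matches the 8-9 card ceiling described above.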

0 Likes

Sorry guys,

you are all wrong. Look here: http://fastra2.ua.ac.be/

13 GPUs in one machine.

I have been trying for 1.5 years to talk with AMD about such a project with AMD GPUs, but no luck. Not at the 2011 CeBIT, nor at the 2012 CeBIT. I have seen the AMD guys, but they looked pretty busy at the XFX exhibition stand.

So if anybody from AMD reads this, send me a mail if you are interested.

0 Likes

I have to say: excellent work, Skysnake! I have read through the specs and technical details, and it looks really neat. The only thing I am unsure of: if you screw the cards into their proper place on the back of the chassis, the PCI-E connectors of the video cards will actually insert into a proper slot on the board, and there is no room for the flexible riser. So how can flexible PCI-E risers solve this? If you could post a picture of (or explain) how that's done, that would be cool.

We are in the design phase of creating a similar, GPU-packed machine, and the most suitable board (the well-known TYAN board) was dropped due to bad experiences with support and QoS from partners of ours. This ASUS motherboard is among the best for single-CPU solutions, but it cannot hold enough RAM and it doesn't have enough "real" PCI-E lanes. Dual-socket boards that hold a lot of PCI-E slots are extremely rare, and it would also be nice to have 100+ GB RAM for more than 12 GPUs. It would be interesting to have more VRAM than RAM.

It sounds somewhat discouraging to hear that AMD is not interested in building such a machine. It would look nice on just about any website to show that it is possible to build such beasts. Our institute has applications that could utilize such single-node/many-GPU workers. We would consider using PCI-E extender boxes (CUBIX), but the very same problems arise that arose in FASTRA I-II.

Skysnake, the site mentions that it is not verified whether the I/O port space is really necessary. It would be good to know, since that seems to be the only reason the 13-GPU limit exists. (I don't know how all this can be put together with fglrx; I fear a lot more complications would arise, but I may be wrong.) With PCI-E 3.0 coming about, 32 GPUs is still viable from a bandwidth point of view.

0 Likes

Sorry, it is not my project, so no pictures from me, but you can find the picture you're looking for on the website:

http://fastra2.ua.ac.be/?page_id=38

http://fastra2.ua.ac.be/wp-content/gallery/fastra2/thumbs/thumbs_IMG_0422.JPG

The riser cables are just for physical reasons: you cannot put 2-slot cards into 1-slot spacing. In the FASTRA II project they solved the BIOS/driver problem with a single card that is different from the others; you use this card for booting.

And since last night, I may know how to solve the booting problem easily. But I have to confirm this in the next days/weeks...

But it is still not clear to me whether there are more driver/BIOS problems and so on. I don't think there will be an OpenCL problem, but you never know until you do it.

Meteorhead wrote:

We are in the design phase of creating a similar, GPU-packed machine, and the most suitable board (the well-known TYAN board) was dropped due to bad experiences with support and QoS from partners of ours. This ASUS motherboard is among the best for single-CPU solutions, but it cannot hold enough RAM and it doesn't have enough "real" PCI-E lanes. Dual-socket boards that hold a lot of PCI-E slots are extremely rare, and it would also be nice to have 100+ GB RAM for more than 12 GPUs. It would be interesting to have more VRAM than RAM.

Can you say what you are doing? I am just a physics/computer science student at the University of Heidelberg.

I know a solution for your problem, but the product is not released yet, so I am not able to say anything about it; but you should have 7-9 PCI-E 16x slots and more than enough bandwidth and DIMM slots. I talked with the company at CeBIT, and they could perhaps give me such a board, but then I still need 2 CPUs at something like 500-1k each -.-

100+ GB is not possible. As far as I know, 8 PCI-E slots are the maximum, so even if you use dual-GPU cards (the 7990, which is not released yet) with the maximum of 6 GB RAM each, you would need a 9th card for that. But 8 could be possible, though very, very, very expensive (more or less FirePro/Tesla territory).

It sounds somewhat discouraging to hear that AMD is not interested in building such a machine. It would look nice on just about any website to show that it is possible to build such beasts. Our institute has applications that could utilize such single-node/many-GPU workers. We would consider using PCI-E extender boxes (CUBIX), but the very same problems arise that arose in FASTRA I-II.

Yeah, at the 2011 CeBIT, AMD was only in the reseller area, and they didn't let me in -.- So I only have a card with an email address, but absolutely no response from there. This year I talked with their board partners instead, and they are much more interested. A BIG thanks to XFX at this point: they sponsored me a card, so I am able to have a look at the new GCN architecture.

Skysnake, the site mentions that it is not verified whether the I/O port space is really necessary. It would be good to know, since that seems to be the only reason the 13-GPU limit exists. (I don't know how all this can be put together with fglrx; I fear a lot more complications would arise, but I may be wrong.) With PCI-E 3.0 coming about, 32 GPUs is still viable from a bandwidth point of view.

Bandwidth is always such a point. There are applications out there where 16x PCI-E 3.0 is still not fast enough, and others where 8 PCI-E 1.0 lanes are enough... So there is no easy answer to a question like this. The same goes for whether an HD5870 or a Tesla card is the better solution: it depends. In most cases the Tesla is faster, but there are also problems where the Tesla only sees the tail lights of the old VLIW5/4 cards. Have a look at Bitcoin mining and so on.

I really don't know what the problem with the BIOS/driver is, but I know that SuperMicro uses active riser cards for their 6-GPU machine. So with their solution it should perhaps be possible to use 6 dual-GPU cards, but I don't know. I think they have never done something like that, because they see no solution for the thermal problem on one side, and the power supply in a rack on the other.

Btw, I see no thermal problem if you do it right, and it should also be possible to get the cards into 1 slot.

And btw, who do you mean by Skysnake?

0 Likes

All this wouldn't be a problem if GPU vendors finally did their homework and provided drivers that work with EFI boot (without any BIOS image loading tricks).

Meteorhead wrote:

We would've built such a serious machine (16-32 GPUs per node), but it's impossible.

I don't mean to detract from what you want to do, but why not consider building a cluster of 4 nodes with 8 GPUs per node? If your algorithm/application is not I/O intensive, then a cluster seems to me to be a reasonable alternative. And if your algorithm/application is I/O intensive, wouldn't that many GPUs sharing the PCIe bus quickly saturate it, leading to a performance bottleneck? Plus the cost of such a specialized system grows exponentially compared to scaling with commodity-built systems. Sorry, maybe I'm too naive about this extreme (8+ GPU) HPC workstation.

Check out SnuCL for a free and open-source OpenCL framework that abstracts away all the GPUs in a cluster and makes them just as easy to program as a single system.

0 Likes
Meteorhead
Challenger

Forgive me... I think you misunderstood me. It is not in my interest to have more VRAM than RAM. In fact, that would complicate things: if this were simulation software for some physical system, the entire system could not fit in host memory if there were more VRAM than RAM. But if I had, let's say, 16 dual GPUs (let's forget about BIOS limitations for a second; bandwidth would be enough), that's 96 GB of VRAM. That is why it would be good to have a motherboard somewhat more serious than the ASUS P6T7. I know it's very elegant to say "forget about BIOS limitations" when that's the only complication out there, and it's extremely hard to get around this one issue alone. However, if I understood your posts correctly, practically all limitations on the number of GPUs hooked up to a single host could be lifted (even with the present BIOS of motherboards).

About this EFI boot... could someone enlighten me as to what the issue is, how it operates, and what this homework is that should be done?

0 Likes

EFI has many more capabilities and gets rid of some limitations, like the one for >3 TB HDDs. Some limitations on addressing etc. are also no longer there, but as far as I know, the system/board builders don't use these capabilities.

But I don't know if this is really true. Things like that are very difficult to say, because nearly nobody does something like this.

It is always the chicken-and-egg problem -.-

0 Likes

Having such a system, I am not even sure whether it would make sense to have all of the VRAM mirrored to system RAM at any point in time. But that's a different matter.

Regarding the EFI boot: AFAIK the drivers from both GPU vendors currently require a normal BIOS boot, the notable exception being their OS X drivers. When doing an EFI boot, the driver will fail to operate, as it expects the classical video BIOS bootstrapping to have taken place. There are some hacks around this from EFI, but nothing I have seen working reliably.

0 Likes

But it is possible, as we see with FASTRA II. The problem is only that it is not usable for everybody.

0 Likes

So do I understand correctly that EFI boot is something similar to what FASTRA II has done, but in a more standardized way? That EFI does not take care of initializing devices, but rather leaves that to the drivers themselves?

Mirroring the contents of VRAM to RAM can be avoided in most cases, but it is a hassle to work around, and in rare cases not possible at all.

0 Likes

I believe the FASTRA team modified the Linux kernel so that it ignored (or rather extended) the information that the BIOS handed to the Linux kernel. Basically, the BIOS initialization was wrong and they manually corrected it. It's a level of kernel hacking where most people cannot or won't go.

I'm not familiar with graphics bootstrapping via EFI. If it operates correctly for 8+ GPUs, then there should be no need for a Linux kernel patch.

The VRAM/RAM issue is separate from the rest. Find a system that will support enough 8 GB or 16 GB DIMMs if it's an issue. You're probably out of luck with the ASUS P6T7.

0 Likes

Just wait a while for the dual-socket 2011 boards. There will be some with lots of PCI-E 3.0 slots and lots of DIMM slots as well. The RAM problem is definitely not a problem.

It would be great if someone from AMD could write something here.

0 Likes

This thread was singled out in the AMD Developer Central Newsletter yesterday as "one of the popular topics being discussed right now" in this forum, where "you'll notice the AMD moderators to be more responsive".

Out of all the threads where 8+ AMD GPUs have come up for discussion in these forums, I don't believe the community has ever received input from AMD. We realize Skyrim multi-GPU driver support is more of a priority, but it would be a really big psychological and marketing win if AMD had a customer show off a 10-20 Tahiti GPU machine that could be used to smack around the competition.

0 Likes

jross,

This issue has been brought up with people internally, so it isn't ignored, it just isn't something we can discuss publicly.

Nice to hear.

I hope AMD involves the interested people as soon as possible, even if there is an NDA requirement.

0 Likes

Thanks, Micah.  I'm glad to hear that.  Perhaps we could discuss this confidentially.

0 Likes

Hi,

since I have some positive information, I have started gathering support for testing a 7-GPU system. It looks like I'll get everything for it, so I want to ask whether AMD would support me in evaluating the usability/stability of such a system.

My main target is to fit everything into a single case with default hardware, so no PCI-E riser cables and the like. It would be a great help if AMD could provide some hardware support, because with the GPU shortage it is really difficult to get support from the board partners.

It would be great if AMD could contact me.

0 Likes

Do you have some new information for us?

nVidia announced their new Tesla cards yesterday. I hope we hear something about the FirePro very soon.

They also named a GPU with up to 4 GPU dies on a single PCB.

So I really hope we hear something about the multi-GPU support situation from AMD.

I now have everything together to write an introduction on how to build a 24/7 7-GPU system that fits into a single case. Only the 6 remaining 7970s are missing now.

It would be great if I could hear something from you.

0 Likes

it would be a really big psychological and marketing win if AMD had a customer show off a 10-20 Tahiti GPU machine that could be used to smack around the competition.

Not 10-20, but I'm currently building one with 8x HD7970, given the limit discussed here. The machine is as good as ready; I'm just looking for the best offer to buy the 8 cards to plug in.

Yes, 8x HD7990 would have been more awesome, but hey, 8x HD7970 is already really, really powerful... and beats FASTRA II in all domains (hardware cost, speed, power consumption and BIOS issues). Then again, that's more AMD's merit than mine.

But I can say that in the academic world, there's quite a market for large GPU computers. Many scientific algorithms/simulations, whether in statistics, biology or engineering, would need serious adaptations to make them run across multiple computers, but are nevertheless fully data-parallel and could easily run on any number of GPUs in the same machine. But until these easy adaptations are made, our university has decided to spend another 1M euro altogether on a new large CPU-based supercomputing cluster, which will mainly be used for data-parallel tasks...
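The fully data-parallel case described above really is the easy one on a single machine: each GPU just gets a contiguous chunk of the problem. A minimal sketch of that split (plain Python; the device handles and kernel launches are deliberately elided, since they depend on the OpenCL bindings used):

```python
def partition(n_items, n_devices):
    """Split n_items into n_devices contiguous (start, count) chunks,
    spreading the remainder over the first few devices."""
    base, rem = divmod(n_items, n_devices)
    chunks, start = [], 0
    for d in range(n_devices):
        count = base + (1 if d < rem else 0)
        chunks.append((start, count))
        start += count
    return chunks

# e.g. 1000003 work items over 8 GPUs:
for dev, (start, count) in enumerate(partition(1000003, 8)):
    pass  # enqueue a kernel for items [start, start+count) on device `dev`
```

No chunk differs from another by more than one item, so the devices finish at roughly the same time for uniform per-item cost.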

GPGPU support for scientific applications is slowly arriving (I read last month that MATLAB now has GPGPU support), but it's now up to AMD to make sure they don't get CUDA'd again this time.

0 Likes

Could you send me some information about what you did to get the machine operating, and so on?

I would be very happy about that information.

0 Likes

Anything new, pwvdendr?

I am very interested in your work.

0 Likes

Once the machine is up and running and has survived the testing stage, I'll be happy to provide elaborate information, but of course I want to make sure first that no further complications arise.

The machine should have been up and running weeks ago, but Belgian customs clearance is not doing its job well, and I've been waiting for nearly 3 weeks now for the last parts. They've been stuck at Belgian customs since March 26...

So until those people finally do their job, you and I will need some more patience.

0 Likes

Poor man. Something like this really sucks. 😕

I hope you get what you need very soon. But could you send me some information about what you did to make more than 4 GPUs work in one system? If possible, please send me a mail at my university mail address.

Thank you very much.

If there is an NDA or something like that, I respect it, even if I hope you can give me some hints so that I can find the solution myself.

0 Likes

They have arrived in the meantime.

More than 4 GPUs? What is the problem with that? E.g. 5 GPUs worked out of the box in every instance I tried. What problem are you hitting?

And I had to look up what an NDA means; don't worry, I'm just a science PhD student like you.

0 Likes

Just read this topic and the link to the FASTRA project, and you'll see that there are some kinds of problems.

But yeah, it seems it is not so hard to make up to 6 GPUs work, but I don't know how it looks beyond 6. There is no machine out there that you can buy with more than 6 GPUs.

0 Likes

The Windows drivers seem to contain a bug which prevents all GPUs from working, but my topic about it doesn't seem to catch any answers.

Under Linux, 8 GPUs worked, and I had no more issues than I usually have when trying to run Linux. Then again, I really don't like working Linux-only, and I strongly hope that the issue with the Windows drivers will be fixed soon.

0 Likes

Very interesting.

So you have NO problems with Linux?

Can you post your hardware? I think you have a UEFI system. Perhaps the MB manufacturers have solved the problem without knowing it.

Did you build the Linux kernel yourself?

Just post as much information as you can, please. Solving your problem without any information is not possible.

On the other hand: how did you solve the physical fitting problem? PCI-E riser cables, or a single-slot design?

Have you checked whether an OpenCL program can use all 8 devices? If yes, please let us know. Please make sure that you use all 8 devices at the same time.

0 Likes

So you have NO problems with Linux?

Several, but all solvable with my limited Linux knowledge (just following some good guides). I wrote up the steps for installing the drivers here. It's tedious, and I've had to re-install a few times for unclear reasons, but in the end this simple instruction set works to get everything up and running.

Can you post your hardware? I think you have a UEFI system. Perhaps the MB manufacturers have solved the problem without knowing it.

The motherboard (MSI Big Bang Marshal B3) does clearly struggle with the cards. When all 8 are attached, for example, I cannot enter the BIOS. But the machine does boot up (this only breaks at 9+ cards, iirc). I'm writing an article about it with all the hardware, but the only components relevant to the Windows driver problem I mentioned seem to be the GPUs themselves, which are 8x HD7970.

Did you build the Linux kernel yourself?

Nope, stock Ubuntu 12.04 beta 2 with all default settings. But no updates: updating broke it.

On the other hand: how did you solve the physical fitting problem? PCI-E riser cables, or a single-slot design?

PCI-e extender cables with external power support (since the MB is not supposed to be able to feed 8x 75 W, and you might fry it with unpowered extenders). But details like this will be in the article I'm writing.

Have you checked whether an OpenCL program can use all 8 devices? If yes, please let us know.

Of course I have. Computations have been up and running all weekend.

0 Likes

Great news!

So I have to call some guys I know and inform them that a similar project can now start.

Linux should be OK for me. The lack of Windows hurts, but it is not that important.

So perhaps I can also show something in the near future about an 8-GPU system.

If you have a nice application, it would be nice if you could post something about it.

Also, please let us know when you have finished your article. I'm really interested in reading it.

0 Likes
Skysnake
Adept II

Ok, the FirePros are announced. Is there any new information about this "multi-GPU" project from AMD???

I am still here, and would test everything, write a how-to and so on, if AMD would provide me with the GPUs.

I hope we hear something about this AMD project very soon. Xeon Phi has now been paper-launched, and GK110 will be there in Q4 this year or Q1 2013.

0 Likes
Starglider
Adept I

I am running a grid compute team at a large company and I am trying (once again) to write a business case for using AMD GPUs in the next hardware tranche. It is difficult to justify because of pointless driver limitations. If we could plug eight 7990s into a Tyan barebones and have all 16 GPUs actually work reliably in OpenCL, then I would place an order for 20 nodes (i.e. 320 GPU sales for AMD) immediately. As usual though, AMD has decent hardware but utterly defeats itself in the professional market with hopeless drivers. There is the pointless and unnecessary 8-GPU limit (which Nvidia cards don't have), the pointless and unnecessary crippling of OpenCL on the 5970, 6990 and most likely the 7990 as well (the second GPU is unusable for compute because Crossfire is locked on; a fix has been promised for the last three years and never delivered, while the Nvidia GTX 590 works just fine), and the bizarre performance drops for even modest multi-GPU setups (again, Nvidia has no such issue).

If AMD could fix even just the 8-GPU limit, then we could make a reasonable case that double-density (16-GPU) AMD nodes installed now will still have superior price/performance to the standard (8x GK110) Tesla nodes available later this year, and go ahead with the purchase. Being able to use dual-GPU cards to avoid the engineering time of installing PCI-E extenders and splitters would make the case quite compelling. Unfortunately, several years of complete AMD indifference to these issues suggest that they will continue to go unresolved and we will end up buying more Nvidia as usual. That's about half a million $ a year from just one customer that could go to AMD, but goes to Nvidia instead because of driver shortcomings.

Hi,

Are you saying that OpenCL is not working on the second GPU of those dual-chip cards?

I'm curious, because I've only ever tested my OpenCL programs on multiple single-chip cards.

0 Likes

It is working, just not officially. I have not found a single use case where it actually produced incorrect results. But if someone has such a case, please (re-)post it; I might have overlooked it. We have multiple 5970s in a single machine and they work fine.

0 Likes

Yeah, that is right, but again, AMD says "yeah, we're doing something" and then you hear nothing from them...

I really can't understand what they're doing.

0 Likes