17 Replies Latest reply on May 22, 2011 10:03 AM by galmok

    Proper GPU workstation

    Meteorhead
      any intention?

      Dear All!

      From time to time I look around the market to see whether there are any pure AMD-based workstation solutions, or just standalone pieces of hardware that can be put together into something capable.

      I am somewhat puzzled why there are practically no AMD-based workstations, and why it is impossible to build one. Intel and NVIDIA have several good solutions. The nF200 chip is extremely useful for splitting PCI-E lanes: it creates two x16 slots from one, both working at x16, but they share the 16 lanes, so when both are running they compete for bus bandwidth. Naturally the nF200 is non-existent on AMD chipset motherboards, but what is more surprising is that nothing of the like exists on AMD motherboards.

      If I wish to create a dense AMD cluster, I would need tons of host machines. I could not have any hosts with 4, 6 or 7 PCI-E x16 slots, each of them running at x16, even if they are split lanes competing with each other. The 990FX chipset has 32 lanes, and they are always laid out as x16/x8/x8/x4; there are no x8/x8/x8/x8 motherboards, not to mention x16/x16/x16/x16 with an nF200-like solution.

      Most likely there is no time left to develop something like this for the 990FX chipset, but the next chipset will most likely feature PCI-E 3.0, so fewer slots will be required for the same bandwidth... In short, are there any plans to create something so that one could hook up 4-6 dual-GPU cards to a single host with decent bandwidth?
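
      A rough way to check the PCI-E 3.0 remark: a minimal sketch, assuming only the published per-lane signalling rates (5 GT/s with 8b/10b encoding for PCI-E 2.0, 8 GT/s with 128b/130b encoding for PCI-E 3.0) and ignoring protocol overhead. The name `link_bandwidth_mb_s` is made up for this illustration.

```python
# Raw signalling rate (GT/s) and line-encoding efficiency per PCI-E generation.
PCIE_GENERATIONS = {
    2: (5.0, 8 / 10),     # 8b/10b encoding
    3: (8.0, 128 / 130),  # 128b/130b encoding
}


def link_bandwidth_mb_s(generation, lanes):
    """Theoretical one-direction bandwidth of a link in MB/s (protocol overhead ignored)."""
    gt_per_s, efficiency = PCIE_GENERATIONS[generation]
    per_lane_mb_s = gt_per_s * 1000 * efficiency / 8  # megabits -> megabytes
    return per_lane_mb_s * lanes


gen2_x16 = link_bandwidth_mb_s(2, 16)
gen3_x8 = link_bandwidth_mb_s(3, 8)
print(f"PCI-E 2.0 x16: {gen2_x16:.0f} MB/s")
print(f"PCI-E 3.0 x8:  {gen3_x8:.0f} MB/s")
```

      Under these assumptions an x8 PCI-E 3.0 link comes out within about 2% of an x16 PCI-E 2.0 link, so the same per-card bandwidth would need roughly half the lanes.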

      PS: Does anyone know of PCI-E extenders that have some lane-splitting capability? Or any that are capable of creating two slots from one?

      http://www.cubix.com/product/gpu-xpander-rack-mount-16

      This Cubix solution looks really good, but I do not know how they get one host card to provide two slots in the rackmount extender. Is it really so special or custom-made that one cannot find it standalone on the market? Many multi-GPU applications would benefit from having as many GPUs as possible communicate without going over the LAN. Enterprise animation studios, research centers, etc. could benefit from such solutions. This isn't black magic though; one could build it at home, I just cannot find the means to split slots and lanes. Ideas?

        • Proper GPU workstation
          nou

          What about this motherboard? http://www.asus.com/Motherboards/AMD_AM3/Crosshair_IV_Extreme/

          It should be capable of x16/x16/x8/x8 slots.

          BTW, a normal x16 card can work in an x8, x4 and even an x1 slot. There are people who **** the end of a normal x4 slot and put a full x16 card in it, and it works.

          Maybe it is possible to split lanes without any chip. And the nF200 switches lanes, so each card can communicate at full speed, but only one card at a time.

            • Proper GPU workstation
              Meteorhead

              x16/x16/x8/x8 does sound decent, but I feel an asymmetric speed layout is a bit of a waste. Multi-GPU application speed will be determined by the slowest transfer bus, so instead of two slots at x16 and two at x8, I'd rather have two more x8 slots, giving 6 slots of x8. Something like this:

              http://www.supermicro.com/products/motherboard/QPI/5500/X8DTH-6F.cfm

              This is, however, yet another Intel board. Having something like this inside a 2U chassis with low-profile PCI-E extenders would be best. There is another neat board:

              http://www.supermicro.com/products/motherboard/QPI/5500/X8DTG-QF.cfm

              This is a normal board with four x16 slots, but the board is proprietary and is meant for a 4U rackmount chassis. Or something more standard:

              http://www.tyan.com/datasheets/d_S7025.pdf

              This is the SSI EEB form factor (whatever that may be). Even with Intel it is hard to find something with minimal compromises, but with an AMD platform... it is absolutely impossible.
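
              To put numbers on the layout argument, here is a minimal sketch that assumes PCI-E 2.0 at a theoretical 500 MB/s per lane per direction, and that (as argued above) the slowest slot gates a multi-GPU application. The helper `summarize_layout` is hypothetical, and the slot lists are the two layouts discussed in this post, not the specification of any particular board.

```python
PCIE2_LANE_MB_S = 500  # PCI-E 2.0: theoretical MB/s per lane, per direction


def summarize_layout(slot_widths):
    """Return lanes used, slot count and the slowest slot's bandwidth in MB/s."""
    return {
        "lanes": sum(slot_widths),
        "slots": len(slot_widths),
        "slowest_mb_s": min(slot_widths) * PCIE2_LANE_MB_S,
    }


asymmetric = summarize_layout([16, 16, 8, 8])
symmetric = summarize_layout([8, 8, 8, 8, 8, 8])
print("x16/x16/x8/x8:", asymmetric)
print("6 slots of x8:", symmetric)
```

              Both layouts use 48 lanes and both are limited by an x8 slot, but the symmetric one hosts six cards instead of four.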

                • Proper GPU workstation
                  rollyng

                   

                  Hi Meteorhead,

                  I own an X8DTG-QF with 4x HD 6990s, and I am using Lian Li's PC-P80 tower chassis, which is fully compatible with the so-called proprietary size.

                  http://www.lian-li.com/v2/en/product/product06.php?pr_index=131&cl_index=1&sc_index=25&ss_index=61

                  Hope you find it useful!

                  *beer*


                    • Proper GPU workstation
                      laobrasuca

                      As far as I know, there are some vendors working on computing server blades mixing up to 8 Opterons with up to 8 FireStream cards (up to 16 TFLOPS of GPU power per blade), and blades with 2 FireStreams already exist. However, I don't know anything about prices or release dates. As rollyng suggested, look for SuperMicro stuff. VA-Technologies seems to do some stuff with AMD FireStream cards too. If you dig a little further you will maybe find a couple of other vendors as well.

                      • Proper GPU workstation
                        Meteorhead

                        Rollyng, may I ask how you placed your cards so they do not fry each other? The 6990 also takes in air from the fan side (just like my three 5970s), and the cards really fry each other. I had to make gaps between the cards to make room for airflow.

                        Unfortunately our server room policy only allows rackmount chassis, so I'll have to look for one accordingly.

                          • Proper GPU workstation
                            rollyng

                            My system is still under a fairly light load. I see that if I run LuxMark a couple of times the fan noise starts to get loud, so I run the GPUs at the default 830 MHz setting, and I think it can handle the heat safely.

                            I have not added extra space between the cards, but I see there is a rubber pad at the back of each card that provides a "small" gap when they are stacked next to each other.

                            If you really need a rack case with the maximum number of GPUs supported, I think the only option is this Tyan beast!

                            http://www.tyan.com/product_SKU_spec.aspx?ProductType=BB&pid=412&SKU=600000188

                              • Proper GPU workstation
                                Meteorhead

                                This does look intimidating indeed. I am a little worried about the airflow (as I mentioned, it is unhealthy if cards are packed close to each other). I very much hope AMD will not continue the cooler strategy it started with the 6990. The way air exits back into the chassis is very unwanted, not to mention that in a chassis like this, the enormous fans propelling air in the opposite direction greatly degrade the cooling of the rear chip. (Air will not be able to exit the card in the internal direction.)

                                Does anyone have any experience with such things? (The rubber pad does sound good indeed.) Could you run LuxMark in interactive mode for about 30 minutes, and check what temperatures the adapters are at once the thermal steady state is reached? I am very curious.

                                  • Proper GPU workstation
                                    rollyng

                                    Hi, sure, I can do the test, but I would like to know two things first:

                                    (1) Is there an iteration option for LuxMark to run nonstop for 30 minutes? A single run of the benchmark only lasts 120 seconds.

                                    (2) Is there an aticonfig command to report the hardware temperature?

                                    Thanks!

                                      • Proper GPU workstation
                                        ED1980

                                         

                                        To display the temperature of your cores:

                                         

                                        aticonfig --odgt --adapter=all
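
                                        For a long run it may be handier to collect the readings than to read them off by eye. Below is a minimal sketch, assuming the command prints an "Adapter N - name" line followed by a "Sensor 0: Temperature - X C" line for each adapter. The helper names (`parse_odgt`, `read_temperatures`, `peak_temperatures`) are made up for this example, and only the flags shown above are used.

```python
import re
import subprocess

ADAPTER_RE = re.compile(r"Adapter\s+(\d+)\s+-\s+(.+)")
TEMP_RE = re.compile(r"Temperature\s+-\s+([\d.]+)\s*C")


def parse_odgt(text):
    """Map adapter index -> temperature in degrees C from the command's text output."""
    temps = {}
    current = None
    for line in text.splitlines():
        adapter = ADAPTER_RE.search(line)
        if adapter:
            current = int(adapter.group(1))
            continue
        temp = TEMP_RE.search(line)
        if temp and current is not None:
            temps[current] = float(temp.group(1))
    return temps


def read_temperatures():
    """Run the command from the post above and parse what it prints."""
    result = subprocess.run(
        ["aticonfig", "--odgt", "--adapter=all"],
        capture_output=True, text=True, check=True,
    )
    return parse_odgt(result.stdout)


def peak_temperatures(readings):
    """Per-adapter maximum over an iterable of {adapter: temperature} readings."""
    peaks = {}
    for reading in readings:
        for adapter, temp in reading.items():
            peaks[adapter] = max(temp, peaks.get(adapter, temp))
    return peaks
```

                                        Calling `read_temperatures()` every 30 seconds or so during a benchmark run and passing the collected readings to `peak_temperatures` would give the hottest value seen per adapter.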

                                          • Proper GPU workstation
                                            rollyng

                                             

                                            Hi,

                                            Here is the output while running LuxMark for 15 minutes:

                                            rolly@rolly-X8DTG-QF:~$ aticonfig --odgt --adapter=all

                                            Adapter 0 - AMD Radeon HD 6990
                                                        Sensor 0: Temperature - 88.00 C

                                            Adapter 1 - AMD Radeon HD 6990
                                                        Sensor 0: Temperature - 94.00 C

                                            Adapter 2 - AMD Radeon HD 6990
                                                        Sensor 0: Temperature - 97.50 C

                                            Adapter 3 - AMD Radeon HD 6990
                                                        Sensor 0: Temperature - 85.00 C

                                            Adapter 4 - AMD Radeon HD 6990
                                                        Sensor 0: Temperature - 92.50 C

                                            Adapter 5 - AMD Radeon HD 6990
                                                        Sensor 0: Temperature - 80.00 C

                                            Adapter 6 - AMD Radeon HD 6990
                                                        Sensor 0: Temperature - 92.50 C

                                            and the temperature while idle,

                                            Adapter 0 - AMD Radeon HD 6990
                                                        Sensor 0: Temperature - 60.00 C

                                            Adapter 1 - AMD Radeon HD 6990
                                                        Sensor 0: Temperature - 60.50 C

                                            Adapter 2 - AMD Radeon HD 6990
                                                        Sensor 0: Temperature - 68.50 C

                                            Adapter 3 - AMD Radeon HD 6990
                                                        Sensor 0: Temperature - 55.50 C

                                            Adapter 4 - AMD Radeon HD 6990
                                                        Sensor 0: Temperature - 52.50 C

                                            Adapter 5 - AMD Radeon HD 6990
                                                        Sensor 0: Temperature - 41.00 C

                                            Adapter 6 - AMD Radeon HD 6990
                                                        Sensor 0: Temperature - 55.50 C

                                            Adapter 7 - AMD Radeon HD 6990
                                                        Sensor 0: Temperature - 44.00 C

                                            I have yet to test Furmark because I have not installed wine.
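
                                            As a quick summary of the two listings above, here is a small sketch that computes the rise from idle to load per adapter. The numbers are copied from this post; the function name `temperature_rise` is made up for the example.

```python
# Readings copied from the post above (degrees C). Adapter 7 is absent
# from the load listing, so it is skipped in the comparison.
load = {0: 88.0, 1: 94.0, 2: 97.5, 3: 85.0, 4: 92.5, 5: 80.0, 6: 92.5}
idle = {0: 60.0, 1: 60.5, 2: 68.5, 3: 55.5, 4: 52.5, 5: 41.0, 6: 55.5, 7: 44.0}


def temperature_rise(load_temps, idle_temps):
    """Load minus idle temperature for every adapter present in both readings."""
    return {
        adapter: load_temps[adapter] - idle_temps[adapter]
        for adapter in sorted(load_temps)
        if adapter in idle_temps
    }


rise = temperature_rise(load, idle)
hottest = max(load, key=load.get)
print("rise per adapter:", rise)
print("hottest under load: adapter", hottest, "at", load[hottest], "C")
```

                                            With these numbers the rise ranges from 28 to 40 degrees, and adapter 2 is the hottest under load at 97.5 C.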

                                          • Proper GPU workstation
                                            Meteorhead

                                            As for LuxMark, there is the benchmark mode and the interactive mode. Interactive mode runs indefinitely until you quit. Or, if I'm not mistaken, FurMark exists for Linux as well, but LuxMark will be just as good.

                                • Proper GPU workstation
                                  Meteorhead

                                  (ToBeDeleted: I cannot believe someone is advertising on a dev forum, and a HAIR STRAIGHTENER at that, in a DEV FORUM!!! Boy, what an own goal...)

                                  About the temps, that is exactly what I was talking about. Such temperatures are intolerable in a production worker node, or simply a machine used for computation rather than development. 90-97 degrees is just too much to run at for a long time. (And by a long time I mean a machine under constant 100% load.) Cards inserted right next to each other just don't seem like a good geometry for cooling.

                                  AMD, please-oh-please-oh-please, when you make 7990 cards, do not make the cooler fully double width! The air outlet at the back could be fully double width, but please design a cooler that can be used in servers: strictly front-to-back airflow, and either let the cards take in air solely from the front (and not from the fan side heated by the neighbouring card), or leave a gap between them to let cool air be blown in between. Neither the 5970 nor the 6990 has server-suited cooling. This is an enormous mistake. The present cooling doesn't really allow CrossFire with 5970s or 6990s either, so even gamers would welcome a little gap in between or true front-to-back airflow.

                                  And to rollyng: I had the same issue with my 5970s. I unscrewed two of them from their places and inserted a screw between the cards. A little tension builds up, but load temperatures dropped from 99C to 80C. Roughly 20 degrees lower operating temperatures increase the lifetime of the card from about 4 months to 3 years. 80C is still unhealthy in the long run. I would welcome 75C or lower as the operating temperature of cards inserted right next to each other.

                                  PS: FurMark crashes for me under Wine, but give it a try. These numbers are good enough to make my point.

                                    • Proper GPU workstation
                                      nou

                                      Well, all Radeons are designed for gamers, so AMD doesn't consider that someone will get four 6990s and put them into one case. You can also see that on new CrossFire motherboards, where you can put two cards in, one at the top and the second at the bottom.

                                      Maybe you should consider water cooling like this one? http://www.techpowerup.com/144010/PowerColor-Readies-Radeon-HD-6990-LCS-Water-Cooling-Ready-Graphics-Card.html

                                        • Proper GPU workstation
                                          Meteorhead

                                          I have thought about using water cooling; the only problem is that I have no experience with it. I do not know how two water-cooled GPUs fit into slots that are not neighbouring, or whether I can fit them right next to each other (some water-cooled GPUs have single-slot backplates with all video connectors in one row, allowing them to be placed right next to each other, such as the one you showed)... And I am no system builder. I am a physicist, and as such I live off money I earn by applying to national research funds. One states what one wishes to achieve and how much money is needed for it. I do not have the liberty of playing around with equipment at these prices, such as buying 8 water-cooled 6990s and fitting them into that Tyan beast with some water cooling installed. I do not know how much hassle it is to disassemble a machine that is water-cooled; naturally it's not as simple as with a normal video card, which you can just plug in and out.

                                          I know these things are designed for gamers, but AMD could design something in between, the same "sweet spot" strategy as with the dies themselves. If NVIDIA has passively cooled Teslas, I'm sure AMD could come up with something similarly surprising and functional.

                                          I found a separate 3U chassis (unfortunately no link at hand) which is a complete water cooling system with water intake and output (enormous radiator inside). Those would be very useful, but I do not know how easy it is to disconnect the radiator from the machine if maintenance is required.

                                          Could some AMD employee reflect on how impossible such a request is? (Concerning thermal design for dual-GPU solutions.)

                                        • Proper GPU workstation
                                          rollyng

                                           

                                          How annoying to have such ad posts in between technical discussions.

                                          OK, I did install FurMark 1.9.0 with Wine 1.3.20, but it seems to stress only one GPU, not all of them. It runs fine, but I cannot make use of all the GPUs.

                                          Rerunning LuxMark shows temperatures of up to 102 degrees C... I must admit that extra air circulation is needed in my chassis, and I am adding a couple of BS-06 to it soon.

                                          http://www.lian-li.com/v2/en/product/product06.php?pr_index=233&cl_index=2&sc_index=34&ss_index=83

                                            • Proper GPU workstation
                                              Meteorhead

                                              Extra air circulation doesn't help much if there's no space between the cards. The problem is that one card takes in air from the fan side, and that air is heated by the neighbouring card. You cannot cool a card with 90-degree air. The Windows installation of FurMark adds two icons: FurMark and FurMark multi-gpu.

                                              LuxMark shows well enough that the cooling is insufficient for long runs.