4 Replies Latest reply on Dec 13, 2017 3:40 AM by mex-74

    Why is the RX Vega 64 (air cooled) unstable in default mode?

    mex-74

      This is my card

      MSI RX Vega 64 | TechPowerUp GPU Database

      The room temperature is 26 °C.

      In Balanced mode (with both BIOS 1 and BIOS 2) the card does not pass the 3DMark Time Spy stress test: after 10 minutes the test hangs and crashes with the result "0%, test not passed".

      The Fire Strike stress test result is "96%, test not passed". The RX Vega 64 also hangs in many demanding games (The Witcher 3 or Hellblade).

      GPU temperature in the stress tests: 83 °C

        The tests pass (99%) only when the fan speed is set to 3300 rpm, or when P5 is selected as the maximum state (1310 MHz in applications) at the default fan speed of 2400 rpm, that is, at a GPU temperature of 76 °C.
        This happens with all driver versions up to 17.11.4.

       

      Is this normal?

       

      Config:

      Intel Core i7-4770

      Gigabyte LGA1150 GA-Z87X-D3H

      Enermax 850W

      4x8GB Kingston HX318C10FBK2/16

      SSD Samsung 850 Pro

      AMD RX Vega 64

      X-Fi Titanium HD

      Windows 10 64-bit, Fall Creators Update (1709)

        • Re: Why is the RX Vega 64 (air cooled) unstable in default mode?
          amdmatt

          Please try the following settings in Wattman and let me know if it passes the test.

           

          +50% Power Limit

          Set the GPU fan speed to maximum RPM.

           

          The Fire Strike stress test fails if the FPS fluctuates too much between runs. As the air-cooled Vega heats up, the GPU clock speed may be lowered if temperatures are not kept in check. Increasing the power limit and fan speed should help stop such fluctuation, and then you may pass this test.

           

          If your GPU is working normally in games, it is not faulty.
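
The pass criterion described above can be sketched in a few lines. This is only an illustration under two assumptions that are not stated in this thread: that the stress test reports stability as the worst loop score divided by the best loop score, and that the pass threshold is 97%. The function names and the loop scores in the usage note are hypothetical.

```python
# Assumption: a run passes when stability is at least 97%.
PASS_THRESHOLD = 97.0


def frame_rate_stability(loop_scores):
    """Return the worst loop score as a percentage of the best loop score."""
    if not loop_scores:
        raise ValueError("need at least one loop score")
    best = max(loop_scores)
    worst = min(loop_scores)
    if best <= 0:
        raise ValueError("loop scores must be positive")
    return 100.0 * worst / best


def passes_stress_test(loop_scores, threshold=PASS_THRESHOLD):
    """Return True when the stability percentage reaches the threshold."""
    return frame_rate_stability(loop_scores) >= threshold
```

With made-up scores, `passes_stress_test([5000, 4960])` gives a stability of about 99.2% and passes, while `passes_stress_test([5000, 4800])` gives 96% and fails. A card that throttles as it heats up lowers its later loop scores, which is how a thermal problem turns into a failed result.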

            • Re: Why is the RX Vega 64 (air cooled) unstable in default mode?
              mex-74

              +50% Power Limit

              maximum RPM

              AMD Radeon RX Vega 64 video card benchmark result - Intel Core i7-4770,Gigabyte Technology Co., Ltd. Z87X-D3H-CF

               

              +50% PL 4900rpm time spy extreme stress test.jpg

              GPU Temperature - 82 °C

               

              Balanced (default)

              +0% Power Limit

              2400 RPM

              AMD Radeon RX Vega 64 video card benchmark result - Intel Core i7-4770,Gigabyte Technology Co., Ltd. Z87X-D3H-CF

              0% PL 2400rpm time spy extreme stress test.jpg

              GPU Temperature - 83 °C

               

               

              My assumption is that at 2400 rpm heat transfer is poor, and the hottest point of the GPU overheats the HBM memory through the shared surface of the cooler. The overheated HBM then causes the image to hang in stress tests after a certain time, and in games after 1 to 2 hours, depending on the game.

               

              GPU-Z screenshot taken during the Superposition test in Balanced mode:

               

              0% PL 2400rpm superposition.jpg

               

              Judging from these results, please tell me how dangerous it is to use my graphics card in the default (Balanced) mode.

                • Re: Why is the RX Vega 64 (air cooled) unstable in default mode?
                  amdmatt

                  You passed, your GPU is fine.

                   

                  Temperatures are safe. If you want lower temperatures, decrease state 6/7 voltage (try it in 0.050 increments) and increase fan speed.

                    • Re: Why is the RX Vega 64 (air cooled) unstable in default mode?
                      mex-74

                      I found the cause of the Time Spy stress test stopping (0%)!

                      It is the high HBM2 temperature in Balanced mode (2400 rpm): 95 °C at a memory clock of 945 MHz.

                      With the HBM2 at 800 MHz in Balanced mode, the test runs without stopping at the same temperature of 95 °C.

                      I also found that the memory can run at higher clock speeds (above 1020 MHz) only at temperatures below 60 °C.

                      So I got this dependency:
                      1020-1100 MHz: < 60 °C
                      945 MHz: < 85 °C
                      800 MHz: < 95 °C
                      At higher temperatures, the memory is unstable.
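
The dependency above can be written as a small lookup, which may help anyone comparing their own readings. This is a rough sketch built only from the three data points in this post, not a specification of HBM2 behaviour. The function name is hypothetical, and taking 1100 MHz as the value for the 1020-1100 MHz range is an assumption.

```python
# (temperature limit in °C, memory clock in MHz), coolest limit first.
# Values are the observations from this post; 1100 MHz stands in for the
# 1020-1100 MHz range (assumption).
STABILITY_LIMITS = [
    (60, 1100),
    (85, 945),
    (95, 800),
]


def max_stable_hbm2_clock(temp_c):
    """Return the highest memory clock (MHz) observed stable below temp_c.

    Uses strict "<" comparisons, as in the posted table. Returns None when
    the temperature is at or above every observed limit.
    """
    for limit_c, clock_mhz in STABILITY_LIMITS:
        if temp_c < limit_c:
            return clock_mhz
    return None
```

For example, `max_stable_hbm2_clock(76)` returns 945 and `max_stable_hbm2_clock(90)` returns 800. The strict comparison follows the table as posted, so a reading of exactly 95 °C returns None even though the 800 MHz run above completed at that temperature.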

                       

                      It is not possible to cool only the HBM2 with the reference cooler. The problem is solved when the GPU is limited to P5 (max) and the fan runs at 3300 rpm. With these settings the HBM2 (945 MHz) stays at 85 °C and the GPU (1310 MHz) at 75 °C, and the tests no longer stop and pass with a stability of 99.2%.

                      Undervolting does not solve the problem: the load increases and the chip still heats up. I could not find voltage settings at which 2400 rpm keeps the GPU at 75 °C and the HBM at 85 °C with correct performance.

                      It is very sad that the reference card is designed in such a way that you have to increase the noise and reduce the performance for it to work stably.