
Graphics Cards

spencerosity
Adept I

5700XT Crashing

Hello All,

 

I know this is an issue that has popped up on here before, but the solutions that seemed to have worked for other folks haven't helped me much. I've heard nothing but horror stories about Gigabyte's RMA process so I'm hoping y'all can help me figure this out.

 

System Specs:

MoBo: MSI Pro Z690-A DDR4

RAM: 16GB G Skill Aegis 3200 MHz

CPU: Intel i7-12700KF

PSU: Seasonic Focus 750W Gold

OS: Windows 10 Home 64-Bit

GPU: Gigabyte 5700XT

 

The issue is that, intermittently when playing games, the primary monitor goes black, the secondary monitor goes green, and the system crashes completely, requiring a power cycle. I've tried this card in 2 separate computers (albeit with the same power supply) and the crash occurs on both. On one system the black screen is accompanied by a looping audio click, but the newer system, with the specs above, is silent. These crashes are inconsistent, sometimes occurring every few minutes, sometimes not happening for a couple of weeks.

The only pattern that I've been able to find is that the crash doesn't occur when using the Vulkan API. Both R6: Siege and Dota2 will crash semi-regularly when using DirectX, and I don't believe either has crashed when using Vulkan. The crashes don't appear to be tied to load on the card: I have not been able to replicate the crash when using the AMD stress test or FurMark, and the card will sometimes crash when running non-demanding games like Hollow Knight.

I am running the most current AMD drivers, I have tried updating the BIOS on both motherboards that I have tested the card on, I have used DDU in safe mode to remove old drivers and install them fresh, and I have tried a clean Windows install. I have tried setting a more aggressive fan curve, although the card is not overheating. I have tried using Device Manager to install just the driver package without the Adrenalin software. Windows Event Viewer shows errors and critical errors about the system rebooting without having shut down properly, but nothing else. Any help is greatly appreciated!

0 Likes
20 Replies
Geforsikus_2021
Challenger

Alternatively, try updating the BIOS of your video card. Since you tried the card in another PC and saw the same symptoms, I think it will most likely help. The main thing is not to mix up the model of the video card, and preferably to make a backup of your card's current BIOS before flashing a fresh one from the manufacturer's website, if one is available. You can also search here: https://www.techpowerup.com/vgabios/?architecture=&manufacturer=Gigabyte&model=&interface=&memType=&... The GPU-Z utility will help you determine the BIOS version of your graphics card, and the flashing program is here: https://www.techpowerup.com/download/ati-atiflash/ Save your BIOS first! There is another option: if your video card has several BIOS profiles, you can try switching between them. If not, the only option is to flash a fresh BIOS. If you follow the instructions and flash the correct BIOS, everything will be fine!

Make sure you are running separate power cables to each power input on your GPU

If you can test with a different PSU .. that would be the next thing I'd try


ThreeDee PC specs

I just tried doing this and will keep you posted. That said, if the issue was with power delivery being bottlenecked by going through 1 cable, wouldn't the crash tend to happen while the GPU was under high load and was trying to draw more power?

0 Likes

Just upload the text file somewhere else and link that here.
Not sure if this forum supports text files at all.

Maybe I can add another crash type/reason to my list, after we figure this one out.

---
more ideas:

Do you have the GPU core clock and voltages logged in that GPU-Z log file?
1) Are you ever boosting above 1900 MHz?
- Most air-cooled RX 5700 XT cards cannot handle high boost clocks that well and tend to crash.
2) Are you ever dropping below 800 MHz (power state 1)?
- if you drop below that, it could be "deep sleep" related
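
For anyone who would rather not scroll through thousands of log rows by hand, the two checks above can be sketched in a few lines of Python. This is only a rough sketch: the column name `GPU Clock [MHz]` and the comma-separated layout are assumptions about how GPU-Z writes its sensor log, so compare them with the header row of your own file, and the sample rows are made up for illustration.

```python
import csv
import io

# Assumed column name; compare it with the header row of your own GPU-Z log.
CLOCK_COLUMN = "GPU Clock [MHz]"


def find_clock_outliers(log_text, low=800.0, high=1900.0, column=CLOCK_COLUMN):
    """Return (row_number, clock) pairs whose core clock is below low or above high."""
    reader = csv.reader(io.StringIO(log_text))
    header = [name.strip() for name in next(reader)]
    idx = header.index(column)
    outliers = []
    for row_number, row in enumerate(reader, start=2):  # row 1 is the header
        if idx >= len(row):
            continue  # skip truncated lines, e.g. one cut off by a crash
        try:
            clock = float(row[idx].strip())
        except ValueError:
            continue  # skip blank or non-numeric cells
        if clock < low or clock > high:
            outliers.append((row_number, clock))
    return outliers


# Made-up sample rows, not taken from any real log.
sample = """Date , GPU Clock [MHz] , Memory Clock [MHz]
2022-01-01 12:00:00 , 1850.0 , 1748.0
2022-01-01 12:00:01 , 6.0 , 1748.0
2022-01-01 12:00:02 , 1912.0 , 1744.0
"""

print(find_clock_outliers(sample))  # [(3, 6.0), (4, 1912.0)]
```

With a real log you would read the file into a string first and pass it in; the 800 and 1900 MHz thresholds are just the two values from the questions above and can be changed through the `low` and `high` arguments.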

you could try to disable the deep sleep power saving states:
-> (to prevent GPU core clock to drop below 800 MHz / power state 1)
-> MorePowerTool from igorsLab can access these settings
- less demanding games might often drop GPU core clock down to 6 MHz, even while gaming
- I am not sure if that in itself could become a crashing reason - but we can try to prevent it

If you want to try it, then use this guide:
1) download and start MorePowerTool as administrator
2) select your GPU from the drop down list
3) click the "Feature Control"-button
4) remove (untick) all checkmarks that start with "DS_" (deep sleep)
5) click "ok" and then "write SPPT" (soft power play table, which are registry entries for the driver)
6) restart your PC (registry changes were made)

I have made this change a while ago and have been running my RX 5700 XT like this (link) without problems.
Core clock never drops below the 800 MHz (power state 1) value, that you can manually change.
- power consumption for idle/desktop use goes up from 11-12W to 13-14W ... not that much.
- temps stay the same for me and GPU is more responsive in some use cases

 

--- [ CPU: Ryzen 7 3800XT | GPU: ASRock RX 5700XT Challenger Pro 8GB | driver: 24.1.1 ]
--- [ MB: MSI B550-A Pro AGESA 1.2.0.7 | RAM: 2x 16GB 3600-CL16 | chipset: 6.01.25.342 ]

Update: Tried splitting up the 8 and 6 pin power connectors across 2 cables going into 2 different rails on the PSU as one post suggested, tried flashing the BIOS on the GPU, and tried this http://cfile26.uf.tistory.com/image/25332F425417ABB5072D62 pulled from a Reddit post about a black screen crash resulting from a janky display port cable. Still able to replicate the crash. 

 

The clock speed does poke above 1900MHz by a little bit in my log, but it's generally below that and the temperatures don't get above 90C even when it is. I do see these deep sleep states you mentioned though. Looks like in game the card clock drops down as low as single digits for 1-3 seconds at a time. Strange though, because I would assume that would correspond to little micro freezes in game when that happens, but it doesn't. Everything runs very smoothly right up until the system crashes. Before I go messing with power states I'll take another log using a Vulkan game and see if I see the same behavior. If that behavior is missing there then that's good enough justification for me and I'll try disabling deep sleep.

 

I've attached a spreadsheet via google drive of the log I've been referencing. https://drive.google.com/file/d/1CYNz46F_tJ3EudQXgP3rsdkO5F_qAoUg/view?usp=sharing

Row 4390 at 2022-01-08 18:03:23 is the last timestamp before the crash.

0 Likes

And how is your video card connected? Do two separate cables run from the power supply into the video card, or is one cable split in two at the video card? There should be two separate cables from the PSU (one cable for an 8-pin connector is designed for 150 watts, so two cables give 300 watts; you have a 250 watt+ card, though I don't know the exact model of your card).
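
The wattage arithmetic behind this advice can be sketched as below. The per-connector figures are the nominal PCI Express limits (75 W from the slot, 75 W per 6-pin, 150 W per 8-pin). The 8-pin plus 6-pin layout is used here only as an example configuration, and the 250 W draw is the rough figure quoted above, not a measured value.

```python
# Nominal limits from the PCI Express power specs: the slot and a 6-pin
# connector are rated for 75 W each, an 8-pin connector for 150 W.
CONNECTOR_WATTS = {"slot": 75, "6-pin": 75, "8-pin": 150}


def power_budget(connectors):
    """Sum the nominal wattage of the slot plus the listed auxiliary connectors."""
    return CONNECTOR_WATTS["slot"] + sum(CONNECTOR_WATTS[c] for c in connectors)


# Example configuration: one 8-pin and one 6-pin connector.
budget = power_budget(["8-pin", "6-pin"])

# Rough card draw quoted in the post above; an assumption, not a measurement.
assumed_draw_watts = 250

print(budget)                       # 300
print(budget - assumed_draw_watts)  # 50
```

These are rating limits for the connectors themselves. They say nothing about how a particular PSU shares current between cables or rails, which is why running separate cables is still worth testing.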

0 Likes

The card has an 8 and 6 pin connection. The original setup going from my PSU was one cable that split into an 8 and 6 on the GPU end, but based on your earlier post I grabbed a second cable and split it up so that I have 1 dedicated cable for the 8-pin and 1 dedicated cable for the 6-pin and the crash persists.

 

Edit: Additionally, since the last post I've seen new behavior. Dota2 specifically is now also crashing when using Vulkan, which was not the case previously. Here is a GPU-Z log capturing that crash.

https://drive.google.com/file/d/1q4xErbJqRXtPs0UnLhTr-_lhgx3S1BeU/view?usp=sharing

0 Likes

The only thing left is to try another power supply. Seasonic power supplies, especially the high-end ones, were in the news a while back because they shut off when the load rose sharply. It was discussed a lot online, and it seems they fixed it in a later batch. https://occlub.ru/news/hardware/32845-seasonic-preduprezhdaet-bloki-pitanija-focus-plus-imejut-probl... Perhaps this is your case.

Well, I don't want to celebrate too early since I've had periods of stability before, but I went out and grabbed a Corsair RM1000X to test with (I know it's overkill but it's all my local store had) and so far it hasn't crashed. I'm going to continue to run with this PSU over the weekend, if it crashes again I'll update the thread, and if it doesn't crash I'll check back in on Tuesday!

0 Likes

@spencerosity I have looked through your logs and did not see anything out of place there.

So at least the GPU was within its limits and the logs show that everything was working fine.
If the new PSU fixes the crashes, then I should indeed add this behavior to my crash type list.
Might help someone else to solve the problem as well.

 

--- [ CPU: Ryzen 7 3800XT | GPU: ASRock RX 5700XT Challenger Pro 8GB | driver: 24.1.1 ]
--- [ MB: MSI B550-A Pro AGESA 1.2.0.7 | RAM: 2x 16GB 3600-CL16 | chipset: 6.01.25.342 ]
0 Likes

Bad news. Same crash happened again. It seemed to be behaving much better than it has been lately with the new PSU, but the issue isn't fixed.

0 Likes

RPX100 has some very good information on the 5000 series.

I noticed in the log that the GPU load goes to nearly 100% and the fan speed actually reads above 100% right before the crash. Maybe set a custom GPU clock (MHz) a little lower than the manufacturer's default; it seems the crash could be related to the card being pushed very hard and deciding to shut down. If you reduce the clock a little, by 10-15%, and you no longer see the card getting so hot, that may fix it. I have read several posts from people recommending this, as some cards may boost a little too high by default, and how much heat is generated may even depend on the game and what the GPU is doing. I hope you can find a solution.

Edit: It does seem strange that the GPU and hotspot temps are only about 60, but the fan is going over 100%. Also, if the GPU load is very high but the temperature isn't that high, you would think that would be OK. The GPU MHz also seems to drop very low sometimes.

Double check this information for your power supply. Sometimes you even have to use PCIE1 and PCIE3 specifically; if you choose PCIE1 and PCIE2, your PSU may not provide enough amps. It depends on the design of the PSU:

https://community.amd.com/t5/graphics/sapphire-pulse-rx-6600-xt-black-screen-driver-instalation/td-p...

0 Likes

So, because the temps didn't appear to be the problem from the logs, I hadn't really thought about re-pasting the card, but more out of desperation than anything else I just did that. The temps aren't any lower, but I can say that the paste that was on the card was completely dried up and there were noticeable bare spots on the die. I don't know how/where gpu-z reads hotspot temp, but is it possible that specific segments of the gpu were hitting an unstable temp threshold that isn't getting captured by the logs?

0 Likes

Any time there are bare spots, the die could be heating up so quickly that the card shuts down before the reading gets to the logs, so yes, that is a possibility. With the goldmine that making GPUs has become, it seems some manufacturers are taking a lot of shortcuts with quality control. You may want to let Gigabyte know; they can do better. I must say I have heard a lot of horror stories about paste and thermal pads on cards not being applied right.

I hope this fixes it for you, please let us know how it goes.

0 Likes

Still crashing after repasting. I really appreciate the help everyone here has given me but I think I'm ready to call it and RMA the GPU. Trying it in 2 completely different PCs has isolated the issue to the graphics card itself. I've spent a lot of time underclocking, undervolting, and messing with the fan curve with no improvement, and I've now disassembled the thing to re-apply the thermal paste. I think it's time to put the ball back in the manufacturer's court. I'll come back and update this thread after I get the card back to report any improvement or lack thereof. 

0 Likes

It seems you've narrowed it down: the card has problems, and it's good to get it replaced. On the bright side, you can now troubleshoot better. Enjoy your new card, you deserve it.

0 Likes

@spencerosity I agree with @amdman : seems like you narrowed it down.
You really seem to have done a lot of work to figure this out.
After all this work I think it is indeed time for an RMA : I hope it turns out well.

Best of luck with the RMA and the new GPU!
Keep us posted how it goes. Cheers

 

--- [ CPU: Ryzen 7 3800XT | GPU: ASRock RX 5700XT Challenger Pro 8GB | driver: 24.1.1 ]
--- [ MB: MSI B550-A Pro AGESA 1.2.0.7 | RAM: 2x 16GB 3600-CL16 | chipset: 6.01.25.342 ]
0 Likes
RPX100
Miniboss

@spencerosity hello there. I have also encountered this crash in the past and might be able to help.

Please have a look into this topic: 
RX 5000 series crash types and reasons

The way you describe your issue sounds like crash type #3 from my post.
I have encountered this myself with older games / less demanding games.
The main reason for this crash type seems to be a driver bug with power saving features for memory.

Please use GPU-Z or MSI Afterburner to monitor your memory clock while playing games.
You just have to check if your memory clock stays locked around 1750 MHz (= good for gaming)
or if it fluctuates between the different power saving states (less than 1700 MHz, = bad)
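
If you log the memory clock to a file, the good/bad check above can be sketched as follows. The 1700 MHz floor is the threshold from the two lines above; the function names and the sample readings are made up for illustration.

```python
def memory_clock_is_locked(samples_mhz, floor_mhz=1700.0):
    """True if there is at least one sample and none falls below the floor."""
    return bool(samples_mhz) and all(clock >= floor_mhz for clock in samples_mhz)


def find_dips(samples_mhz, floor_mhz=1700.0):
    """Return (index, clock) for every sample below the floor."""
    return [(i, clock) for i, clock in enumerate(samples_mhz) if clock < floor_mhz]


# Made-up readings: one steady run and one that keeps dropping to lower states.
steady = [1744, 1748, 1746, 1748]
fluctuating = [1748, 875, 1748, 200, 1746]

print(memory_clock_is_locked(steady))       # True
print(memory_clock_is_locked(fluctuating))  # False
print(find_dips(fluctuating))               # [(1, 875), (3, 200)]
```

A steady result does not rule out other crash types; it only tells you this particular power-saving behavior is probably not what you are seeing.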

side note: locking memory clock to max is normal behavior for fullscreen/gaming
The driver bug somehow tries to adjust memory clock dynamically a few times per second
- it looks like the driver thinks that you are idling on desktop and tries to downclock memory to save energy
- power consumption difference: between idle (11-15W) and gaming (locked clock: 33-35W)

--
If that (memory clock fluctuation) is indeed the case for you, then the solution is:
Try to lock your GPU's memory clock to max clock (1750 MHz) if you can.
One way that works for me is using 2 displays with different resolution/refresh rate
- it may also help to increase refresh rate to 120/144 Hz if you can

Another thing that can potentially help is to run the games in fullscreen mode
and not use borderless mode, in order to let the driver know that a fullscreen app is running.

 

--- [ CPU: Ryzen 7 3800XT | GPU: ASRock RX 5700XT Challenger Pro 8GB | driver: 24.1.1 ]
--- [ MB: MSI B550-A Pro AGESA 1.2.0.7 | RAM: 2x 16GB 3600-CL16 | chipset: 6.01.25.342 ]

Thanks for the feedback, but unfortunately this doesn't appear to be the issue. I have some GPU-Z logs including one that captured a crash and the memory clock never fluctuates outside of 1744-1748. Additionally, I already am using a setup similar to what you recommended as a fix. Primary screen is 1440p 144Hz, secondary screen is 1080p 60Hz. 

Going through the logs while looking at that topic you linked, I can also say that my temps stay below 90°C and that my hotspot hasn't gone more than 16°C over my other gpu temp. 

I'd be happy to post the log but I'm new to this forum and don't seem to be able to figure out how to attach it to the post, and the formatting if I copy/paste it from the .txt file is nigh unreadable.

 

 

jsanchez
Journeyman III

I have to reply, because after 1 year of suffering I can finally play games for more than 3 hours without crashing.

What I did:

-Upgrade every single driver: Main board, chipset, GPU, etc.

-I'm on 22.5.1 and had to tune the GPU via the AMD Adrenalin software

Quick guide:

On Performance --> Tuning --> Global Tuning

GPU Tuning: Enabled --> Expand "Fine Tuning Controls" and set the last dial to (values can change a bit, +/- 2):

Frequency: 1909

Voltage: 997

Do the same for Fan Tuning:

P1: Temp: 25 Fan Speed: 33

P2: Temp: 40 Fan Speed: 40

P3: Temp: 60 Fan Speed: 42

P4: Temp: 65 Fan Speed: 48

P5: Temp: 85 Fan Speed: 95
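
To get a feel for what this curve does between the five points, here is a small sketch that interpolates the fan speed for a given temperature. It assumes the fan speed ramps linearly between neighboring points, which is a guess about the driver's behavior rather than something documented here; the points themselves are P1 to P5 from the list above.

```python
# (temperature in deg C, fan speed in %) pairs, P1..P5 from the list above.
FAN_CURVE = [(25, 33), (40, 40), (60, 42), (65, 48), (85, 95)]


def fan_speed(temp_c, curve=FAN_CURVE):
    """Estimate fan speed (%) at temp_c, assuming linear ramps between points."""
    if temp_c <= curve[0][0]:
        return float(curve[0][1])   # below P1: hold the first speed
    if temp_c >= curve[-1][0]:
        return float(curve[-1][1])  # above P5: hold the last speed
    for (t0, s0), (t1, s1) in zip(curve, curve[1:]):
        if t0 <= temp_c <= t1:
            return s0 + (s1 - s0) * (temp_c - t0) / (t1 - t0)


print(fan_speed(20))  # 33.0
print(fan_speed(75))  # 71.5
print(fan_speed(90))  # 95.0
```

Under that assumption, most of the ramp sits between 65 and 85 degrees: the fan is estimated at 48% at 65 degrees and already 71.5% at 75 degrees.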

Of course, some FPS will be lost, but not that much. I'm playing Horizon Zero Dawn and went from 75 FPS on Ultra to 68 FPS. I prefer to lose some FPS but be able to play without worrying about when it will crash.

Temps now don't go over 60-62°C (70-72°C junction temperature).

I'm happy now...

0 Likes