Setup:
Ryzen 5 7600X + AIO
850W be quiet! System Power 10
1x2TB HDD
1x2TB SSD
RX 7900XT
ASRock B650M - HDV/M.2
2x 32GB DDR5 5600 ADATA
Monitors:
1x27" 144Hz 1440p (DisplayPort)
1x22" 60Hz 1080p (DisplayPort to VGA adapter - this is a ~13-14 year old monitor)
Behavior:
At some point while gaming (League/Jedi Survivor, or sharing my screen on Teams, were the most common tests for the crash) the driver would time out. Video output on the second screen (YouTube) would turn green and the game would crash. Crashing was not consistent: it could happen in 5 min, or the system could be stable for 4 hours. Although I did notice that alt-tabbing and going back to the game would increase the chance of it happening.
Disabling hardware acceleration on pretty much anything that was running seemed to help, but only marginally.
Windows + driver update seemed to remove the green part of the crashing, but only that.
BIOS update + driver refresh only made it so the driver timeout and crash take longer to happen (but the audio of the background video on YouTube would keep going).
I did remember that while the RAM sticks are identical, I purchased them one after the other, from different retailers.
I ran a memory test for 7.5 hours - 0 errors found,
but I kept digging at this hole:
swapping sticks, running only one, etc.
As far as I can tell, the second stick I purchased is more unstable:
Both sticks - Crash time ~ 15min - 3h
Stick A - So far no crash
Stick B - Crash time 1min - 15min
I'm still trying to figure out if this is really the cause or if I am imagining things and will keep updating (I've been troubleshooting this for 3 weeks now)
But if this is indeed the case I wonder why this would be the behavior.
At this stage what I KNOW FOR SURE:
Driver changes have an impact but only change the nature of the crash
HW acceleration has SOME impact but not much
RAM slot and RAM stick (A or B) have an impact even though the RAM sticks are supposed to be identical
Update - Stick A also crashes, but only in 1 out of 5 one-hour sessions instead of every session.
I am testing having the old monitor use the integrated graphics (it's browser and YouTube only anyway).
Edit: Which proved immediately useless.
I am open to ideas at this stage. I am so fed up with AMD drivers that I am just about to give up, put this piece of .... up for sale and go back to team green.
Update:
Sent the card in - they found nothing wrong with it, so I couldn't get it replaced.
I have had progress in identifying things that affect the consistency of the crashing but each one is a unique scenario.
In League of Legends - lowering the max framerate in the game to 144 (same as the monitor refresh rate) seems to have solved the issue and I haven't had crashes since.
In Surviving Mars, as soon as the game starts and I go into the research tree the system hangs, so now at least I have a consistent crash point that I can use for troubleshooting.
It would be helpful to know when the next driver release is, to see if that changes the behavior. Based on my history it will change the crash behavior, but I really am not optimistic that it will fix anything.
If you run the second monitor off of your iGPU, does that help at all?
..and just to verify, do you have the latest AM5 chipset drivers installed? (Usually the AMD website has the latest, but right now ASRock has newer AM5 chipset drivers listed. Sometimes a manufacturer's drivers use a different numbering scheme, but I verified that ASRock's AM5 chipset package does include newer drivers. I forget which of them were newer, but I looked at the driver version in Device Manager and compared it with what the installer was going to install.)
Also have a look in event viewer/windows logs/application for application service errors. That could help narrow down the source of the issue and may help find a solution or workaround until the issue can be addressed by the developers.
I installed everything from ASRock that I could when I updated the BIOS, with no luck.
Now that I have a consistent way to force the crashes I can finally troubleshoot properly.
Going single monitor doesn't seem to change anything.
HWiNFO indicates the shader clocks jump from 600 to 2000, then I get a crash.
They try to boost to 2800; the GPU boosts to 2k.
Event Viewer is thoroughly unhelpful. It basically boils down to "hardware error".
But like I said, now that I have a consistent crash source I will try to compile a more structured view of what is going on.
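A minimal sketch of what such a structured view could look like: one record per test session, then crash counts grouped by whichever setting is under suspicion. The `Trial` fields, the `crash_counts` helper and every entry in `trials` are placeholders made up for illustration, not measurements from this thread.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Trial:
    """One test session (all field names are hypothetical)."""
    game: str
    fps_cap: Optional[int]  # None means the framerate was uncapped
    monitors: int           # number of monitors connected
    minutes: float          # how long the session lasted
    crashed: bool           # True if it ended in a driver timeout


def crash_counts(trials, key):
    """Group trials by key(trial) and return {group: (crashes, sessions)}."""
    counts = defaultdict(lambda: [0, 0])
    for t in trials:
        bucket = counts[key(t)]
        bucket[0] += int(t.crashed)
        bucket[1] += 1
    return {group: (c, n) for group, (c, n) in counts.items()}


# Placeholder entries, NOT real measurements.
trials = [
    Trial("League of Legends", None, 2, 20, True),
    Trial("League of Legends", 144, 2, 60, False),
    Trial("Surviving Mars", None, 2, 3, True),
    Trial("Surviving Mars", None, 1, 3, True),
]

by_cap = crash_counts(trials, key=lambda t: t.fps_cap)
by_game = crash_counts(trials, key=lambda t: t.game)
print(by_cap)
print(by_game)
```

Changing one setting per session and grouping by that setting keeps the comparison readable even when the crashes themselves are inconsistent.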
I had driver timeout issues like you with my RX 6600. Although we don't have the same generation of GPU, I'd like to share with you the workaround I found + the final fix.
Like you, limiting my FPS in-game allowed me to play longer sessions before having driver timeouts, and like you I tried every fix I could find on the internet.
My workaround was undervolting my card + a slight underclock, like 30MHz. It almost fixed the issue, but sometimes with driver updates the settings were not stable.
The final fix came one year after buying the card, and it was updating the BIOS. I know you already tried it once, but like you I had to try multiple BIOS versions until Gigabyte dropped a stable one for my issue.
PS: undervolting 6000 series GPUs was so much easier than with the 7000 series. GL!
I guess that is my next try, but at this stage I'm probably gonna start looking to just go team green.
FYI: undervolting and lowering the clock rate didn't work.
I assume that will be my next attempt, but at this point, I'm probably going to go with team green.
@Matt_AMD Now that I have a consistent way to cause this, is there any chance of getting support with it?
I've been sending reports for like 3 months now, with logs and mail, and it has gotten me nowhere.
Or at least, is there any info on when the next drivers will be released?
I and many, many others are having the same issue with drivers. Seems AMD couldn't give a toss as the silence speaks volumes on the subject. Different cards but mainly 7000 series from extensive Googling.
We all have expensive paperweights at this moment in time. Maybe it's worth reaching out to YouTubers/influencers as they may actually get a response from someone at AMD once they think sales may be hit, as God knows we haven't.
Utterly appalling behaviour on their part.
Solution: (at least for now)
In one of the 1001 posts about this in the past 6 months someone suggested:
DDU
Disconnect network
Install driver from local file without Adrenalin.
This seems to have solved my issue at least for now. (will keep updating if it resurfaces)
On a side note - I am also getting 10-20% better performance so far.
Worked for 2 hours, then the crash behavior returned (not a black screen and hang, but just a freeze and crash).
The interesting thing is I saw significant performance gains on the scale of 50-90% in some instances, but also a lot of visible stuttering and tearing.
Do you think we haven't all tried these common steps that have been posted 1001 times? Are you really that arrogant?
Do you think it's acceptable to have to jump through flaming hoops in order to get something to work?
No wonder PC building/gaming is in the state it's in when shills defend substandard products.
Sorry to hear that my fix doesn't seem to work for you.
One thing I noticed with my card is that it never timed out in benchmarks like Heaven and others. Maybe your card is the same, and when you sent the card in for RMA they tested it in benchmarks and not in real games?
You should also consider trying your card in a friend's build, and maybe ask ASRock for a custom BIOS. I know Asus does this stuff on request.
Yes, benchmarking is the same. I was thinking of bundle RMA'ing the whole build.
But testing with a friend or asking ASRock for a BIOS build might be worth trying at this stage.
@Snooz Did disabling MPO work for you? I talked with some tech guys on the AMD Discord as well and that is the most likely source, but it doesn't work for me.
P.S: I guess Asus is a lot more open to supporting than ASRock. They basically sent me back to the GPU vendor.
I've never tried disabling MPO. I heard it helped with some sleep mode bug, not with timeouts.
At this stage I am forgetting what I've already tried and what not, so I went with a chipset driver update:
from ASRock - no fix
from AMD - some of the drivers updated
but I am also getting some failures:
Name : AMD GPIO Driver (for Promontory)
Version : 3.0.0.0
Install : Fail
Name : AMD PSP Driver
Version : 5.24.0.0
Install : Fail
Not that it's going to matter .. but ASRock's AM5 chipset drivers should put you at 5.25.0.0 for PSP Driver
That is probably it.
I've tried so many things at this stage that I decided starting over is warranted.
Reinstalled Windows and checked - ASRock does put me at 5.25.0.0, so that is probably why the AMD installer wasn't able to install it.
P.S: The issue still happens after the re-install.
I'm wondering if I should try reverting to Win10.
Hi! Have you tried to undervolt and lower the clock a little? It fixed my driver timeout problem!
I have an RX 7900 XTX, and I think some games are poorly optimized for Radeon...
Yep, tried going down to 2650 MHz and 1050 mV but still get **bleep** all results.
I've reinstalled and tried everything again 1 by 1 - still **bleep** all results.
The only games not crashing right now are League (if I limit the framerate to 144, otherwise it crashes)
and EU4 (which is hard limited to 144).
Surviving Mars and Frostpunk crash in the first 3 min.
I can limit crashing if I go to a lower resolution in windowed mode.
The one thing I know for sure is that crashing IS impacted by alt-tabbing.
Crashes happen without that as well, but if I alt-tab or go to the second monitor they are much more likely to trigger.
The only thing I haven't tested yet is changing the DP cable.
But considering that everything runs smoothly when I put in my old GTX 1060, I kind of doubt that it will have any effect.
I think the solution I will go with is trying to RMA the whole system at this stage, because they don't accept just the card (since it doesn't crash in stress tests),
or at least try to claw some money back for the card. And I am definitely going team green again. This is just not worth the 3 month hassle of troubleshooting.
Something weird is happening with the AMD Adrenalin stress test.
I set it for 180 sec.
Around 1 min into the run, at 311W and 2464MHz,
it just dies down. I got a driver timeout 1 time, but otherwise it just goes limp while the stress test is supposedly still going (see the screenshot).
HWiNFO is showing a GPU max power draw of 421.8W,
however when it goes limp it's not from a power spike: it goes from 319 to 49.
It is starting to look more and more like a HW issue. I thought I was imagining that performance is continuing to degrade, but I've started to see stuttering on YouTube videos, and stress tests now don't even go beyond 20s before the power draw drops. (There is no error or anything, it just goes down and that's it.)
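Since there is no error when this happens, one way to pin down the moment of the drop is to scan logged power readings for a sudden fall. This is only a sketch: it assumes the readings have already been pulled into a plain list of watts, the `find_power_drops` name and the 50% threshold are arbitrary choices, and in the sample list only 311, 319 and 49 are figures from this post.

```python
def find_power_drops(samples_w, min_ratio=0.5):
    """Return indices i where samples_w[i] fell below min_ratio * samples_w[i - 1]."""
    drops = []
    for i in range(1, len(samples_w)):
        prev, cur = samples_w[i - 1], samples_w[i]
        if prev > 0 and cur < prev * min_ratio:
            drops.append(i)
    return drops


# Illustrative readings in watts; only 311, 319 and 49 come from the post.
readings = [305, 311, 319, 49, 50, 48]
drop_indices = find_power_drops(readings)
print(drop_indices)  # [3]
```

Lining the returned index up with clock and temperature readings from the same timestamp would show what else changed at that moment.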
I have the same problem, driver timeouts in different games. Made zero effort to try and fix it.
Use the Adrenalin driver to lock the max clock at 2580MHz, leave voltage and power level at default, then test some of the games that are crashing and see what happens. Some vendors are pushing these cards too hard in the MHz department. I lock my 7900 XTX at 2600MHz max and tune for minimal power consumption, and it works really well for me.
No amount of underclocking seems to work for me. I started to suspect the power supply, so I left the rig at the retailer for testing and diagnostics. It could be because my PSU has two cables with two 8-pin connectors each, so I always have to use one in a daisy chain. Or the PSU might be overheating and cutting the power draw. Will wait and see (should know in 1-2 days).
Yeah, that's not gonna work because of the physical limitation of the wires alone. Your GPU in that case will be limited to 220 watts at absolute best. A single 8-pin (essentially that is what you've got) is technically limited to 150 watts, but in reality it can supply approximately 200+ watts before things start getting out of hand. That GPU can require 325+ watts to operate as designed, so you need a minimum of two separate cables, with one daisy chain allowable. I always recommend 1000 watt or higher PSUs, as they natively have four independent PCI-E power cables and higher OCP trip levels, guaranteed to supply the necessary current to the GPU. They also do not get taxed to the point of losing their efficiency rating, and they have a much longer service life.
I ran for about 4-5 months with no issues. And when stress testing I get a solid 310W out of the PSU (2 cables at ~150W each, plus the PCI-E slot can handle ~75W), but I'm guessing this setup strained the PSU and it started to degrade.
If that is even the problem. I still don't have a response from the retailer, but I should hear from them within 2 days.
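To put rough numbers on the cable question, here is a minimal sketch of the nominal budget. It assumes the usual nominal ratings of 150 W per 8-pin PCIe cable and 75 W from the slot, and it treats a daisy-chained cable as a single 150 W source, which is a conservative assumption rather than a measured limit. The 311 W and 421.8 W figures are the stress-test and HWiNFO readings quoted earlier in the thread; the function name is made up for illustration.

```python
PCIE_SLOT_W = 75    # nominal limit of the PCIe x16 slot
EIGHT_PIN_W = 150   # nominal limit of one 8-pin PCIe power cable


def nominal_budget_w(independent_cables, per_cable_w=EIGHT_PIN_W, slot_w=PCIE_SLOT_W):
    """Nominal power available to the card.

    Assumption: each PSU cable counts once, no matter how many
    daisy-chained plugs hang off it.
    """
    return independent_cables * per_cable_w + slot_w


budget = nominal_budget_w(2)       # two independent cables plus the slot
sustained_ok = 311 <= budget       # sustained stress-test draw from the thread
peak_ok = 421.8 <= budget          # max draw reported by HWiNFO in the thread

print(budget, sustained_ok, peak_ok)  # 375 True False
```

This only compares nominal ratings. It says the sustained load fits and the reported peak does not, which by itself does not show whether the PSU or the cabling was actually at fault.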
Had to go to the retailer service shop to reproduce the crash, but after that they were able to reproduce it consistently, so the card is getting RMA'd.
For context on reproducing the crashes:
Refresh rate was irrelevant.
Resolution was key - on 1080p - no problems,
on 1440p - crash in <2 min.
BUT NOT if you idle load it!
Just putting on Surviving Mars and leaving it running, the system could run for hours,
but the moment you start moving things around/alt-tabbing (actually playing a game) it would crash within 2 min. This was reproducible across multiple games.
FurMark could also run for hours - no issues.
Now I'll have to wait up to a month for a final resolution (repair/replacement). Will update when I get it back.
After 2 months, MSI kept denying that the card was the issue, but replaced it, and I've had no issues since.
Noteworthy is that the new card is consistently running at 2x higher clocks and using 5x more memory in the same games than the old one did.
In general, after talking with some old-school IT guys, the most likely cause would have been a bad capacitor (or some other instability in the power delivery).
Good for you bro happy ending