Hello,
Whenever I play modern games, I encounter artifacts like this:
On top of that, the worst part is the constant crashing/freezing. It goes like this:
game freezes with a black screen while the audio gets stuck in a mini loop
-> crash to desktop
-> the error line - Display driver stopped responding and has been recovered
-> Default Wattman settings restored due to unexpected system failure
-> Radeon Settings gets stuck with the message: Host application has stopped working.
It happens anywhere from 30 seconds to 30 minutes into a game, on average after about 10 minutes.
In FFXV it's rather random, but in the first city, Lestallum, it's unbearable, to the point where I can play for maybe 3 minutes at most, with 30 seconds being the shortest time. I know that game is riddled with problems, but my case seems extreme.
The game runs at rather modest settings at 1080p with fps capped at 30. Same with the games below (except that I cap those at 60 fps).
It happens really often in Nioh too, and less frequently in NFS Payback, Ori and the Blind Forest, and other games.
It doesn't really happen in benchmarks, but that may just be because they only ran for a short time.
Right now I'm running the card at default settings with temps stable at 75C. I also tried a more aggressive temp profile where temps stayed under 50C, but it crashed the same as at 75C.
Currently on the 18.3.4 drivers, but I also tried 16.11.5, 17.11.1, 18.2.1, and 18.3.2, and it didn't make any big difference.
I think I've exhausted all the fixes I could find here and on the internet: ran memtest and prime, tinkered with the vBIOS switch, Wattman fans, underclocking, voltage settings, the TDR registry... The only thing I can't do is swap parts, because I don't have any spares.
So the question stands: can this be fixed with drivers or some other software change, or is this a hardware fault that needs an RMA of the card?
Specs:
Download OCCT (free diagnostic software) and run the GPU test with the GPU ERRORS box checked. This will stress your GPU and also verify that it isn't producing any errors while under stress. I have read that poor power delivery to the GPU card will sometimes produce artifacts. With OCCT you can check your PSU outputs to make sure they are within range. This should rule out the GPU and PSU hardware, at least initially.
By the way, Sapphire has a very poor warranty (two years) and is very difficult to deal with for an RMA.
How long should I run it for proper testing?
Also, I'm not sure how the Sapphire warranty works here (Europe), since I RMA the card at the shop that sold it to me and they deal with it. Luckily I still have 3/4 of a year of warranty left.
It's a bit sad that I'm only learning these things about Sapphire now; from what I read before buying, it was all "the EVGA of AMD" and other praise for them.
Generally you run the OCCT GPU test with error checking for about one hour in FULLSCREEN.
Don't check your PSU with OCCT unless you have a high quality PSU. I found out that a cheaply made, low quality PSU can be damaged by running the OCCT PSU test. If you run it and the computer reboots as soon as it starts, then you may want to take a closer look at your PSU. If the PSU test starts and runs, keep a very close eye on it to make sure it is not being stressed to the point where it might get damaged.
Otherwise, while running the GPU test, OCCT shows you all PSU outputs during the testing. So, if you see anything strange in the PSU outputs, then you can try running the OCCT PSU test. But be careful and watchful while doing so.
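For judging whether the PSU outputs are "within range", here is a minimal Python sketch, assuming the usual ATX tolerance of +/-5% on the +12V, +5V and +3.3V rails. The function names and the sample readings are made up for illustration, and keep in mind that voltages read through software sensors can be inaccurate, so treat the result as a hint rather than a verdict.

```python
# Nominal ATX rail voltages and their allowed tolerance (fraction of nominal).
# Assumption: +/-5% on the positive rails, per the ATX specification.
ATX_RAILS = {
    "+12V": (12.0, 0.05),
    "+5V": (5.0, 0.05),
    "+3.3V": (3.3, 0.05),
}


def rail_limits(rail):
    """Return the (low, high) voltage limits for a rail."""
    nominal, tolerance = ATX_RAILS[rail]
    return nominal * (1 - tolerance), nominal * (1 + tolerance)


def out_of_range(rail, samples):
    """Return the samples that fall outside the allowed range for a rail."""
    low, high = rail_limits(rail)
    return [v for v in samples if not (low <= v <= high)]


# Hypothetical +12V readings copied by hand from a monitoring graph.
readings_12v = [12.10, 12.05, 11.90, 11.35, 12.00]
print("limits:", rail_limits("+12V"))
print("out of range:", out_of_range("+12V", readings_12v))
```

With these sample numbers only the 11.35 V reading is flagged, since the lower limit for +12V works out to about 11.4 V. Small dips that stay inside the limits are not flagged by this check.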
Someone else here in the Forum purchased a Sapphire GPU card in Europe, since he quoted in Euros rather than Dollars. Here is what he had to say about the Sapphire warranty after his card went bad 1.5 years after purchase:
grantelb4rt Apr 5, 2018 3:25 AM (in response to Ellis Rodriguez)
I don't think it overheated; I was only browsing the web... The cooler on that card is so massive that, by factory default, the fans are stopped at low loads and temperatures below 60° (or something around that).
PSU could be another thing, but I don't have another PSU lying around with 2x 8-pin PCI-E connectors.
I already contacted Sapphire, and they just got back to me. They said they won't do the RMA and want me to send the card to the store where I bought it. The store told me that they would send the card in for repair, but if they can't repair it I would get the time value of the card... which is only 150€, and with that money I can't get anything close to a Fury performance-wise.
Here is the recent thread about the other Sapphire GPU card from the comment above: Sudden death of Sapphire R9 Fury?
Did that for an hour: 0 crashes, 0 errors, though the graphs of +12V and the others look a bit weird. Not sure if they're supposed to look like this with those dips; also, the test was static.
[image]
Not sure if it's correct or not.
In the meantime I tried reflashing the vBIOS that comes with the card, but I still get the same crashes in game.
It seems like it is not a hardware problem. Try pokester's suggestion of configuring the GPU card. He says it fixed it for him.
Did that as the first thing when the problems started. I also replied to him about it yesterday, but somehow it got stuck in mod approval.
I think the last thing I can try is reinstalling the OS; after that I really don't know what else.
Try contacting Sapphire Support and see what they recommend before reinstalling the OS.
If you believe the OS may be the problem, run SFC /scannow from an elevated command prompt to check for missing or corrupted core Windows files. It should come back with "Windows Resource Protection did not find any integrity violations" or something similar.
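If you want to sort the SFC result into "fine" or "needs attention", here is a small Python sketch that classifies the summary text you paste into it. The phrases are the summary messages as documented by Microsoft; the exact wording may differ between Windows versions, and the function name and verdict labels are just made up for this example.

```python
# Summary phrases printed by `sfc /scannow` when it finishes, mapped to a
# short verdict. Assumption: wording as documented by Microsoft; it may
# vary slightly between Windows versions.
SFC_VERDICTS = [
    ("did not find any integrity violations", "clean"),
    ("found corrupt files and successfully repaired them", "repaired"),
    ("found corrupt files but was unable to fix some of them", "unrepaired"),
    ("could not perform the requested operation", "failed"),
]


def classify_sfc_output(text):
    """Return a verdict for pasted SFC output, or 'unknown' if unrecognized."""
    lowered = text.lower()
    for phrase, verdict in SFC_VERDICTS:
        if phrase in lowered:
            return verdict
    return "unknown"


pasted = "Windows Resource Protection did not find any integrity violations."
print(classify_sfc_output(pasted))
```

A "clean" or "repaired" verdict suggests the core Windows files are probably not the cause; "unrepaired" or "failed" would be a reason to look further at the OS before blaming the card.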
Before reinstalling the OS, make a system backup image first (on a separate hard drive). That way, if the same problems continue after reinstalling the OS, you can always restore the original Windows installation to the way it was before, saving yourself hours of reinstalling all your settings and apps.
You can always install the GPU card in another computer to see if the problem shifts over to the other computer. If it does, then it definitely has something to do with the GPU card.
I absolutely agree with everything you recommended. I would add that I have found it very beneficial to have a spare hard drive on hand. Anytime I think an OS reinstall is in order, I test it on that drive first. Then I can see if that fixes things without wiping out my installation. Just an idea; to me a drive is much cheaper than my time. I will also boot Linux from a USB drive or DVD to see whether components are working from there when troubleshooting.
Have you tried increasing your power limit slider under Wattman to its maximum? Move it all the way to the right, likely either +25 or +50 for that card. Just max it.
The "game freezes with black screen while audio gets stuck in mini loop" is often fixed with this simple setting. My RX 580 did the same thing. Make sure your power settings for Windows are on High Performance too.
I didn't think artifacts could be caused by a misconfigured GPU card. Good thing you brought that up.
Yep, I get those same lines when I forget to turn my power limit back up after it gets reset.
Yup, done that as first thing when problems started.
Okay then. So you are on High Performance too? Have you disabled all the MS Windows game settings like DVR? Disabled Fast Startup (Windows 10 only)?
Did this issue suddenly show up? One day it was fine, the next it wasn't? Did the issue accompany a driver change? Have you run DDU and reinstalled the driver clean? If you have not done that, I absolutely would give that a try.
From there, if this persists we can try some different settings on fan and temperature controls but I would not go there until you have tried these other things.
If that doesn't do it then I am not sure. You may have a card problem.
Just reply back with the results as you go please. Hopefully you don't have to get a new card.
I'm on High Performance and even disabled fast boot in the BIOS.
It started suddenly (I don't play games that often), with Nioh as the first game, and from there it was in every game. I always uninstall with DDU.
I already tried the temperature fixes found in other threads (max temperature and target temperature, setting one frequency at all levels, same with voltage; I also tried a negative power limit, but I think that was even worse).
There are definitely different fixes for the different cards. It seems the R9 series was hit hardest by driver changes for some reason. On my RX 580 I found that bringing the max temp down to 60 (the left yellow slider), moving the right slider all the way down, and then changing the fan so it could run at max speed when needed fixed my temperature instability issues. I don't even think it was temperature as much as changing the threshold where throttling starts that helped.
I really wish they would just release a version of the drivers without Wattman for all the people who are still having issues. Of course, it is entirely possible that in your case it is hardware. It's just hard to believe, knowing all the known problems with the drivers.
You might post a screenshot of your Radeon Settings, in case it shows something where someone can say, hey, change this.
I'll try your tip. But it's weird, because even Wattman crashes a lot whenever I play with frequencies or other stuff, which wasn't happening before.
Edit: well, it doesn't make any difference, unfortunately.
I don't ever mess with frequencies or custom voltages. It crashes when I have tried too.
I did some testing and it's crashing in 3DMark and Valley too. Sometimes it runs 30 minutes with no problem (Valley), but in 3DMark it crashes really quickly, only about 1-2 minutes into the test.
And when cleaning, I noticed some weird stuff:
the contacts on one fan are somehow a darker color than on the other one.
And I somehow feel that one fan is weaker. Do they send replacement fans to Europe too, or is it just a USA thing?
And the cooling pad, or whatever it's called, has a weird shape to it; there's also some kind of metal sheet under it with the same problem.