I have been playing The Division and almost every time I get a crash at some point - the dreaded black screen / fans go to 100% / reboot.
I also quite frequently get the message 'Default Radeon Wattman settings have been restored due to unexpected system failure'.
I have a 16:10 (2560 x 1600) monitor with a fixed 60Hz refresh, so I have been using FRTC (60fps) combined with the Auto Undervolt GPU feature (using Radeon Settings 19.3.1). I don't get as many crashes when using just FRTC on its own. Perhaps FRTC and Auto Undervolt GPU don't play nicely together?
I have not yet tracked down the reason behind the game crashing, but have read that the Auto Undervolt GPU feature might be the culprit - but I have read opinions claiming that it is either too conservative or too aggressive compared with trial and error manual undervolting. I have also read that crashes might be caused by an uneven distribution of contact between the die and the copper base of the heatsink, causing localised heating seemingly beyond the capacity of the reference cooling solution to manage.
Can anyone give me some tips/tricks for a logical process to identify the cause? For example, how do I record the Junction Temp in real time while gaming? The standard Radeon Performance Monitoring Data recording does not record Junction Temp as far as I can tell, and I can't tell whether GPU-Z records it either. Am I missing something obvious in not knowing how to record Junction Temp to a file that I can look back at after recovering from a crash?
I really like Vega 20 in all its 7nm and 16GB of HBM2 glory, but it is a very expensive card and having crashes while gaming is quite frustrating. I am really hopeful to work towards understanding what the root cause is (heatsink vs Auto Undervolting vs driver vs Win10), but would be keen to hear of others' experience with this card and whether there are any areas I should focus on.
Sounds like voltage issues to me. Are you overclocking at all? If so, what happens on default settings?
Hi qwixt,
Thank you very much for responding. I can clarify that I am not overclocking, only undervolting. As my monitor has only a fixed 60Hz refresh (albeit at 2560 x 1600), I am using FRTC to cap at 60fps, but I am keen to run the Radeon VII as cool and quiet as I can, hence my using both FRTC and Auto Undervolt GPU combined.
I get the same thing. If I use auto undervolt, eventually it will crash. I have not tried without undervolting because it just gets way too hot. I can be stable in Heaven with a good undervolt overclock for hours and hours, but then Division or even desktop web browsing will crash and hard lock my PC. If I play without undervolting (keeping everything stock) the card just throttles.
Are you using DX12? DX12 has some issues. I fixed my Division 2 crashes on my Radeon VII here: https://www.reddit.com/r/thedivision/comments/b24geh/another_dx12_crash_fix_exit_discord/
Basically I disabled all overlays (including the forgotten AMD overlay), made sure Firefox and Discord weren't open (they cause DX12 crashes for some reason) and disabled fullscreen optimizations on the Division 2 executable.
You can use GPU-Z to monitor/log the temps and clocks
I don't see how DX12 is causing the crashes and lock-ups I have. I don't use any overlays at all. Running with just a decent undervolt. Temperatures don't go past 111 junction. Just getting frustrated now.
Just finished a 3 hour Division 2 session, no crashes, 1440p, max settings, uncapped fps with Uplay overlay + Radeon overlay
I have been playing The Division and almost every time I get a crash at some point - the dreaded black screen / fans go to 100% / reboot.
I have a Radeon VII, and I had a Vega 64 for about a year, both undervolted. That is definitely an undervolt crash.
Try to hard-reset your PC the moment you know that happened, don't let it crash all the way.
Here is my undervolt setting for my Radeon VII (note: each card is different)
but have read that the Auto Undervolt GPU feature might be the culprit
I think so too. I suggest not using the Auto Undervolt feature in its current state (driver 19.3.2).
The same goes for Auto Overclock, because with it we can't adjust voltage or fan curve. So it's running at high voltages, and the last two high dots in the auto fan curve settings are 60% at 94c and 75% at 105c.
For reference, 58% is 2750 rpm.
I don't get as many crashes when using just FRTC on its own.
Perhaps FRTC and Auto Undervolt GPU don't play nicely together?
No, they have nothing to do with each other.
how do I record the Junction Temp in real time while gaming as the standard Radeon Performance Monitoring Data recording does not record Junction Temp as far as I can tell
Unfortunately, the Junction Temp metric has been bugged in Radeon Settings since the card came out.
Two fixes:
1 - Keep Radeon Settings open (Global Settings > Global Wattman) and play the game.
2 - Install MSI Afterburner for performance monitoring only, don't let it tweak settings.
Here are the metrics for my 3 hour session in Division 2:
GPU Temperature 1 = Edge Temp
GPU Temperature 2 = Junction Temp
I had similar problems, random crashes here and there. Even when I was able to play for a while without crashing, every time I shut down my computer and turned it on the next day, for example, I got the message that Radeon Wattman settings were reset.
I removed any CPU/Mem overclocking, and since then I have only gotten one crash, and that may have been Div2's fault, not sure.
I run my RVII at 1000mV and 1200MHz mem. Recently, after that last crash, just for good measure and since I'm OC'ing the mem, I bumped to 1010mV. Only 1 gaming session since that, no problems.
Bottom line for me was my CPU/Mem OC, even though outside of gaming the system was stable with the mild CPU/Mem overclock I had. This must be a combination of RVII driver and mobo/CPU OC problem, I guess. Not sure, but I'll try to OC the CPU/Mem again one day. Maybe when the RVII drivers mature a little. The thing has only been out for around a month, and AMD is updating frequently!
I think it is fixed with the latest 19.3.3 version of Radeon Settings. I have just finished playing The Division for 1.5 hrs with Auto Undervolt GPU enabled and FRTC set to 60 fps and no other overclocking (GPU/CPU/Memory) and did not have a crash. I used the latest version of GPU-Z (2.18.0) to measure the card while gaming and recorded it to a .txt file. Max GPU edge temp reached was 87 degrees, max Hot Spot (which I have read is Junction Temp in GPU-Z) was 111 degrees and max voltage reached was 1.062 volts (my stock voltage at 1801MHz is 1.115 volts). Seems like AMD are still refining Radeon Settings for the RVII - all good!
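For anyone who wants to pull the maximums out of a GPU-Z log after a session instead of scrolling through the .txt file, here is a minimal Python sketch. It assumes the log is comma-separated text with a header row; the column names and sample rows below are made up for illustration, so check them against the header line of your own log.

```python
import csv
import io

# Hypothetical sample in the style of a comma-separated sensor log.
# Column names and values are illustrative assumptions, not real GPU-Z output.
SAMPLE_LOG = """\
Date , GPU Temperature [C] , GPU Temperature (Hot Spot) [C] , GPU Voltage [V] ,
sample 1 , 74.0 , 96.0 , 1.031 ,
sample 2 , 87.0 , 111.0 , 1.062 ,
sample 3 , 80.0 , 102.0 , 1.050 ,
"""


def read_sensor_log(text):
    """Parse log text into a list of dicts, one per sample row."""
    reader = csv.reader(io.StringIO(text))
    header = [name.strip() for name in next(reader)]
    rows = []
    for fields in reader:
        values = [value.strip() for value in fields]
        if not any(values):
            continue  # skip blank lines
        # Drop the empty column created by a trailing comma.
        rows.append({h: v for h, v in zip(header, values) if h})
    return rows


def column_max(rows, column):
    """Return the largest numeric value in a column, or None if unavailable."""
    values = []
    for row in rows:
        try:
            values.append(float(row[column]))
        except (KeyError, ValueError):
            continue  # missing column or non-numeric cell
    return max(values) if values else None


rows = read_sensor_log(SAMPLE_LOG)
for name in ("GPU Temperature [C]", "GPU Temperature (Hot Spot) [C]", "GPU Voltage [V]"):
    print(name, "max:", column_max(rows, name))
```

To use it on a real session, read the log file into a string and pass that to read_sensor_log in place of SAMPLE_LOG.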
Good to hear, with any overlay enabled? Uplay, afterburner, Radeon overlay?
If you are running it stock without overclocking like me, I still suggest manual undervolt + custom fan profile.
Edge Temp went down from 75c to 64c
Junction temp from 105c to 88c
I think I will take your advice and play a bit with manual undervolting and fan profiles. I had another Auto Undervolt crash - dang! I was playing today with both default settings and then I resumed playing with Auto Undervolt while recording both sessions using GPU-Z. I was not playing with any overlay.
The max voltages recorded (default vs. auto UV) were 1.118 volts vs. 1.056 volts. The max Edge Temps were 81 degrees vs. 82 degrees and the max Junction Temps were 111 degrees vs. 109 degrees (not much change in either of these temps). The max GPU clock speeds reached were 1907 MHz vs. 1923 MHz (remembering that I am using FRTC set to 60 fps on a 2560 x 1600 monitor). The max power (as measured by GPU-Z) was 299 Watts vs. 274 Watts, albeit these were not at the same part of the game.
When the game crashed the Edge Temp was only 74 degrees and Junction Temp was only 86 Degrees - so it is not a temp issue, but rather I think the Undervolt is slightly too aggressive for this particular game.
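Checking what the card was doing right before the crash comes down to reading the last few samples in the log. A rough Python sketch of that is below, assuming a comma-separated log with a header line. The column names and sample rows are hypothetical, and after a hard reboot the final line of a real log may be incomplete or missing.

```python
from collections import deque

# Hypothetical log excerpt; names and values are for illustration only.
SAMPLE_LOG = """\
Date , GPU Temperature [C] , Hot Spot [C] , GPU Voltage [V]
sample 1 , 81.0 , 108.0 , 1.056
sample 2 , 78.0 , 95.0 , 1.043
sample 3 , 74.0 , 86.0 , 1.037
"""


def last_samples(lines, count=5):
    """Return (header, rows) where rows are the last `count` data rows."""
    iterator = iter(lines)
    header = [field.strip() for field in next(iterator).split(",")]
    tail = deque(maxlen=count)  # keeps only the most recent rows
    for line in iterator:
        if line.strip():
            tail.append([field.strip() for field in line.split(",")])
    return header, list(tail)


header, rows = last_samples(SAMPLE_LOG.splitlines(), count=2)
print(header)
for row in rows:
    print(row)
```

Since a file object yields lines too, an open log file can be passed straight to last_samples without loading the whole thing into memory.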
higgih01 wrote:
but rather I think the Undervolt is slightly too aggressive for this particular game.
You are correct, it definitely depends on the game.
I had a Vega 64 for about a year and a half. Undervolted from stock 1200mV to 995mV. I got more performance because it didn't throttle like stock. Temps went down from 85c to 71c under long 3-6 hour sessions. 1440p, max settings, FRTC 87 with FreeSync (fps runs under that, so full GPU usage)
Stable for 8 months without crashing, until I played Rise of the Tomb Raider. The game kept crashing until I raised the voltage from 1000mV to 1030mV. It was the only game that needed that voltage raise, even though I capped the fps (85% GPU usage)
Here are my Radeon VII settings,
I'm trying a new undervolt, from 982mV to 977mV.
Fan profile:
- 1st dot, default
- 2nd dot, 75c - 45%
- 3rd dot, 81c - 51%
- 4th dot, 85c - 58%
- 5th dot, 93c - 62%
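To see roughly what fan speed a profile like this asks for at a given temperature, here is a small Python sketch. It assumes the fan percentage is interpolated linearly between dots, which is my assumption about how Wattman treats the curve, and it leaves out the first dot because its default values are not listed above. Temperatures outside the listed range are simply clamped to the nearest dot.

```python
# Dots 2-5 from the profile above as (temperature in C, fan percent).
FAN_CURVE = [(75.0, 45.0), (81.0, 51.0), (85.0, 58.0), (93.0, 62.0)]


def fan_percent(temp_c, curve=FAN_CURVE):
    """Estimate fan percent at temp_c by linear interpolation between dots."""
    if temp_c <= curve[0][0]:
        return curve[0][1]  # simplification: clamp below the first listed dot
    for (t0, p0), (t1, p1) in zip(curve, curve[1:]):
        if temp_c <= t1:
            return p0 + (temp_c - t0) / (t1 - t0) * (p1 - p0)
    return curve[-1][1]  # clamp above the last dot


for temp in (70, 78, 83, 90, 95):
    print(f"{temp}c -> {fan_percent(temp):.1f}%")
```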
Two hours with no crashes in Division 2 yesterday. Devil May Cry 5 @ 4k and Far Cry 5. I'll raise it if it crashes.
Wow - there is something seriously not quite right with the Auto Undervolt GPU feature in Radeon Wattman at the moment. My manual undervolting was a much better solution. I hope this feature gets better with new driver revisions (I am on 19.3.3), but at the moment I would agree with some of you and avoid using it for now.
As for my manual undervolting thus far, I have reduced in steps down to 1006mV (my stock is 1118mV), but have not gone any lower as yet. I played for about 1.5hrs with NO crashes. At 1006mV the average Edge Temp is around 75 degrees and the Junction Temp around 90 degrees. Power fluctuated, but the max power was 242 W (down from 300W), and the average was much lower. The thing that was most amazing and enjoyable to me was that most of the time the fan speed was a pretty constant 895 rpm - virtually silent. I did not change the fan profile. Remember that I am limiting my fps to 60 with FRTC and gaming at 2560x1600 (I have a 16:10 panel). Talk about great rewards for only a little effort and lots of fun experimenting - gotta love this card! Thank you AMD for bringing fun, creativity and experimentation to my PC gaming experience! Next step is to push below 1000mV and see what I get, albeit I am mindful that my particular GPU, at a stock voltage of 1118mV, is on the upper edge of average.
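As a rough sanity check on those power numbers, dynamic power in a chip scales approximately with voltage squared at a fixed clock. The Python sketch below applies only that rule of thumb to the figures in this post. It ignores leakage, memory power and any clock differences, and the measured maximums came from different moments in the game, so treat the result as a ballpark only.

```python
def scaled_power(power_w, old_mv, new_mv):
    """Scale power by (new voltage / old voltage) squared, clock held fixed."""
    return power_w * (new_mv / old_mv) ** 2


# Figures from the post: about 300 W max at 1118 mV stock, undervolted to 1006 mV.
estimate = scaled_power(300.0, 1118.0, 1006.0)
print(f"estimated max power at 1006 mV: {estimate:.0f} W")
```

The estimate comes out at around 243 W, close to the 242 W maximum measured at 1006mV.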
I was having random hangs playing the Division 2, and my room can get warmish. I started running the fans pretty high to keep temps down, and that helped a lot. A month ago I grabbed an EK GPU water block, and that seems to have solved the glitches - the Division 2 is very demanding at high / ultra settings. I get very nice auto overclocks with no problems now.
PS I just love my Radeon 7