I'm sorry in advance for the long post! I recently upgraded from my HD 7870, which was fully stable, to a Vega 64 to better utilize my FreeSync monitor. I've been troubleshooting crashes for days now. I'm able to replicate the problem consistently, but sometimes the crashes come randomly, when I'm not gaming or doing anything particularly taxing on the card. After looking up and down the forums, I find people have experienced similar issues, but either AMD hasn't addressed them or the users who solved their issues haven't reported back with a solution that fits my circumstances. Please help!! I've had numerous crashes of the following characteristics:
All with no BSOD.
My setup is this:
I can replicate the third issue simply by running Windows Experience Index. Without fail, it will crash and restart. Other graphics stress tests cause the monitor freezes. Crashes have also occurred after turning off FreeSync, using a single monitor, updating all drivers, and using the Power Save / Balanced / Turbo / Custom settings with more and less power draw. I've used Memtest86 to run a full memory check, with 0 errors. I've used CPU-Z to monitor and stress test the CPU, with no crashes or anything out of the ordinary. And I'm using HWMonitor to keep a close eye on all my temps and clocks, but nothing jumps out at me :<
I've used WhoCrashed, and it has only returned one error report (after crashing the 2nd mentioned way). This is what it said:
On Wed 12/19/2018 4:45:51 PM your computer crashed or a problem was reported
crash dump file: C:\Windows\Minidump\121918-20826-01.dmp
This was probably caused by the following module: ntoskrnl.exe (nt+0x4874EC)
Bugcheck code: 0x124 (0x0, 0xFFFFFA800E4F08F8, 0x0, 0x0)
file path: C:\Windows\system32\ntoskrnl.exe
product: Microsoft® Windows® Operating System
company: Microsoft Corporation
description: NT Kernel & System
Bug check description: This bug check indicates that a fatal hardware error has occurred. This bug check uses the error data that is provided by the Windows Hardware Error Architecture (WHEA).
This is likely to be caused by a hardware problem.
The crash took place in the Windows kernel. Possibly this problem is caused by another driver that cannot be identified at this time.
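For anyone reading along, the "Bugcheck code" line in that report can be decoded with a few lines of Python. This is only a rough sketch: the function names `parse_bugcheck` and `describe_bugcheck` are made up for illustration, and the meanings it prints are taken from Microsoft's reference for bug check 0x124 (WHEA_UNCORRECTABLE_ERROR), where the first parameter names the error source and the second is the address of the WHEA error record. It does not read the minidump itself.

```python
import re

# Meanings of parameter 1 for bug check 0x124, per Microsoft's bug check
# reference. Only two values are listed here; anything else falls through
# to a generic message.
WHEA_ERROR_SOURCES = {
    0x0: "machine check exception",
    0x4: "uncorrectable PCI Express error",
}


def parse_bugcheck(line):
    """Split a WhoCrashed 'Bugcheck code' line into (code, [parameters])."""
    match = re.search(r"Bugcheck code:\s*(0x[0-9A-Fa-f]+)\s*\(([^)]*)\)", line)
    if match is None:
        raise ValueError("no bugcheck code found in line")
    code = int(match.group(1), 16)
    params = [int(part.strip(), 16) for part in match.group(2).split(",")]
    return code, params


def describe_bugcheck(line):
    """Return a one-line, human-readable summary of the bugcheck line."""
    code, params = parse_bugcheck(line)
    if code != 0x124:
        return f"bug check {code:#x} (not covered by this sketch)"
    source = WHEA_ERROR_SOURCES.get(params[0], "other error source")
    return (
        f"WHEA_UNCORRECTABLE_ERROR, source: {source}, "
        f"error record at {params[1]:#x}"
    )


report_line = "Bugcheck code: 0x124 (0x0, 0xFFFFFA800E4F08F8, 0x0, 0x0)"
print(describe_bugcheck(report_line))
```

A first parameter of 0x0 means the CPU reported a machine check exception, which fits the "fatal hardware error" wording but does not by itself say which component is at fault.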
Also, AMD Radeon Settings frequently tells me that "Default Radeon WattMan settings have been restored due to unexpected system"
Might I just have a defective card? I don't know what hardware problem there could be! Thank you so much for any help you're able to offer me :< I hope I don't have to just return it for a new one, but it's my last resort if I can't find another way to fix it.
Is AMD ReLive replay working? I used to get crashes, with audio continuing, when the encoder crashed Radeon Settings. It usually happened when overclocked, but not always.
I'd definitely check the PCIe power cable that is connected to your GPU. I was having a similar issue, and eventually the PCIe port on my PSU was melting the 2x 8-pin power connector at the PSU. Eventually all my PCIe power cables were damaged and I couldn't use them anymore. The cables were impossible to remove without damaging the port on the PSU, as the plastic melted and jammed them into the port.
Check your PSU and GPU power cables aren't having similar issues.
Thanks for the reply! Actually, I never installed Relive since I didn't find use for it, but I will.
I'll take your advice and open it up and poke around.
I'm suspicious now of the power draw, since I ran Dark Souls 3 with max settings at 2K, and it didn't push my draw above 200W. I don't know if that's uncommon, but seeing as the card recommends a 750W PSU minimum, I have to wonder. Also, looking at Tom's Hardware, it says that at idle I should be drawing on average 18W, and with multiple monitors at least 25W. But I'm sitting idle at 3W with 2 monitors.
Idk how it varies or how uncommon this is, but I'll look into it and write back.
No luck. I opened up my desktop, tried reseating everything, swapped out cables, and I still ran into these problems.
Just to be sure I didn't damage any of my other components while installing my Vega, I swapped it out for my old 7870, ran some stress tests, and it went as smoothly as it ever has.
Regretfully, I'll have to replace the Vega. I'm sincerely hoping I just have a defective unit and a simple replacement will solve all my problems. I'm going to get that going today, and report back after I've received my new unit... Thanks again for your help!
My Vega 64 was acting up and rebooting the computer on any heavy testing. I found a 2-prong pin next to my 6-prong adapter on one of my cables not quite all the way plugged in. Pushing it in all the way fixed the problem.
Try to lock the Vega's HBM2 memory through WattMan. Set it to the highest state, state 3, as both min and max (left-click on state 3 and select min, then do it again and select max). Maybe this will help with stability.
Check Event Viewer for the times you have issues. Hive's suggestion regarding HBM2 is a great idea. Also try increasing the power limit to +50.
Event Viewer might give you some more clues. Also check which other programs or processes are active when the issue comes up, and see if it's happening on other occasions too.
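If clicking through Event Viewer gets tedious, the same check can be scripted. Here is a minimal sketch, assuming a Windows machine with the built-in `wevtutil` tool on the PATH. The function names `build_event_query` and `recent_system_errors` are just placeholders, and the query only pulls the newest Critical and Error entries from the System log, so you still have to match the timestamps against your crashes by hand.

```python
import subprocess
import sys


def build_event_query(max_events=20):
    """Build a wevtutil command listing the newest Critical/Error System events."""
    # Level 1 = Critical, Level 2 = Error in the Windows event schema.
    xpath = "*[System[(Level=1 or Level=2)]]"
    return [
        "wevtutil", "qe", "System",
        f"/q:{xpath}",        # XPath filter on the event level
        f"/c:{max_events}",   # maximum number of events to return
        "/rd:true",           # reverse direction: newest events first
        "/f:text",            # plain-text output instead of XML
    ]


def recent_system_errors(max_events=20):
    """Run the query and return wevtutil's text output (Windows only)."""
    if sys.platform != "win32":
        raise RuntimeError("wevtutil is only available on Windows")
    result = subprocess.run(
        build_event_query(max_events),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Calling `print(recent_system_errors(10))` right after a crash and reboot should show the most recent entries. Anything logged by a source such as WHEA-Logger around the crash time would be worth posting here, since it would line up with the 0x124 report above.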