Hi, I've been struggling with this issue for a couple of weeks now, with about one BSOD a day. It's usually the same BSOD, or a game will crash to a black screen before coming back with visual glitches in the corner and clusters of off-color pixels. The GPU usage lights on my Vega card go dead, which shows that the card shut down. I have a couple of .dmp logs, but only 2 full memory dumps, because I didn't think to save those individually before. I've narrowed it down to either a bad PSU or a bad GPU, and I'm looking into RMAing my card soon because I'm pretty frustrated with it.
The card seems to crash under anything from light to heavy load, but only in actual games like Apex Legends, Risk of Rain 2, or MTG Arena. It never crashes during extended stress tests such as 30 minutes of MSI Kombustor, or Unigine Valley or Superposition. It was also fine during Fire Strike.
Most of the crashes were Thread Stuck in Device Driver, but I also got one Video TDR Failure and one System Service Exception when I swapped the spare card in without uninstalling the old drivers, but it's fine now.
What I've done to narrow it down:
-Disabled all CPU/RAM overclocks and switched the GPU to its spare BIOS, which has a power limit.
-Tried a spare card that's weaker and older (750 Ti) and repeated the same things that seemed to trigger the problem (loading screens in some games).
-Ran Memtest86 and Prime95 for hours each, with no errors. Also tested GPU memory using MemtestCL and had 0 errors.
I have the .dmp files available to post, but since I changed to my old card, the faulting module appears to just say dxgkrnl.sys by default instead of the atikmpag.sys and atikmdag.sys it listed before. The error codes might still be intact, however.