Describe your system:
- Graphics Card:
SAPPHIRE Vega 56 Pulse
ASRock B365 Pro4, BIOS 4.00 (latest)
Kingston HyperX Predator CL13, 2666 MHz
be quiet! pure power 11 500W
Windows 10 Pro 1809
- Graphics Driver:
25.20.15031.9002 (Adrenalin 19.4.3) DCH / Win 10 64
Describe your issue:
When playing Aliens: Colonial Marines (at 1440p Virtual Super Resolution and the highest settings, for that matter), the graphics card becomes unresponsive every now and then (twice so far, both times within the first few minutes) and Windows resets the graphics driver.
I reset my motherboard BIOS to defaults, have never overclocked the GPU and even used DDU to start from a clean slate, all to no avail. Disabling Windows' graphics watchdog (TDR) in the registry only caused a system lockup instead of a driver reset, as could be expected. The PSU puts out cold air and a watt meter on the wall socket fluctuates between 160W and 230W power draw. GPU-Z shows 148W as the maximum, indicating that the card operated well below its 180W factory power limit.
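As a rough sanity check on those power figures, here is a minimal Python sketch. It only uses the numbers quoted above; the variable and function names are made up for illustration, and treating the wall reading as an upper bound on PSU output is my assumption (the wall figure includes conversion losses).

```python
# Figures quoted in this post; names are illustrative only.
GPU_POWER_LIMIT_W = 180   # factory power limit of the card
GPU_PEAK_DRAW_W = 148     # maximum reported by GPU-Z
WALL_PEAK_DRAW_W = 230    # highest reading on the wall-socket watt meter
PSU_RATED_W = 500         # rated output of the power supply


def headroom(limit_w, draw_w):
    """Return (headroom in watts, draw as a fraction of the limit)."""
    return limit_w - draw_w, draw_w / limit_w


gpu_margin_w, gpu_fraction = headroom(GPU_POWER_LIMIT_W, GPU_PEAK_DRAW_W)
# Assumption: wall draw includes PSU losses, so it bounds PSU output from above.
psu_margin_w, psu_fraction = headroom(PSU_RATED_W, WALL_PEAK_DRAW_W)

print(f"GPU: {gpu_margin_w} W below limit ({gpu_fraction:.0%} of limit)")
print(f"PSU: at most {psu_fraction:.0%} of rated output")
```

So by these numbers neither the card's power limit nor the PSU rating is anywhere near being reached.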
Here is a screenshot that shows the GPU sensors up to the point where I completed a mission and Alt+Tabbed to GPU-Z to have a look. When the crash happens it looks pretty much the same, except for a few seconds of no readings followed by bogus data:
What I noticed for the first time now (previously I was focused on temperatures, voltages and wattages) is how insanely high the GPU clocks without hitting the power limit (148W actual of 180W max). Shouldn't there be an absolute clock speed limit implemented somewhere?
It looks to me as if this particular game was using only parts of the GPU and left a lot of margin below the power limit, so the firmware or driver decided that power draw and temperatures were low enough to raise the core clock about a GHz above the maximum. Memory clock seems fine though. And as I said, I didn't modify any setting in WattMan. I looked at it to see all the fancy graphs and knobs, but never ever touched the settings.
For comparison, here is the graph from Unigine 2 - Superposition Medium:
Note how the GPU Clock hits precisely 1.5 GHz here. (The temperature spikes I would ignore. I don't know why the sensors drop out every now and then in Unigine 2. At any rate, no, the memory was never 1400°C hot. I would have noticed...)
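For anyone who wants to separate real readings from sensor dropouts like these, a small Python sketch of a plausibility filter follows. Everything in it is an assumption for illustration: the field names, the bounds and the sample values are made up and are not taken from GPU-Z or from any vendor specification.

```python
# Plausibility bounds are assumptions for illustration, not vendor limits.
PLAUSIBLE_RANGES = {
    "gpu_clock_mhz": (0, 2000),
    "memory_temp_c": (0, 120),
}


def split_samples(samples, ranges=PLAUSIBLE_RANGES):
    """Split samples into (plausible, suspect) lists.

    A sample is suspect if any bounded field is missing or out of range.
    """
    plausible, suspect = [], []
    for sample in samples:
        ok = all(
            key in sample and low <= sample[key] <= high
            for key, (low, high) in ranges.items()
        )
        (plausible if ok else suspect).append(sample)
    return plausible, suspect


# Made-up samples, loosely modelled on the readings described above.
samples = [
    {"gpu_clock_mhz": 1500, "memory_temp_c": 70},
    {"gpu_clock_mhz": 1500, "memory_temp_c": 1400},  # sensor dropout
    {"gpu_clock_mhz": 2500, "memory_temp_c": 68},    # implausibly high clock
]
good, bad = split_samples(samples)
print(len(good), "plausible,", len(bad), "suspect")
```

A filter like this cannot tell whether an out-of-range clock is a bogus reading or a real excursion; it only flags the samples worth a closer look.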