As per the discoveries in the older thread, core state 0 (and similarly HBM states 0-1) supplies inadequate voltage to the Vega SoC for it to remain stable, particularly under certain application workloads, VSR configurations and/or multiple displays. The instability causes the card to hard-lock and shut down its PCIe link to the system, which in turn causes a whole-system halt outside of OS control (i.e. no BSOD or driver control, just a complete system hang with audio buzz/crackle/loop).
This occurs on several systems and several cards, including but not limited to the Vega 64 air and liquid models, the Vega 56, and probably the Vega FE (not yet confirmed), across reference and aftermarket models...
Affected setups include the i7 3770K, i7 4790K, Ryzen 1600X, Ryzen 1800X, i7 6900K, X370 boards, 750 W and 1000 W Platinum and Gold PSUs, and so on...
It has not been fixed since the release driver, so all known driver releases for Vega are affected...
It does not, however, affect Vega APU models, as the SoC voltage on those is fixed at 1.1-1.125 V, which is the common load voltage across all Vega models.
From current experimentation, 0.850 V is the minimum stable SoC voltage, whereas most Vega cards dip to 0.740-0.800 V, which causes either immediate or delayed SoC failures when certain workloads are applied (as per the first paragraph).
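As a rough illustration of that threshold, here is a minimal Python sketch that flags logged SoC voltage samples falling below 0.850 V. Only the 0.850 V figure comes from the experimentation above. The function name, the sample readings, and the idea of feeding it values copied from a sensor log are assumptions made for the example.

```python
# Hypothetical helper: the 0.850 V floor is the figure reported in this thread;
# the names and the sample data below are illustrative only.
MIN_STABLE_SOC_V = 0.850


def undervolted_samples(samples_v, floor_v=MIN_STABLE_SOC_V):
    """Return (index, voltage) pairs for readings strictly below the floor."""
    return [(i, v) for i, v in enumerate(samples_v) if v < floor_v]


if __name__ == "__main__":
    # Made-up readings in volts, e.g. copied by hand from a monitoring tool.
    readings = [1.100, 0.900, 0.800, 0.740, 0.850]
    for index, volts in undervolted_samples(readings):
        print(f"sample {index}: {volts:.3f} V is below {MIN_STABLE_SOC_V:.3f} V")
```

A reading of exactly 0.850 V is treated as acceptable here, since that is the value reported as the minimum stable voltage.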
Is there a solution for this yet?
Still waiting for an official response...
This has now worsened with 18.8.1: you can readily trigger complete Vega SoC failures with just a single display. What differs in this case is that the Vega display now blacks out, as opposed to remaining frozen or showing a random colour. The triggers remain the same idle or state-0 2D loads as before, and the system still remains unrecoverable due to the complete PCIe link failure. Because of the nature of the failure, it will often cause motherboard BIOSes to reset or enter safe mode, as happens with the Crosshair Hero.
Same here, Vega FE crashes even using the native Adrenalin Driver 19.4.3. Random dxgkrnl.sys + atikmpag.sys BSODs (per a dump file viewer) occur when the system is idle, especially when the monitor is in power-saving mode. However, the system is very stable when the GPU is in use.
Installing an older Pro driver, switching to the gaming driver, and setting the minimum state to 1 under WattMan fixes the random crashes. Not a single crash for a month without rebooting.
With the gaming driver option now officially discontinued, please fix this, since we don't have WattMan anymore.