bitcrusher

Vega 56 issues force hard reset and cause BSOD

Discussion created by bitcrusher on Jun 24, 2019

My system has been crashing at random (during gaming and general use) since I installed a Gigabyte Vega 56. Drivers are up to date. It has happened several times over the month or so that I've owned the card. It's extremely frustrating and I (and several others, it seems) need a fix.

 

System Info:

 

MOBO: MSI Z77A-G43 (latest BIOS installed)
CPU: Core i5 3570k @ 3.40 GHz
GPU: Gigabyte Vega 56 w/64 BIOS (Radeon Software v19.6.2)
RAM: 16GB (4x4GB) Patriot DDR3 @ 533MHz
SSD (Primary): 232 GB Samsung 860 EVO
HDD (Secondary): 931 GB Western Digital WDC WD1001FALS-00E3A0
HDD (Tertiary): 232 GB Seagate ST3250410AS
PSU: Corsair TX 750W 80+
OS: Windows 10 Pro ver 1809
Monitors:
    ASUS VE276 (main, HDMI)
    2x HP E222 (secondary, DP)

 

Scenario:
All monitors go black, system is unresponsive, and GPU fans hit max speed.
This forces a hard reset.
Upon boot I receive a message from the Radeon app stating "Default WattMan settings have been restored due to unexpected system failure."

 

Up to this point I had never altered the global WattMan settings from their defaults, "Balanced" and "Automatic".
I found that with these settings (again, the defaults), my fans were not spinning and the GPU was instead cooling itself passively.

 

Windows Reliability Monitor shows several hardware errors over the weeks since installing the GPU.

(yes, all of these are GPU crashes!)

This crash in particular reads as follows:
    Windows - Hardware Error - 6/23/19 12:24 PM
    Problem Event Name:    LiveKernelEvent
    Code:        141
    Parameter 1:    ffffe10521312460
    Parameter 2:    fffff80464110500
    Parameter 3:    0
    Parameter 4:    2fc
    OS version:    10_0_17763
    Service Pack:    0_0
    Product:    256_1
    OS Version:    10.0.17763.2.0.0.256.48
    Locale ID:    1033

 

I then set WattMan to Custom and switched the fan Speed/Temp control to Manual, with the power limit at 0, which allowed the fans to spin up.
After that I shut down my PC, allowed Windows Update to run, and ran some errands.

 

I came home and spent a few hours playing BeamNG.drive with no issues whatsoever.

 

After closing BeamNG.drive, I was simply scrolling through Twitter when my system crashed with a THREAD_STUCK_IN_DEVICE_DRIVER BSOD.

 

This crash in particular reads as follows:
    Windows - Shut down unexpectedly - 6/23/19 11:49 PM
    Problem Event Name:    BlueScreen
    Code:        100000ea
    Parameter 1:    ffffdd8c160d9080
    Parameter 2:    0
    Parameter 3:    0
    Parameter 4:    0
    OS version:    10_0_17763
    Service Pack:    0_0
    Product:    256_1
    OS Version:    10.0.17763.2.0.0.256.48
    Locale ID:    1033
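
For anyone trying to read the two reports above: as far as I can tell, the "Code" field is a hexadecimal Windows bug check code, and it can be matched to a documented name. Below is a rough Python sketch of that lookup. The table is typed in by hand and covers only a few video-related codes, and the helper name `describe_code` is made up for illustration, so treat it as a starting point rather than an authoritative decoder.

```python
# Hand-written lookup of a few Windows bug check codes related to GPU hangs.
# This is NOT a complete table; it only covers codes relevant to this post.
BUG_CHECKS = {
    0xEA: "THREAD_STUCK_IN_DEVICE_DRIVER",
    0x100000EA: "THREAD_STUCK_IN_DEVICE_DRIVER_M",
    0x116: "VIDEO_TDR_FAILURE",
    0x117: "VIDEO_TDR_TIMEOUT_DETECTED",
    0x141: "VIDEO_ENGINE_TIMEOUT_DETECTED",
}


def describe_code(code_text):
    """Hypothetical helper: map the hex 'Code' field to a bug check name."""
    value = int(code_text, 16)  # assumption: the field is hexadecimal
    return BUG_CHECKS.get(value, "unknown (0x%X)" % value)


# The two codes from the reports in this post.
print(describe_code("141"))
print(describe_code("100000ea"))
```

Both names point at the display driver or GPU failing to respond in time, which matches the black screens and the forced resets.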

 

Once the system had recovered, I disabled AUEPMaster through Radeon Settings, based on advice from another user here. DxDiag shows that this app had crashed:

 

    +++ WER4 +++:
    Fault bucket 1988328200873898034, type 4
    Event Name: APPCRASH
    Response: Not available
    Cab Id: 0
    
    Problem signature:
    P1: AUEPMaster.exe
    P2: 1920.1.7.612
    P3: 5d0109f8
    P4: AUEPMaster.exe
    P5: 1920.1.7.612
    P6: 5d0109f8
    P7: c0000005
    P8: 0000000000029f89
    P9:
    P10:
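
In case it helps whoever looks at this, here is a small Python sketch that labels the P1 to P8 fields of the signature above. The field names follow the usual APPCRASH layout as I understand it (application, faulting module, exception code, offset) and are my assumption, as is reading the timestamp fields as hexadecimal seconds since 1970. The names `FIELDS` and `parse_signature` are made up for illustration.

```python
from datetime import datetime, timezone

# Assumed meaning of P1..P8 in an APPCRASH problem signature.
FIELDS = [
    "app_name", "app_version", "app_timestamp",
    "module_name", "module_version", "module_timestamp",
    "exception_code", "exception_offset",
]

# The signature lines copied from the DxDiag output in this post.
SIGNATURE = """\
P1: AUEPMaster.exe
P2: 1920.1.7.612
P3: 5d0109f8
P4: AUEPMaster.exe
P5: 1920.1.7.612
P6: 5d0109f8
P7: c0000005
P8: 0000000000029f89
P9:
P10:
"""


def parse_signature(text):
    """Hypothetical helper: map 'Pn: value' lines to named fields."""
    result = {}
    for line in text.splitlines():
        key, _, value = line.partition(":")
        key = key.strip()
        if key.startswith("P") and key[1:].isdigit():
            index = int(key[1:]) - 1
            if index < len(FIELDS):  # P9 and P10 are empty here and skipped
                result[FIELDS[index]] = value.strip()
    return result


sig = parse_signature(SIGNATURE)

# 0xC0000005 is the NTSTATUS value STATUS_ACCESS_VIOLATION.
is_access_violation = int(sig["exception_code"], 16) == 0xC0000005

# Assumption: the timestamp is the binary's build time in Unix seconds.
built = datetime.fromtimestamp(int(sig["app_timestamp"], 16), tz=timezone.utc)

print(sig["module_name"], is_access_violation, built.date())
```

If those assumptions hold, the crash is an access violation inside AUEPMaster.exe itself rather than in a separate module, since P1 and P4 name the same file.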

 

Anyway, I'm at a loss here. I'm not technically savvy enough to know how to fix this and it took hours of research to dig up even this much information. I want to love this GPU but it has given me nothing but problems since day 1. Please fix this, AMD.
