
Crash over 800 MHz HBM

Question asked by winzy on Dec 20, 2018

My recently purchased Gigabyte Radeon RX Vega 64 Gaming OC 8G crashes almost immediately at stock settings.
Running the Valley benchmark at stock settings causes a crash within 15 seconds: the screen either blacks out or, more commonly, fills with white static with some colors throughout (like an old TV on the wrong channel). The computer never recovers and has to be rebooted.
I have tried two different 850 W PSUs. I even got out my older tower running Windows 7, with all different hardware except the Gigabyte card, and was able to reproduce the same crash within seconds.

Every time I installed drivers I used DDU in safe mode. I have tried several older drivers with no change. My memory and CPU are currently overclocked, but I disabled all of that in earlier test runs and it did not change the failure. Temperatures were all fine at the time of the crashes, and I am out of ideas.

 

 

Operating System

Windows 10 Home 64-bit

CPU

AMD Ryzen 5 80 °C (for some reason Speccy says 80, but Ryzen Master reads sub-40 at idle)

14nm Technology

RAM

16.0GB Dual-Channel Unknown @ 1496MHz (16-17-17-35)

Motherboard

ASUSTeK COMPUTER INC. PRIME B450-PLUS (AM4) 30 °C

Graphics

Dell S2716DG (2560x1440@144Hz)

LG IPS FULLHD (1920x1080@60Hz)

8176MB ATI Radeon RX Vega (Gigabyte) 38 °C

Storage

465GB Samsung SSD 850 EVO 500GB (SATA (SSD)) 30 °C

465GB Samsung SSD 860 EVO 500GB (SATA (SSD)) 36 °C

Optical Drives

No optical disk drives detected

Audio

Realtek High Definition Audio

 

WhoCrashed

System Information (local)


 

Computer name: DESKTOP-DD03GQ6
Windows version: Windows 10 , 10.0, build: 17763
Windows dir: C:\WINDOWS
Hardware: ASUSTeK COMPUTER INC., PRIME B450-PLUS
CPU: AuthenticAMD AMD Ryzen 5 2600 Six-Core Processor AMD586, level: 23
12 logical processors, active mask: 4095
RAM: 17102905344 bytes total


 


Crash Dump Analysis


 

Crash dumps are enabled on your computer.

Crash dump directories:

C:\Windows

C:\Windows\Minidump

 

On Thu 12/20/2018 8:58:06 PM your computer crashed or a problem was reported
crash dump file: C:\Windows\Minidump\122018-5828-01.dmp
This was probably caused by the following module: ntoskrnl.exe (nt+0x1B1B40)
Bugcheck code: 0x100000EA (0xFFFFB08ED7CAB080, 0x0, 0x0, 0x0)
Error: THREAD_STUCK_IN_DEVICE_DRIVER_M
file path: C:\WINDOWS\system32\ntoskrnl.exe
product: Microsoft® Windows® Operating System
company: Microsoft Corporation
description: NT Kernel & System
Bug check description: This indicates that a thread in a device driver is endlessly spinning.
This appears to be a typical software driver bug and is not likely to be caused by a hardware problem.
The crash took place in the Windows kernel. Possibly this problem is caused by another driver that cannot be identified at this time.


On Thu 12/20/2018 8:58:06 PM your computer crashed or a problem was reported
crash dump file: C:\Windows\MEMORY.DMP
This was probably caused by the following module: atikmdag.sys (atikmdag+0x6B274)
Bugcheck code: 0xEA (0xFFFFB08ED7CAB080, 0x0, 0x0, 0x0)
Error: THREAD_STUCK_IN_DEVICE_DRIVER
file path: C:\WINDOWS\System32\DriverStore\FileRepository\c0337288.inf_amd64_3c3211f00f323cb5\B337205\atikmdag.sys
product: ATI Radeon Family
company: Advanced Micro Devices, Inc.
description: ATI Radeon Kernel Mode Driver
Bug check description: This indicates that a thread in a device driver is endlessly spinning.
This appears to be a typical software driver bug and is not likely to be caused by a hardware problem.
A third party driver was identified as the probable root cause of this system error. It is suggested you look for an update for the following driver: atikmdag.sys (ATI Radeon Kernel Mode Driver, Advanced Micro Devices, Inc.).
Google query: atikmdag.sys Advanced Micro Devices, Inc. THREAD_STUCK_IN_DEVICE_DRIVER


On Thu 12/20/2018 8:45:33 PM your computer crashed or a problem was reported
crash dump file: C:\Windows\Minidump\122018-5500-01.dmp
This was probably caused by the following module: atikmpag.sys (0xFFFFF801537804D0)
Bugcheck code: 0x116 (0xFFFFAF86977A4010, 0xFFFFF801537804D0, 0xFFFFFFFFC0000001, 0x3)
Error: VIDEO_TDR_ERROR
file path: C:\WINDOWS\System32\DriverStore\FileRepository\c0337288.inf_amd64_3c3211f00f323cb5\B337205\atikmpag.sys
product: AMD driver
company: Advanced Micro Devices, Inc.
description: AMD multi-vendor Miniport Driver
Bug check description: This indicates that an attempt to reset the display driver and recover from a timeout failed.
A third party driver was identified as the probable root cause of this system error. It is suggested you look for an update for the following driver: atikmpag.sys (AMD multi-vendor Miniport Driver, Advanced Micro Devices, Inc.).
Google query: atikmpag.sys Advanced Micro Devices, Inc. VIDEO_TDR_ERROR


On Thu 12/20/2018 8:29:44 PM your computer crashed or a problem was reported
crash dump file: C:\Windows\Minidump\122018-6015-01.dmp
This was probably caused by the following module: ntoskrnl.exe (nt+0x1B1B40)
Bugcheck code: 0x100000EA (0xFFFFCA8CF64CF080, 0x0, 0x0, 0x0)
Error: THREAD_STUCK_IN_DEVICE_DRIVER_M
file path: C:\WINDOWS\system32\ntoskrnl.exe
product: Microsoft® Windows® Operating System
company: Microsoft Corporation
description: NT Kernel & System
Bug check description: This indicates that a thread in a device driver is endlessly spinning.
This appears to be a typical software driver bug and is not likely to be caused by a hardware problem.
The crash took place in the Windows kernel. Possibly this problem is caused by another driver that cannot be identified at this time.
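
To make the pattern in the dumps above easier to see, here is a minimal sketch that tallies the bugcheck codes from the report. It is my own illustration, not WhoCrashed output. It assumes the `_M` minidump variants are the base code with the 0x10000000 bit set, which matches the codes listed here (0x100000EA next to 0xEA). The names `REPORTED`, `MINIDUMP_FLAG`, `base_code`, and `summarize` are hypothetical helpers.

```python
from collections import Counter

# Bugcheck codes copied from the WhoCrashed report, in the order listed.
REPORTED = [0x100000EA, 0xEA, 0x116, 0x100000EA]

# Assumption: the "_M" variants differ from the base code only by this bit.
MINIDUMP_FLAG = 0x10000000

# Error names as they appear in the report (base codes only).
NAMES = {
    0xEA: "THREAD_STUCK_IN_DEVICE_DRIVER",
    0x116: "VIDEO_TDR_ERROR",
}


def base_code(code: int) -> int:
    """Clear the assumed minidump flag so 0x100000EA maps to 0xEA."""
    return code & ~MINIDUMP_FLAG


def summarize(codes):
    """Count dumps per base bugcheck code."""
    return Counter(base_code(c) for c in codes)


if __name__ == "__main__":
    for code, count in summarize(REPORTED).most_common():
        print(f"{NAMES.get(code, 'UNKNOWN')} (0x{code:X}): {count}")
```

Under that assumption, three of the four dumps are the same stuck-thread error and the fourth is a TDR failure, all pointing at the display driver stack. The first two entries share a timestamp, so they are likely a minidump and a full dump of the same crash rather than two separate events.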

 

 

 

 

800 MHz

The strange part about all of this is that if I bring the HBM clock down to 800 MHz with WattMan and lock the card there, it seems to work fine, just not at optimal performance. My main reason for asking, and for describing these symptoms across different hardware, is to determine whether this is a hardware fault with the Gigabyte card and whether I should attempt to RMA it. I am new to troubleshooting GPU problems and want to be sure it's a hardware problem rather than some strange driver issue.

 

Also, it's not just the Valley benchmark that crashes it: it has also crashed when opening FFXIV, when changing settings in Overwatch, and in the Heaven benchmark.

 

Thank you for your time.
