Symptom:
Random reboots that ALWAYS occur under 2D graphics load, with sound and video playing in the Edge browser.
Event Viewer report (NO BSOD, just a system reboot):
The system has rebooted without cleanly shutting down first. This error could be caused if the system stopped responding, crashed, or lost power unexpectedly.
Reported by component: Processor Core
Error Source: Machine Check Exception
Error Type: Cache Hierarchy Error
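For anyone who wants to pull the same entries without clicking through Event Viewer, here is a minimal sketch in Python that wraps the built-in `wevtutil` tool. It assumes the Machine Check Exception entries are logged in the System log by the `Microsoft-Windows-WHEA-Logger` provider; the function names `build_whea_query` and `read_whea_events` are made up for illustration.

```python
import subprocess

# Assumption: the Machine Check Exception entries are written to the System
# log by this provider. Adjust the name if your log shows a different source.
WHEA_PROVIDER = "Microsoft-Windows-WHEA-Logger"


def build_whea_query(count=5):
    """Build a wevtutil command line that lists the newest WHEA events."""
    xpath = f"*[System[Provider[@Name='{WHEA_PROVIDER}']]]"
    return [
        "wevtutil", "qe", "System",
        f"/q:{xpath}",   # XPath filter on the provider name
        f"/c:{count}",   # maximum number of events to return
        "/rd:true",      # reverse direction: newest events first
        "/f:text",       # plain text output instead of XML
    ]


def read_whea_events(count=5):
    """Run the query (Windows only) and return the raw text output."""
    result = subprocess.run(
        build_whea_query(count), capture_output=True, text=True, check=True
    )
    return result.stdout
```

On the affected machine, `print(read_whea_events())` should show the most recent hardware error reports, which makes it easier to compare timestamps against each reboot.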
EDIT (found GUID in registry):
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\WINEVT\Channels\Microsoft-Windows-Kernel-Power/Diagnostic
Name: Microsoft-Windows-Kernel-Power
Guid: {331c3b3a-2005-44c2-ac5e-77220c37d6b4}
Keywords: 0x8000400000000002
SystemTime: 2021-05-09T12:53:08.0082874Z
ConnectedStandbyInProgress: false
SystemSleepTransitionsToOn: 148
CsEntryScenarioInstanceId: 0
CsEntryScenarioInstanceIdV2: 0
LongPowerButtonPressDetected: false
System Specs:
ASUS TUF GAMING X570 Pro Wifi with BIOS 3801 <--- SUSPECT AGESA; I will roll back to 3602 (3/12/2021), from before the 5000 series updates
64GB of Corsair RAM, model CMW32GX4M2Z3200C1600 (Corsair Vengeance RGB Pro 32GB (2x16GB) DDR4 3200 (PC4-25600) C16 AMD Optimized Memory)
The installed RAM has RGB and a fan cooler
EVGA SuperNova G3 Gold 1000W PSU
Nvidia Asus GeForce ROG Strix 1070 Ti 8GB GPU, driver version 466.27
Sabrent Rocket PCI-E Gen 4 1TB with large heatsink
Corsair iCUE H150i RGB 360mm water cooler
Creative Soundblaster AE5
Logitech Z906 Surround Sound THX-Certified 5.1 Speaker System
Corsair K95 Platinum XT RGB keyboard
Corsair Dark Pro RGB wireless mouse
Western Digital RED 6TB backup and data storage drive
Microsoft Windows 10 Professional version 20H2
AMD Chipset driver version 2.13.27.501
CyberPower CP850AVRLCD Intelligent LCD UPS System, 850VA/510W, 9 Outlets, AVR, Mini-Tower, Black
-------------------------------------------------------------------------------------------------------------------------
Summary of problem and attempts to resolve:
Here is what I've done to locate the problem, which I believe is caused by the CPU.
1) Replaced the NVMe drive with a PCI-E Gen 4 upgrade
2) The Corsair RAM is rated for 3200. Turned DOCP off and it ran at the default 2600.
3) Replaced and upgraded the motherboard from an Asus X570 PLUS Wifi to an Asus X570 PRO Wifi.
4) Flashed and upgraded the BIOS on both motherboards to the latest versions.
5) Swapped the power supply (I have two) between the G2 750 PSU and the G3 1000 PSU
6) Reinstalled Windows in all cases
7) Upgraded the water cooler from a Cooler Master 270mm to the Corsair 360mm, so it's not heat related at all
8) Everything is run at defaults in the BIOS and nothing is overclocked manually
9) All drivers, including chipset, are installed and recent in all cases
The only thing NOT replaced is the CPU itself and a 3900XT is not cheap so it's the last resort. I like to do video processing, gaming, and the system definitely can do it all and is well prepared for years to come! But the CPU or AMD's AGESA PI more than likely took a dump.
The Windows event error lines up exactly with the problem, based on the time it happened. It is occurring more frequently now. Finding it was not cheap either, so I'm a bit ticked off.
The CPU will be RMA'd if flashing back to AGESA V2 PI 1.2.0.1, the version from before the USB fix was put in for the 5000 series, does not fix it. I posted this for the benefit of others and so AMD can see the system setup and what was done to attempt to fix it.
<<<<SUSPECTED ISSUE: PI 1.2.0.2 fix for USB issues on the 5000 series when you have a 3000 series installed>>>>
Version 3801 Beta Version
2021/04/09 21.01 MBytes
TUF GAMING X570-PRO (WI-FI) BIOS 3801
"- Update AMD AM4 AGESA V2 PI 1.2.0.2
- Fix USB connectivity issue"
It has taken roughly a month to isolate it by upgrading parts. It was worth going from a PCI-E Gen 3 NVMe to Gen 4 anyway. I will be letting support know my displeasure at how far I've had to go to isolate the CPU or BIOS as the issue. But after researching this more, before I do this RMA I'm going to run the previous version of the BIOS, 3602 from March 12th, 2021, with AGESA V2 PI 1.2.0.1, and will report back if it does it again. I'm not rude, and it was my choice to spend the additional funds in the end, but I wanted it found and out of my computer. Others experiencing this must be infuriated and frustrated!
NOTE:
AMD mentions that turning DOCP on could affect PCI-E Gen 4 devices.
I can tell you it does not affect the Sabrent Rocket. I tested BEFORE and AFTER the swap from a Crucial P1 1TB NVMe PCI-E Gen 3, and both drives show the reboot and the cache hierarchy error. Whether DOCP is on or off, and whether the drive is Gen 3 or Gen 4, has made no impact.
NOTE 2:
I've learned that HWiNFO may also be causing it, so I'll be uninstalling it and reporting back.