A few other users and I have been suffering sudden reboots on our Ryzen-based systems (Ryzen 3000, but especially 5000) under different load conditions. The quickest way for us to trigger them, however, has been to run software designed to test RAM stability, such as TM5 or Karhu RAM Test.
We have recently discovered that this problem only occurs if we have HWiNFO loaded in the background on Windows 10. Most of us also have AMD Radeon graphics cards, but we have yet to determine whether that is a contributing factor. We don't know exactly where the conflict is, but the pattern is clear: after those sudden reboots, we see the dreaded WHEA-Logger Event ID XX Cache Hierarchy error in the Windows Event Viewer.
This has been tested by multiple users across different setups: different motherboard manufacturers, AGESA/BIOS revisions, RAM brands and configurations, and settings, and even after a fresh install of Windows (including different versions of the operating system). The only common denominator we have been able to find so far is the use of HWiNFO (we've only tested this with the latest versions, so we still don't know whether rolling back to a previous version would solve it).
I'm sharing this information here in the hope that this problem can be reproduced and fixed accordingly. Perhaps it will also require collaboration with the team behind HWiNFO. For that reason, I have also created another thread in their support forum: https://www.hwinfo.com/forum/threads/is-hwinfo-causing-the-dreaded-whea-logger-event-id-xx-cache-hie...
If anyone else is suffering from this problem and can reproduce it, please chime in and let us know. The more feedback and information we can gather, the better.
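For anyone who wants to report back, here is a minimal sketch of one way to pull the recent WHEA-Logger entries out of the System log so they can be pasted into a reply. It assumes Windows with Python installed, uses the built-in `wevtutil` command, and assumes the provider is registered as `Microsoft-Windows-WHEA-Logger` (the name Event Viewer normally shows). The function names are just illustrative.

```python
import subprocess

# Assumption: this is the provider name Event Viewer shows for WHEA-Logger entries.
WHEA_PROVIDER = "Microsoft-Windows-WHEA-Logger"


def build_whea_query(max_events=20):
    """Build the wevtutil command that lists the most recent WHEA-Logger events."""
    xpath = f"*[System[Provider[@Name='{WHEA_PROVIDER}']]]"
    return [
        "wevtutil", "qe", "System",  # query the System event log
        f"/q:{xpath}",               # keep only WHEA-Logger events
        "/f:text",                   # human-readable output
        f"/c:{max_events}",          # limit the number of events returned
        "/rd:true",                  # newest events first
    ]


def recent_whea_events(max_events=20):
    """Run the query (Windows only) and return the raw text output."""
    result = subprocess.run(
        build_whea_query(max_events),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    print(recent_whea_events())
```

Run it right after one of the sudden reboots and include the output along with your hardware details and whether HWiNFO was running.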
Thank you all for your time.
No - I don't think it's related
I can easily reproduce the WHEA error and reboot without HWiNFO running in the background.
Well, take a look at the link to the HWiNFO support forums I shared in the OP, where a group of users and I are tracking down the issue with the creator of the tool. So far it seems to be related to the GPU sensors of our graphics cards (Navi 21), and we're currently testing a new beta.
Anyone interested, please take a look here: https://www.hwinfo.com/forum/threads/is-hwinfo-causing-the-dreaded-whea-logger-event-id-xx-cache-hie...
I have an Asus Dark Hero motherboard with a Ryzen 5 3600, G.Skill 4x16GB @ 3600 and an RX 5700 XT, and here is what happened: on the very first boot the PC froze and I had to unplug it to get it working again. Since installing Windows I get random WHEA errors while the PC is idle.
I decided to update to the latest BIOS, and immediately after the BIOS flash finished the PC restarted automatically a few times. After that the screen went black and the motherboard showed a 00 error code, which means a CPU error according to the Asus website...
For now I've set PBO to defaults and RAM voltage to Auto instead of 1.35 V, which is the default with the 3600 MHz XMP profile loaded.
I don't use HWiNFO, I don't even use a Radeon video card, and I continue to get BSODs with WHEA errors. All the people I have seen with these problems have different memory, motherboards, BIOS versions, programs, PSUs, etc. The only common factor is a 5000-series processor...
Aha, it's an AMD GPU issue. Man, these WHEA errors are a pain in the butt. I have Nvidia so didn't see it.
Yep, that ended up being the handicap here. Fixed now with the latest HWiNFO beta, though 🙂
jackalito, good job finding this issue with WHEA and GPUs! It's really awesome. I hope our CPU-related issues will be resolved soon too.
Thank you, Ivan!
Hoping the rest of the issues get ironed out soon with newer AGESA/BIOS firmware revisions and/or chipset drivers.