Computer Type: Desktop
GPU: Radeon RX 5700XT
CPU: Ryzen 5 3600
Motherboard: MSI B450 A Pro Max
RAM: GSkill Ripjaws 8GB X2 (16GB in total)
PSU: Thermaltake Smart RGB 700W
Case: Midtower with 1 stock fan
Operating System & Version: Windows 10 Pro Version 10.0.19041
GPU Drivers: Radeon Software (Adrenalin) 20.4.2
Chipset Drivers: AMD Chipset Software 220.127.116.112
Hard Disk: SSD - Crucial 1TB M2 Nvme
Background Applications: Happens irrespective of which applications are running
Description of Original Problem: My newly built PC keeps restarting randomly. Sometimes it will run for 6-10 hours without any issue. Other times it will simply restart when I open an application (browser, tabs, etc.) or a game, and sometimes it just restarts at will. Every time it restarts, the event logger logs the error below:
"A fatal hardware error has occurred.
Reported by component: Processor Core Error Source: Machine Check Exception Error Type: Cache Hierarchy Error Processor APIC ID: 11
The details view of this entry contains further information."
Troubleshooting: I have updated all the drivers, then deleted, reinstalled, and updated them again. I checked that the CPU fan, the GPU, the RAM, and everything else are properly attached; all of them seem perfectly fitted. I used various software to test the CPU, GPU, RAM, etc., and all came back with good results. I also ran a memory test and a DISM check. Both completed without any errors.
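For anyone who wants to compare these entries across crashes, here is a minimal Python sketch that pulls the labelled fields out of the event text quoted above. It assumes the message uses exactly the four labels shown in that quote; the function name `parse_whea_message` is made up for illustration, and other WHEA events may carry different fields.

```python
import re

# Field labels exactly as they appear in the event text quoted in this post.
FIELDS = (
    "Reported by component",
    "Error Source",
    "Error Type",
    "Processor APIC ID",
)


def parse_whea_message(text):
    """Return a dict of the labelled fields found in a WHEA-Logger message."""
    labels = "|".join(re.escape(label) for label in FIELDS)
    # A value runs from its label up to the next label or the end of the line.
    pattern = re.compile(
        rf"({labels}):\s*(.*?)(?=\s*(?:{labels}):|\s*$)", re.M
    )
    result = {label: value.strip() for label, value in pattern.findall(text)}
    if "Processor APIC ID" in result:
        result["Processor APIC ID"] = int(result["Processor APIC ID"])
    return result


sample = (
    "A fatal hardware error has occurred.\n"
    "Reported by component: Processor Core Error Source: Machine Check "
    "Exception Error Type: Cache Hierarchy Error Processor APIC ID: 11\n"
    "The details view of this entry contains further information."
)

print(parse_whea_message(sample))
```

Keeping the APIC ID as an integer makes it easy to check later whether the same logical processor shows up in every crash or whether the IDs are scattered.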
@Cmdr-ZiN OK, hopefully I got the reply/tagging parts right!
I had the same DOCP at 3600 MHz and I am using 16 GB of Ballistix E-die RAM. It posted fine initially and I had no issues with the RAM. I also have NVMe drives, but I believe they aren't Gen 4 (HP EX950, Gen3 x4, NVMe 1.3). We have similar types of builds, it seems.
I had a couple days where it would fail about 1-3x/hour and after I updated the BIOS to the beta 3801 version, it calmed down a bit. No restarts yesterday, but 2 already in the past 30 minutes today.
I also removed the network cable, then restarted the computer, went to desktop, shut it down, and then reinserted the network cable. It booted up fine, but then also randomly restarted after about 10 minutes. Anytime the computer has less work to do and goes into idle, I run the risk of a restart. It's odd.
@liquidwater yep, we're pretty much all in the same boat. I have seen this affecting Intel systems and older systems; this has happened before. The problem is that these crashes could all have different causes, so what solves it for me might not solve it for you. However, considering the similar circumstances, it's possible there's something similar affecting us.
All we know is we're getting random power cuts or crashes. Some have replaced their CPU and fixed it, but did it reoccur without them letting us know? I've seen that for some, a CPU replacement hasn't fixed it.
I've yet to pin down a common variable.
@Cmdr-ZiN - What I know for sure is that disabling CPB and PBO "solves" the problem. This is our common variable. What drives me crazy is AMD's silence about this.
We have an ocean of threads and reports about these WHEA crashes, and nothing is said.
@Electric_Squall How do we make sure that CPB and PBO are both disabled? Is it through Ryzen Master? I only just downloaded that to check my temperatures, but haven't made any changes to it.
You can turn off Core Performance Boost (CPB) and PBO in your BIOS.
How it looks will be specific to the board brand,
but it should be under CPU overclocking
or in the "advanced" AMD overclocking menu.
I also have the WHEA 18 error and restart problem on my 3700X.
The settings I use to "fix" the problem:
1. Disable C-states
2. Set PCIe to 3.0
3. Set Power Supply Idle Control to Typical Current Idle.
Everything else is on auto, with CPB and PBO both enabled.
I wonder whether the CPU manufacture date has anything to do with it. My CPU was made in week 48 of 2019; do you guys know yours? I only upgraded my system with this 3700X in March 2021, and I don't know why I got "old" Ryzen 7 3700X stock that was made in November 2019.
MSI Tomahawk wifi V15 bios
3700X with Deepcool AS500 plus Cooler
Kingston HyperX 16GB X2 3600
Radeon R9 Fury X
Corsair MP510 nvme SSD
Seasonic gold 750X PSU
It has nothing to do with the date, since this error spans generations: both Zen 2 and Zen 3 have it, though it is more pronounced on Zen 3.
@dskhury it's a completely generic error. You should search for Event ID 6008, as you get this every time Windows fails to shut down correctly; you don't always get the WHEA error.
Anything can cause a sudden power cut. I've seen similar on Intel, and there was a very similar issue with the Ryzen 3000 series that was fixed with a BIOS update.
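If clicking through Event Viewer gets tedious, the search can also be done from a script. Below is a rough Python sketch that builds a `wevtutil` query for the System log. The function name `build_wevtutil_query` is made up for illustration, and including ID 18 alongside 6008 is my own assumption about what is worth pulling; other providers can also log ID 18, so check the source of each hit.

```python
import platform
import subprocess


def build_wevtutil_query(event_ids, max_events=20):
    """Build a wevtutil command that lists recent System events by ID."""
    id_filter = " or ".join(f"EventID={int(i)}" for i in event_ids)
    xpath = f"*[System[({id_filter})]]"
    return [
        "wevtutil", "qe", "System",  # query events from the System log
        f"/q:{xpath}",               # XPath filter on the event ID
        f"/c:{max_events}",          # limit the number of events returned
        "/rd:true",                  # newest events first
        "/f:text",                   # human-readable text output
    ]


if __name__ == "__main__":
    command = build_wevtutil_query([6008, 18])
    print(" ".join(command))
    # Only attempt to run the query on Windows, where wevtutil exists.
    if platform.system() == "Windows":
        subprocess.run(command, check=False)
```

Comparing the timestamps of the 6008 entries against the WHEA entries shows which unexpected shutdowns came with a hardware error record and which did not.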
I spoke to the store I bought mine from. They'd have to reproduce the issue to replace it, and I'm not sure I can reproduce it within a week every time. They hadn't had any returns for WHEA errors or random restarts; the reasons they listed for returns were failure to POST and rank B memory failures for the most part. However, I suspect this issue often goes undiagnosed, or people find it easier to RMA with AMD.
I was trying to work out if I could reproduce it reliably. I was running memtest86+; the first run had no errors and was almost complete, but the machine had a power cut before I came back to read the results. This means it's not Windows related. I ran the test two more times, fine both times, and left it on the results screen overnight to see if it would power cut again, but it didn't.
I booted back into Windows, but my secondary HDD was fried. If you're testing this issue, it might be an idea to unplug your HDDs.
I do wonder if a dying HDD was the cause all along, rather than a victim of a dodgy CPU, motherboard or PSU.
I don't know what happened over the past two days, but my computer has just been restarting like crazy. Sometimes during the restart process, before it even reaches the login screen, it'll just restart again.
My PBO was set on auto. As I understand it, when it's set to auto, it's actually disabled? In any case, I did set it to disabled.
Now I have to find that power supply "typical" setting, which I've been unable to locate in the BIOS.
Turns out that the Asus BIOS has a search feature, so it was really easy to find the PBO, CPB, and power supply typical settings. I have changed all of them as recommended in the threads. Thus far, after just recently changing the power supply setting, I have not even been able to load into the desktop. I got close one time, but then the computer restarted by itself. This is getting really frustrating, and I think my only recourse at this point would probably be an RMA for the processor.
I'm going to undo the typical setting for the power supply. I think that has made things worse. I haven't been able to get to the desktop and stay there for more than 5 to 10 seconds before the computer restarts. This is unusable. Not to mention, it takes two or three attempts for the computer just to get to the Windows login page.
Let's see what happens when I set the power supply setting back to Auto from Typical.
I don't even think I could set up an RMA on this computer because by the time I even get to the RMA web page to fill in the information on AMD site, the computer will have restarted by then.
I got an error when restarting one of the times about Windows not being able to boot up properly to use Windows from Volume 2 or Volume 6. Not sure how I got two volumes of Windows? I have two NVME drives, and I did use Acronis to clone my original NVME (1tb) and put it on my new NVME (2tb). Guess that created two volumes of it? I am now using a different volume and so far, computer seems to be OK, although I do get a generic Windows error when booting up the computer. Granted, I haven't even used the computer for more than 10 minutes lol.
Edit: I spoke too soon and it restarted again.
I was getting so frustrated that I just set up the computer to reset Windows so I could wipe everything and start again.
The computer restarted due to a WHEA error in the middle of it, so I couldn't even wipe my computer with the built-in tool (the blue Windows recovery screen).
I had similar issues. Couldn't even reinstall Windows 10 due to BSODs during the reinstall process.
I was getting everything from WHEA errors and Kernel Mode Trap to errors related to ntfs.sys, netio.sys, DPC_WATCHDOG_VIOLATION, KERNEL_SECURITY_CHECK_FAILURE, and KERNEL_AUTO_BOOST_INVALID_LOCK_RELEASE (what failed: win32kbase.sys).
The system had previously run perfectly from 7th Sept 2019 until around Dec 2020 or Jan 2021.
Two and a bit weeks ago I RMA'd, and I got my replacement 3 days ago. The system has been perfectly stable since, running on Optimized Settings in BIOS with XMP enabled.
If it was me, I'd find a way to RMA, even if you have to do the form on your phone or tablet.
Out of curiosity. What batch number have people got with their flaky Ryzens?
My problematic Ryzen was:
(manufactured 20th week of 2019)
My Ryzen 7 3700X is :
What is the batch number of the new Ryzen 5 3600 you got from the RMA?
the new Ryzen is from batch:
14th Week (early april) of 2021
My R9 5900X processor is the BG2112SUS.
Moving on to the staff to verify theirs.
@liquidwater - Sorry for the long late reply, my friend. I was very busy last week.
This is how I disable both of them:
Hi, I am a new user. These days I'm experiencing random reboots triggered by the WHEA-Logger 18 error as well (preceded by a WHEA-Logger 19 warning, a corrected hardware error).
I first got this error when I changed CPU from an R5 3600 to a 5600X. After I disabled PBO and set a manual CPU multiplier at the stock boost frequency with auto voltages, my CPU ran fine (not exceeding 1.28 V) for some months, even when I changed GPU (from a 5700 XT to a 6700 XT) and PSU (from a Corsair 600W 80+ Bronze to a fully modular Seasonic FX 750W). Then, when the new GPU driver 21.5.2 was released, I installed it, and one day the PC fell into the Windows automatic repair loop. After restoring my system I only lost programs on C:, which I quickly reinstalled, and everything seemed to run fine. After some days I again had WHEA-Logger 18 crashes preceded by WHEA 19 warnings (with the same RAM + CPU config that was running fine before the Windows restore). After reading ALL of the comments on this topic (and others), I did the following:
-Uninstalled and reinstalled the latest chipset drivers
-Updated graphics drivers
-Uninstalled Sound Blaster drivers (I sometimes hear a crackling sound in my headphones when the system is about to randomly reboot)
-Uninstalled Afterburner, HWiNFO and RivaTuner
-Changed the Windows power plan
-Tried PBO again with Curve Optimizer at negative 20
-Increased the memory voltage
-Disabled the spread spectrum feature in BIOS
-Set Typical Current Idle in BIOS
-Disabled Resizable BAR and Above 4G Decoding (SAM)
-Set PCIe to Gen3
-Raised the CPU voltage override to around 1.29 V
I still have WHEAs.
At this moment the only things left to do are changing the RAM frequency with new timings, disabling CPB (now on Auto) and PBO again, and doing another clean install of Windows. Maybe I can also change the CPU power cable on the motherboard in the process (just to be sure).
If none of these last steps fixes the problem, I really wouldn't know what else to do.
Hey, this is very helpful! I ended up finding it by doing a search within the bios. Didn't even know that was a possibility since I was completely failing to locate those within the menus.
The plan now is actually to send the CPU to AMD. That way they can take a look at it and hopefully give me a new processor. Unfortunately I did not keep the clamshell, so now I have to see if I can find one at a computer shop or maybe even buy one.
Even getting the MSinfo32 and dxdiag information that AMD wanted as part of their RMA process was such a hassle because before the complete scans would be done, the computer would restart.
For anyone dealing with this issue who also runs an AMD GPU on 2020 drivers: if you use the system for cryptocurrency mining (particularly NiceHash), avoid running NiceHash after the next crash and see if the issue returns.
Also, if you have HWiNFO64 set to launch at startup, it might be worth disabling that and only running it for short periods, as the combination of those two apps was causing frequent cache hierarchy errors for me in early May on an R5 3600 + RX 6700 XT.
@Electric_Squall I've tried with both off and still got the issue, sure it took a week running on idle but I still got it. What works for one may not work for all.
I doubt it's any kind of cover up, probably just less widespread or not on their radar or a bunch of issues they haven't yet narrowed down.
However, in all my years I've never seen such an elusive issue.
This goes out to ALL with ANY WHEA 18 "APIC ID x" crash (the ID identifies a core/thread):
this is 100% CPU voltage. You can bump the core voltage offset up one tick and test.
Running this tool (Core Cycler) will help tell you which cores are stable and which are not.
It's hard to believe that a 5600X capable of running 2000 FCLK / 4000 MHz, 100% stable daily, with 4x8 3200C14 B-die sticks
has issues with the "best two cores" in the system.
Just want to say that craxton may be correct here, just ran the core cycler tool and indeed one of my cores fails after some iterations. Pretty easy to reproduce, which is way better than just "random crashes" at least now I have a way to force something to happen.
I suspect Afterburner is causing the crashes.
Hey guys, I've just bought a new motherboard/CPU/RAM and I've been having this issue as well. I've got some dump files; if someone experienced could read them and identify the problem from them, that would be great. I'm also getting unusually high temps (it can reach 65 at idle and 90 while gaming on Warzone), which I've bought an aftermarket cooler to try and fix. I'm thinking of sending the CPU back, but I want to make sure it's not the motherboard or the RAM causing the issue.
RAM: 16GB DDR4 G.Skill Ripjaws 3200 MHz
@comp4cty After I contacted AMD tech support, they advised me to try the full list of troubleshooting steps below. For me it was the Power Supply Idle Control setting; maybe one of the other steps will work for you:
Update the system BIOS to latest version available from motherboard manufacturer (refer to motherboard user manual for instructions on updating the BIOS).
Set the BIOS to use factory default settings / optimized default settings (refer to motherboard user manual for instructions on restoring BIOS default settings).
In the BIOS, locate the Power Supply Idle Control option and set it to Typical (this option should be available in the Advanced section of the BIOS).
Update Windows to the latest version and build via Windows Update. For instructions, refer to article.
Update to latest chipset driver from AMD. For instructions, refer to article.
In Windows Control Panel, select Power Options and choose the Balanced (recommended) power plan. In Windows Settings, select Power & sleep and set the Performance and Energy slider to the middle.
Disable non-Microsoft services and startup items using the System Configuration Tool.
Reseat CPU, RAM, and all PSU power connections (end-to-end for modular PSUs). For more instructions, refer to the product’s user manual.
Verify RAM sticks are installed in the correct DIMM slots (for socket AM4 motherboards with 4 DIMM slots, use A2 & B2).
Contact AMD if you're still having issues.
Can anyone suffering from WHEA errors try something for me, based on a hunch? In the advanced power settings, could you change the minimum processor power to 10%, or something higher than 5%?
Settings > System > Power & Sleep > Additional Power Settings > Change Plan Settings > Change Advanced Power Settings > Minimum Processor State. Set this to 10%
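The same change can probably be made from a command line instead of clicking through the menus. Here is a small Python sketch that builds the `powercfg` commands for it. It assumes the standard aliases `SCHEME_CURRENT`, `SUB_PROCESSOR` and `PROCTHROTTLEMIN` (you can confirm them with `powercfg /aliases`); the helper name is made up, and it only touches the plugged-in (AC) value since these are desktops.

```python
import platform
import subprocess


def build_min_processor_state_commands(percent):
    """Build powercfg commands that set the minimum processor state (AC)."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return [
        # Write the new value into the currently active power plan.
        ["powercfg", "/setacvalueindex", "SCHEME_CURRENT",
         "SUB_PROCESSOR", "PROCTHROTTLEMIN", str(percent)],
        # Re-activate the plan so the change takes effect.
        ["powercfg", "/setactive", "SCHEME_CURRENT"],
    ]


if __name__ == "__main__":
    for command in build_min_processor_state_commands(10):
        print(" ".join(command))
        # Only run on Windows, from an elevated prompt if required.
        if platform.system() == "Windows":
            subprocess.run(command, check=False)
```

This is only a convenience for testing the hunch; setting the value through the Settings path above does the same thing.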
I'm unsure who has responded to me, and unsure who actually took a moment to read what was said.
If your chip has WHEA 18 crashes on STOCK settings then you SHOULD RMA that chip, but IF you're crashing with a 200 MHz AUTO OC under PBO then that's normal; hitting those frequency ranges with your chip is not "guaranteed".
Personally, I figured out I have one core causing all my stress, which is the WORST core in my 5600X system.
I can run all cores at -30 CO with a +50 mV offset fine, but not through Core Cycler: 3 cores pass, 3 fail.
Dialing down to -20 with the same offset, only 1 fails, and it's the one that CTR and the test inside Core Cycler report as my worst core, which needs a positive offset. Anyhow, RUN A POSITIVE .02xx offset on core voltage without turning off CPPC, DF C-states, etc.
The processor needs these to "idle"; otherwise it's constantly in a powered-up state, using more voltage than it should. Telling people to turn these things off isn't the best idea. DF states, sure, but on most boards they're grouped under the same name even though they are different things. CPPC Preferred Cores: LEAVE IT ON!!!
Turn off any auto overboost, turn off XFR ENHANCE (I forget which board has it) and set a positive offset to voltage (50 mV is .0500 V, so don't set .50, that's WAY too much lol). Also, don't use curve settings on 5000 series chips; if you're crashing while PBO is on, then this is why. It can also be your IOD, CCD, CLDO VDDP, or CPU VDDP voltages causing random crashes. With the newest AGESA code, CPU VDDP should be automatically set to .900 V, which can be seen inside Ryzen Master provided you have the proper setting turned on to view it. You won't be able to change it inside Ryzen Master, and on MSI boards you can't change CPU_VDDP, period!!! (Other boards sure can.) But leave it at 900 mV; it's best set there.
Now, for those crashing before getting into Windows, I do believe an RMA is what you should do, as that's not something voltage should be thrown at, provided you haven't run an ALL-CORE overclock at 1.45 V for a year and only now started getting these crashes. WHEA-18 is voltage/CPU crashing; the APIC ID should tell you which core is crashing, to be frank about it. WHEA-19 is about 80% related to RAM overclock settings: running your FCLK too high, unstable timings, or IOD, CCD, etc. being off by some margin.
Again, it's been confirmed by LOADS of people who know what's inside the AGESA code, people like Yuri (1usmus), that WHEA 18 is core-voltage related. Give that mV offset a shot: turn off any auto boost settings, leave PBO on "motherboard", set a 2x scalar for good margin and see what you get. LEAVE CPPC PREFERRED CORES on! I'm sure you don't want your chip running constantly and never entering power-down mode? Especially when you're simply using Chrome, the other cores will stay active instead....
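Since the mV versus V mix-up is an easy mistake to make in the BIOS, here is a tiny Python sketch of the unit conversion mentioned above (50 mV is 0.0500 V, not 0.50 V). The 100 mV sanity limit in it is purely my own illustrative assumption, not a figure from AMD or any board vendor.

```python
# Hypothetical sanity limit for illustration; not an AMD or vendor figure.
MAX_REASONABLE_OFFSET_MV = 100


def offset_mv_to_volts(offset_mv):
    """Convert a core voltage offset from millivolts to volts."""
    if abs(offset_mv) > MAX_REASONABLE_OFFSET_MV:
        raise ValueError(
            f"{offset_mv} mV looks like a unit mix-up; "
            "did you mean a smaller offset?"
        )
    return offset_mv / 1000.0


if __name__ == "__main__":
    # A +50 mV offset is entered as 0.0500 V in a BIOS that takes volts.
    print(f"{offset_mv_to_volts(50):.4f} V")
```

Typing 0.50 into a field that expects volts would be a 500 mV offset, which is exactly the mistake the guard is meant to catch.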
@craxton I've been running mine at stock settings, however stock has PBO set to auto, but from what I read when PBO is set to auto then it's controlled by Ryzen Master and if it's not overclocked in Ryzen Master then it's off. I've never installed Ryzen Master so I assume it's off, would that be correct?
My Processor APIC ID is mostly 0 but also several other random ones.
I haven't had WHEA errors since May, pretty much running the PC constantly testing at idle for the most part which was when I was experiencing the issues. I also get event 6008 Windows didn't shutdown correctly and Event 41 Kernel Power, system stopped responding, crashed or lost power unexpectedly.
While I haven't been getting WHEA errors for a while, I've been still getting event 6008 and event 41 crashes. However I've not been able to reproduce even these since AGESA 18.104.22.168a, I hope it stays that way.
You seem to know your way around the event logs and what they mean. Is there anything I should be looking for?
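One thing that might be worth checking is whether the crashes keep naming the same APIC ID. Here is a short Python sketch that tallies the IDs across a set of WHEA event messages. The messages in it are hypothetical examples written for illustration, not real log entries, and `tally_apic_ids` is a made-up helper name.

```python
import re
from collections import Counter

APIC_ID_RE = re.compile(r"Processor APIC ID:\s*(\d+)")


def tally_apic_ids(messages):
    """Count how often each Processor APIC ID appears in the messages."""
    counts = Counter()
    for message in messages:
        match = APIC_ID_RE.search(message)
        if match:
            counts[int(match.group(1))] += 1
    return counts


# Hypothetical messages for illustration only, not taken from a real log.
example_messages = [
    "Error Type: Cache Hierarchy Error Processor APIC ID: 0",
    "Error Type: Cache Hierarchy Error Processor APIC ID: 0",
    "Error Type: Cache Hierarchy Error Processor APIC ID: 11",
    "Error Type: Cache Hierarchy Error Processor APIC ID: 0",
]

for apic_id, count in tally_apic_ids(example_messages).most_common():
    print(f"APIC ID {apic_id}: {count} event(s)")
```

If one ID dominates, that points at a single core or thread; if the IDs are spread evenly, the picture is less clear and nothing in this thread settles what that means.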
Crashed again after 7 days
Time to test the CPU in another machine.
So far so good after my CPU RMA! Only had it for a few days, but not a single random restart yet.
@liquidwater still too early to tell but I've been on a new 5800X for just over 4 days no issues. This one was made in Malaysia not China.
I've got a good feeling about it by the way it loads the cores but I think I'd need to leave it on for a couple of weeks straight before I come close to calling it solved.
Just an update on my "WHEA Logger Event ID 18" problem.
It turns out my restart and WHEA Logger Event ID 18 problem is caused by the ASUS Xonar D2X sound card...
I took the sound card out (the only PCIe add-on card in my system other than the RX 460 graphics card), cleared CMOS and loaded default settings in BIOS, then did a clean install of Windows with all the latest AMD chipset, VGA, LAN, WiFi and BT drivers, plus Windows Update. The system ran stable without any problem (no restart when fast skipping through YouTube videos).
Then I put the ASUS Xonar D2X sound card back in. While installing the ASUS beta driver, the system restarted in the middle of the driver installation. So I moved the sound card to another PCIe slot, and the driver installed fine, but the restart and WHEA-18 problem returned (when fast skipping through YouTube videos).
So now my system is running fine without the Xonar D2X (XMP on at 3600, PBO enabled).
Read the whole 24 pages and this is insane. I'm having the same issue where the system reboots randomly when gaming.
Currently using a Ryzen 5 3600 with an XFX 5700 XT and an X570 ASUS motherboard.
The only change so far is that I moved the GPU from a PCIe slot that was far down to the one closest to the CPU, since it's supposed to work better, and the crashes got worse.
Currently using two 16GB 2400 MHz RAM sticks; I don't know if that may cause it.
Not overclocking anything at all.
I have the WHEA 18 error. I have WHEA dumps and minidumps from Windows if anyone is so kind as to look at them. Getting an RMA is really hard here in Argentina.
Just an update: after working perfectly for a couple of months, my second CPU failed; it would POST but not boot Windows or Linux.
I'm on a third 5800X. It crashed within 24 hours and then again the next day.
I was on the latest BIOS, I updated to the latest chipset drivers and turned off PBO by setting it to disabled. Setting PBO to disabled should be the same as leaving it on auto and not enabling it in Ryzen master from what I've read.
Mine has been stable for over 5 days but that doesn't mean the issue is gone.
@FacundoG any RAM below 3200 MHz should be fine. It's hard to know what is going on; I haven't found anyone who has gotten to the bottom of it. Many people aren't affected by it, though.
I'd start by clearing your CMOS and making sure your BIOS, Chipset and Windows are up to date.
You could try disabling PBO, that didn't make any difference with my first CPU but worth a shot.
You might need to RMA in the end, Good Luck.
I actually thought that I needed faster RAM.
I have the latest drivers and just updated to a June BIOS, since I had a 2020 one. Hopefully that'll help.
I'm also installing the newest chipset drivers. Is the CMOS reset necessary? I've never done it and would prefer not to break anything lol
@FacundoG CMOS reset just resets all your BIOS settings to default. It can help troubleshoot BIOS issues. Sometimes there will be hidden changes you can't see and the best way to fix that and start back like it's a brand new is a CMOS reset.
There's really no risk if you did the initial setup in the first place, because you'll just have to do so again. Take some photos of your BIOS screens if you're not sure what settings you might need to set back. Mostly it will just work on default settings, but check your boot settings (UEFI, CSM and Secure Boot) and memory timings. Just save them for later reference.
Is there a known fix for this? I'm experiencing the same issue with smss.exe when locking/unlocking my PC.
@nuGeorge - Yes. Fixed clock. Unfortunately, no workarounds to avoid the error while using the boost features.