Hi, my PC keeps randomly restarting. Windows Event Viewer shows Event ID 41 with the message "The system has rebooted without cleanly shutting down first. This error could be caused if the system stopped responding, crashed, or lost power unexpectedly."
My specs are
CPU: 5900x
Cooler: Dark Pro 4
GPU: 3080
MB: Asus B550-F Gaming
RAM: G.Skill 32 GB 3600 MHz
SSD: 1TB 980 pro
Yeah, it’s a faulty CPU. This person on Reddit had exactly the same situation as I did: Kernel-Power 41 and WHEA 18 with Cache Hierarchy errors and APIC IDs. He tried a fix using Ryzen Master's auto mode, where it handles the overclocking settings itself. It worked for him for a few days, but then the problem started happening again. He RMAed the CPU and the problem was fixed. I will do the same. Link ( https://www.reddit.com/r/Amd/comments/ot99cu/5950x_whea_error_and_random_restart_fix/?utm_source=sha... )
What PSU do you have? Make and model?
Hi, my PSU is "220-GA-0850-X1", EVGA SuperNOVA 850 GA, 80 Plus Gold 850W, Fully Modular
And this PC is less than 6 months old
ID 41 is under the "Critical" tab. Also check the "Error" tab on the left and see which error matches the time of your ID 41 critical error. Share that if you can.
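The time matching described above can be sketched in a few lines of Python. This is only an illustration: the event list, its timestamps, and the function name `errors_near_critical` are made-up assumptions, standing in for entries copied by hand from Event Viewer.

```python
from datetime import datetime, timedelta

# Hypothetical sample entries (timestamps invented for illustration).
events = [
    {"time": datetime(2022, 6, 1, 20, 14, 58), "level": "Error",
     "source": "WHEA-Logger", "id": 18},
    {"time": datetime(2022, 6, 1, 20, 15, 5), "level": "Critical",
     "source": "Kernel-Power", "id": 41},
    {"time": datetime(2022, 6, 2, 9, 30, 0), "level": "Error",
     "source": "Service Control Manager", "id": 7000},
]

def errors_near_critical(events, window=timedelta(minutes=2)):
    """Map each Kernel-Power 41 timestamp to the Error events within `window` of it."""
    criticals = [e for e in events
                 if e["source"] == "Kernel-Power" and e["id"] == 41]
    errors = [e for e in events if e["level"] == "Error"]
    return {
        c["time"]: [e for e in errors if abs(e["time"] - c["time"]) <= window]
        for c in criticals
    }

matches = errors_near_critical(events)
for when, nearby in matches.items():
    for e in nearby:
        print(when, "->", e["source"], e["id"])
```

The two-minute window is an arbitrary choice; widen it if the entries are logged further apart on your system.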
It seems to be the error "Event ID 18, WHEA-Logger". After the last critical error I didn't go back into Windows; I just shut the computer down. Then I went into the BIOS, because someone told me to turn off C-States, which I did, and so far there have been no more restarts. So I was also wondering whether there is any problem with turning off C-States?
With C-States enabled, your CPU throttles its power consumption down when idle by disabling parts that aren't being used. With C-States disabled, your CPU runs at full power all the time, with all features running whether you are using them or not, so it doesn't save any power.
Make sure your motherboard's BIOS is up to date and that you are running the latest AM4 chipset drivers from AMD.com
Thanks for posting that info. I have WHEA error 18 as well, and it can be caused by many different things. Mine has so far been a BIOS issue: newer BIOS versions generate WHEA errors for me, so I'm forced to use a very old BIOS to avoid them. Usually it is related to RAM and Infinity Fabric stability when it occurs under high load, such as while playing games. And it is usually related to power saving features when the crash happens as you go from load to idle, such as when closing a game, at the end of a match, or when closing the browser after watching YouTube videos.
You can open the WHEA error by double-clicking it and check whether it is Internal Bus related or a Cache Hierarchy error, which tends to be the worse of the two.
For me it says "
A fatal hardware error has occurred.
Reported by component: Processor Core
Error Source: Machine Check Exception
Error Type: Cache Hierarchy Error
Processor APIC ID: 2
The details view of this entry contains further information."
Should I be worried, and what do you think I should do? That said, the actual critical ID 41 and WHEA error 18 have stopped for me since I disabled C-States. No random restarts or anything, so I'm not sure.
Also, what do you think I can do to fix the restart problem so that I don't need to keep C-States disabled anymore?
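For anyone collecting several of these WHEA entries, here is a minimal Python sketch that pulls the "Key: value" lines out of the text quoted above. The function name `parse_whea` is just an illustrative choice, and the sketch assumes the entry is pasted in as plain text in the same layout as the quote.

```python
import re

# Text copied from the WHEA-Logger entry quoted in the post above.
whea_text = """A fatal hardware error has occurred.
Reported by component: Processor Core
Error Source: Machine Check Exception
Error Type: Cache Hierarchy Error
Processor APIC ID: 2"""

def parse_whea(text):
    """Collect the 'Key: value' lines of a WHEA entry into a dict."""
    fields = {}
    for line in text.splitlines():
        m = re.match(r"\s*([^:]+):\s*(.+)", line)
        if m:  # lines without a colon (the first sentence) are skipped
            fields[m.group(1).strip()] = m.group(2).strip()
    return fields

info = parse_whea(whea_text)
print(info["Error Type"], "on APIC ID", info["Processor APIC ID"])
```

The two fields that matter most for this thread are the error type and the APIC ID, since they show whether the same core keeps reporting the fault.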
If disabling C-States really solved your WHEA ERRORS I would keep that disabled.
If it really was C-State related, it may be your motherboard BIOS telling your CPU to use too little voltage in idle states. This could be fixed by a BIOS update if you are running a very old BIOS. Check which BIOS version you are using, and look on your motherboard's website for a newer one. If that does not solve your issue, then leaving C-States disabled isn't a problem at all unless you really care about saving a few dollars on the power bill, and the difference would be minor. It has no impact on the safety of the CPU. Hardcore overclockers usually disable C-States anyway, because they affect overall overclock stability.
Oh I see, yeah, I was worried that it might affect my CPU badly. So I will just keep it disabled. I checked my AMD chipset driver, which is at "4.07.13.2243", while the latest on AMD's website for my B550 chipset is "4.11.15.342". So I will update the chipset driver. My BIOS is version "2803" with build date "4/27/2022". There is a newer one, version "2806", so I might update the BIOS as well. And thank you very much for all the help.
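When comparing dotted version strings like the two chipset driver versions above, it is safer to compare them field by field as numbers than as plain text. A minimal sketch, where `parse_version` is a hypothetical helper name:

```python
def parse_version(v):
    """Turn '4.07.13.2243' into (4, 7, 13, 2243) so each field compares as a number."""
    return tuple(int(part) for part in v.split("."))

installed = "4.07.13.2243"   # driver version reported in the post above
latest = "4.11.15.342"       # version listed on AMD's site, per the post above

needs_update = parse_version(installed) < parse_version(latest)
print("Update available:", needs_update)
```

Python compares tuples left to right, so the second field (7 versus 11) decides the result here, even though the last field of the installed version is the larger number.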
Be aware that in the past I had BIOS version 2802, which never gave me any issues. I updated to the latest, "4408", and got WHEA errors, because for some reason this BIOS makes my RAM unstable if I use the XMP profile or any speed above 2133 MHz. So I downgraded to BIOS version 3603 and the WHEA errors are gone. If I were you, I wouldn't update the BIOS unless there is a real need to.
I was thinking this as well. Yeah, I won't update the BIOS, since there really isn't any problem and the one problem, the restarts, was fixed with C-States disabled. Yeah, I will just let it be.
PS: Mark your answer about C-States as "this solved the issue" so that people who are in the same boat as you were can find it here in the forum!
Yo, hi again. Yeah, disabling C-States didn't fix it. It still restarts, just a lot less frequently. It's most likely a hardware problem. Do you know any ways to test things like memory, power supply, etc., to figure out what the issue is?
Thank you for your help.
Can you try setting all RAM settings to AUTO (let the RAM run at 2133 MHz) and report back whether the issue is gone?
You can use MemTest86 for testing RAM,
and OCCT and Prime95 for overall stability.
Yes, right now I am running memtest86 and I will report back the results. Then I will do the prime95 and OCCT.
Alright, I am finally done with the first one, MemTest86. It reported no errors, so the RAM is good. I will do the rest of the testing tomorrow. The test ran for over 5 hours with no restart or anything, so it's definitely not the RAM, given the clean MemTest86 result and no restart. So it has to be the PSU, MB, CPU, or my SSD. For the PSU, I remember running the 30-minute OCCT Power test before, and it got through without a restart. I might do it again just in case, but I don't think that's it. Besides the motherboard and CPU, the other candidate is the SSD. I think the SSD is the problem, because I ran MemTest86 from a USB flash drive and got no restart. Also, I did a fresh install of Windows, but the sfc/DISM commands said there were corrupt files, and I don't think there should be any after a complete reinstall of Windows 11 from USB. So I want to test the SSD as well as the CPU/MB (I will show all the temps after running Prime95/OCCT). MemTest86 did show temps. Also, all these parts are completely new.
Don't get me wrong, it can still be RAM related. The fact that your RAM sticks aren't faulty doesn't mean they can't have stability issues when working alongside other hardware components such as the CPU or GPU. I can't tell from your screenshot what clock the test ran at. If you had your RAM set to AUTO in the BIOS, then the test ran at 2133 MHz and not 3600 MHz, as it would with the XMP profile enabled or set manually. XMP would also raise your RAM voltage and tighten the timings. Test games and other software with the RAM at AUTO (2133 MHz) and see whether you get reboots. If you don't, enable XMP again so the RAM runs at 3600 MHz, test the same games and software, and see if it reboots. That's the best first step for troubleshooting WHEA UNCORRECTABLE ERROR ID 18.
Alright, will do this testing. I will just run games with XMP enabled and with it disabled.
It restarted with XMP, so I will try it without XMP. Also, there's no BSOD or anything, just a straight-up black screen and then Windows restarts. No minidump files either. I have automatic restart turned off to see if I can get a code, but it still goes from black screen to restart.
You can check Event Viewer, but you already shared the WHEA errors previously.
OK, now I disabled XMP and it restarted again. So it restarts with or without XMP.
Can you tell me whether it restarts while gaming (under load), or randomly, even when just using the browser or Windows?
Both, it’s truly random. I was playing a game: restart. I was watching a YouTube vid in a browser: restart. Just sitting idle in Windows with no browser, games, or anything: restart.
Are you running windows 10 or 11?
Can you please tell me if you have updated your BIOS lately?
Also, could you share some screenshots of HWiNFO64 (in "sensors only" mode) showing your CPU voltage and clocks after using your computer a bit? Try opening a game, playing for 2 minutes, and closing it, then take the screenshots.
I also need you to go to Event Viewer, open the WHEA errors one by one, and tell me each "Processor APIC ID: XX", where XX is the number I need. Please share all the numbers you find in these WHEA errors. They should rotate between a few values; they shouldn't all be different.
Temps and clocks HWINFO
Thanks for posting the info. As we can see in your photos, your Vcore is quite low, which makes sense given that your peak effective clocks aren't going higher than 3.8 GHz. Did you disable CPB before doing this testing? That would explain the lower Vcore. If you did not disable CPB, then there is something wrong with the voltage: we should see a stock range from 1.1 to 1.5 V after playing some games, and your photo shows 1.075 V max, which is very low. My guess is that you have disabled CPB in the BIOS, forcing your CPU to work at stock clocks (no boost). I need you to confirm this. If not, then the issue could be your motherboard undervolting your CPU for some reason.
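The voltage check described above boils down to comparing the highest Vcore reading against the quoted range. A rough Python sketch follows; the sample readings are hypothetical apart from the 1.075 V maximum mentioned in the post, and the 1.1 to 1.5 V range is taken from the post rather than from any official specification.

```python
# Range quoted in the post above for a boosting CPU after gaming (an assumption here).
EXPECTED_MIN_V, EXPECTED_MAX_V = 1.1, 1.5

def vcore_looks_low(max_vcore, expected_min=EXPECTED_MIN_V):
    """True when the highest observed Vcore never reaches the expected range."""
    return max_vcore < expected_min

# Hypothetical readings; only the 1.075 V peak comes from the thread.
samples = [0.950, 1.020, 1.075]
peak = max(samples)
print(f"Peak Vcore {peak:.3f} V, looks low: {vcore_looks_low(peak)}")
```

A low peak is only a hint: as the post says, it is expected if Core Performance Boost is disabled.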
The only APIC ID numbers that came up are 2 and 0.
Out of 23 WHEA errors in total, 3 of them are APIC ID: 0
and the remaining 20 are APIC ID: 2
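The tally above can be reproduced with a short Python sketch, which is handy if more WHEA entries show up later. The list is built directly from the counts reported in the post (3 entries with APIC ID 0 and 20 with APIC ID 2).

```python
from collections import Counter

# APIC IDs from the 23 WHEA entries reported above.
apic_ids = [0] * 3 + [2] * 20

counts = Counter(apic_ids)
total = len(apic_ids)
for apic_id, n in counts.most_common():
    print(f"APIC ID {apic_id}: {n} of {total} ({n / total:.0%})")
```

Seeing the errors concentrate on one or two IDs, rather than spreading across all of them, is what the earlier post asked to check.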
This will help us try a temporary fix, but first I need you to tell me whether you had CPB enabled or not when you took the photos above.
What’s CPB? But no, I don’t think I have it disabled. The only settings I changed are DOCP on and C-States disabled.
Core Performance Boost, did you disable that in BIOS?
When in the BIOS, please take a screenshot of your CPU voltage.
One more thing, not sure if it affects anything: for the HWiNFO screenshots I ran the OCCT Power test for 30 minutes rather than a game. It put both the CPU and GPU at 100% utilization.
Yes, but your Vcore is too low. A max Vcore of 1.075 V is not something I usually see on 5800X/5900X CPUs. If you have Core Performance Boost disabled, then it may make sense to see such a low voltage. However, if that's not the case, then something is wrong with the voltage, and it could be the cause of your reboots. To make sure, please leave HWiNFO64 running in the background in "sensors only" mode, open a game, play for 2 to 5 minutes, and close it. That will give us the light-load voltage and the peak voltage. Also take screenshots of the clock speeds.
After playing a game 3 min my voltage looks like this:
We need a screenshot of your BIOS voltage values. If the voltage is set to AUTO, as it should be, look at the top or corner of your BIOS screen to see the current voltage in real time and report it back here.
Your power draw seems OK, as you are hitting your CPU's default limits. But your Vcore looks wrong.
Here are the SS
It looks good. Please disable PBO Fmax Enhancer. Then open CPU-Z in Windows, click the arrow next to "Tools" at the bottom, and choose "Save Report as .TXT". Open the .TXT file and copy-paste the APIC entries here. It looks like this:
Here
Also, about the HWiNFO64 voltage while running a game: the computer keeps restarting. It doesn’t even stay on for 2-3 minutes, so I can’t really get that info.
Previously you posted the APIC IDs from your WHEA errors; they point to Core 0 and Core 2. Go into the BIOS and open Curve Optimizer, which is in one of the photos you shared. Set it to ADVANCED and use a positive (+) magnitude PER CORE. Set 10 for Core 0 and 10 for Core 2, and leave the rest at 0 (no change). Save, boot into Windows, test a game, and report back whether it still crashes or at least lasts longer without crashing.