Mainboard: MSI x570 Unify
Mainboard-BIOS: 7C35vA82 (Beta version)
CPU: Ryzen 5900x
RAM: Crucial Ballistix BL2K32G36C16U4B 3600 MHz, 64GB (32GB x2)
Drive: M.2 Samsung 970 Evo+ 1TB SSD
Graphics: SAPPHIRE Nitro+ Radeon RX 5700 XT
PSU: be quiet! Straight Power 11 750W Platinum
OS: Win 10 Pro (64bit) - all updates installed
Chipset driver: 2.9.28.509 (released 2020-11-09)
I first assembled the PC with a Ryzen 3800X a week ago because it was unclear if and when I would get the Ryzen 5900X I had ordered. It ran with the included AMD Wraith Prism CPU cooler for one week without any problems.
- Today I installed a Ryzen 5900x and a Scythe Fuma 2 CPU cooler.
- After 20 min the first crash/restart with the following entries in the Event Viewer: WHEA-Logger ID 18 and critical error Kernel-Power ID 41.
- Happens irregularly again and again, sometimes after minutes, sometimes longer: Windows freezes for a few seconds and then the PC reboots. It doesn't matter whether the system is under load or not.
- CPU temperature between 30 and 40 °C
- Updated to BIOS and chipset driver mentioned above: Problem still exists
- XMP Profile disabled (RAM on 2600 MHz): problem still exists
- CMOS Reset: Problem still exists
Either something is incompatible with the CPU, or the CPU itself is defective?
What should I do? This is really frustrating.
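For anyone who wants to check their own logs for the same pattern, here is a rough Python sketch that pairs each WHEA-Logger ID 18 entry with a Kernel-Power ID 41 entry logged close to it. It assumes the System log has already been exported from Event Viewer into (timestamp, source, event ID) tuples. The sample records, the function name `pair_crashes` and the 5-minute window are illustrative assumptions, not anything taken from Microsoft or AMD tooling.

```python
from datetime import datetime, timedelta

def pair_crashes(events, window=timedelta(minutes=5)):
    """Pair WHEA-Logger 18 entries with Kernel-Power 41 entries logged within `window`.

    Both entries are usually written around the reboot, so the order is not
    assumed; only the absolute time difference is compared.
    """
    whea = sorted(t for t, src, eid in events if src == "WHEA-Logger" and eid == 18)
    kernel_power = sorted(t for t, src, eid in events if src == "Kernel-Power" and eid == 41)
    pairs = []
    used = set()  # indexes of Kernel-Power entries that are already paired
    for w in whea:
        for i, k in enumerate(kernel_power):
            if i not in used and abs(k - w) <= window:
                pairs.append((w, k))
                used.add(i)
                break
    return pairs

# Made-up sample records, only here to show the expected input shape.
SAMPLE_EVENTS = [
    (datetime(2020, 11, 20, 14, 20, 5), "WHEA-Logger", 18),
    (datetime(2020, 11, 20, 14, 20, 7), "Kernel-Power", 41),
    (datetime(2020, 11, 20, 18, 2, 11), "Kernel-Power", 41),  # unclean shutdown without a WHEA entry
    (datetime(2020, 11, 21, 9, 45, 30), "WHEA-Logger", 18),
    (datetime(2020, 11, 21, 9, 45, 31), "Kernel-Power", 41),
    (datetime(2020, 11, 21, 10, 0, 0), "Service Control Manager", 7036),
]

for whea_time, kp_time in pair_crashes(SAMPLE_EVENTS):
    print(f"WHEA 18 at {whea_time} matches Kernel-Power 41 at {kp_time}")
```

A Kernel-Power 41 entry with no WHEA partner (like the third sample record) may point to a different cause, for example a plain power loss, so it is worth looking at those separately.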
I'm having a similar issue with an X570 Aorus and a 5600X. I get the same errors in Windows.
Disable CPB and PBO and run it at default settings (3.7 GHz and XMP on). That works for me.
I got a new angle on this. Deactivating PBO and CPB definitely works; the PC has been running stable for a week now. But you'll lose performance.
So I wrote to the MSI support and the AMD support.
MSI suggested increasing the DRAM voltage by 0.05 V, which I did. The system seems to be stable, with no crashes so far, neither at idle nor while gaming.
That BIOS is 3.40 (Update AMD AGESA Combo-AM4 V2 1.1.0.0), correct? Because the AGESA version is what the OP needs to know. When I started posting in this topic, knowing that you were not installing directly to the M.2 would have been handy info. :)
So I'm using 3.40: AMD AGESA Combo-AM4 V2 1.0.8.0. The one you have listed is 3.60, which is also a non-beta one. I'm too scared to try anything else now... hahahaha. Like you say in your sign-off: it worked before I broke it!
Yeah, the M.2 installation was new to me, so I thought it would be kind of like a normal install. It's the first time I've owned one. Sorted that out quickly enough though. :)
Well, I went way back, all the way to the very first 1.0.8.0, and tried a few others after it.
I'm pretty screwed. I've run out of ideas. It's either the mobo or the CPU. I will be bringing it to my retailer for further troubleshooting :(
What I can say is that I swapped a B550-E from Asus for an X570 Hero VIII and the PC keeps rebooting. So I sent the CPU back for a refund and bought another 5900X. The new one should arrive today and I will give an update.
I was going crazy for hours trying to figure out something. I stumbled upon this post:
https://www.reddit.com/r/AMDHelp/comments/jzfgj1/ryzen_9_5900x_random_crashes_with_whea/
Spread Spectrum Control --> Disabled
VCORE SOC --> 1.1V
CPU VDD18 --> 1.96V
AMD Quiet Cool --> Disabled
Global C-state Control --> Disabled
CPU Vcore Loadline Calibration --> Turbo
Vcore SOC Loadline Calibration --> Turbo
Precision Boost Overdrive --> Manual
PPT Limit --> 666
TDC Limit --> 666
EDC Limit --> 666
Precision Boost Overdrive Scaler --> Manual
Customized Precision Boost Overdrive Scaler --> 10x
Pasted the settings which I meddled with and removed those unavailable to Asus mobos.
It finally worked! CPB is on, XMP is on with RAM running at 3600. I ran Cinebench and all-core frequencies went up to 4 GHz (probably not the best?), but single-core boost goes up to 4.7-4.8 GHz as expected.
Now really what is the problem? I'm thinking some mobos aren't "strong enough" to provide enough power for some CPUs, thus the manual voltage settings required. Pardon my layman thoughts...
So far, I've run multiple Cinebench multi- and single-core tests for about 30 minutes at a go, with max temps at 76 °C. I know, it may not be convincing, but without this tweak I could not even install Windows. I will continue to monitor this.
Please do continue to share. Cheers.
Sorry guys.
Just ignore my previous rant. I tested the PC long enough for the reboots to come back. And it doesn't happen during benchmarking, only during trivial tasks.
I'm just gonna disable boost for now and find time to bring it to my retailer :(
AMD has released a new BIOS today, so check your manufacturer's website for support.
ASRock has it available now as BIOS 3.80: Update AMD AGESA Combo-AM4 V2 1.1.0.0 Patch D. I'm too afraid to try it, as my PC has been running extremely well since I flashed back to 3.40 and clean-installed Windows.
It does mention that stability issues are resolved, but you don't find a lot of information as to what they have done. Twitter says:
AMD has released AGESA 1.1.0.0 Patch D to motherboard partners for the Ryzen 5000 Series. BIOSes begin in January.
Updates
- New Curve Optimizer OC feature enabled
- Support for Ryzen 5000 Series on 400 Series mobos
- System stability improvements
Gigabyte already released 1.1.0.0 D this past weekend. So far so good.
I updated to the new BIOS (Asus 3001, X570, 3900X, 4*8 3600 with XMP on) and the RANDOM RESTARTS still exist.
AND THE NEW BIOS DOES NOT ALLOW ROLLING BACK to an old BIOS.
DO NOT UPDATE
I posted this in another thread and found a temporary solution that gives me a completely stable system with no constant WHEA crashes. It uses the BIOS curve optimizer and I would be curious if this helps other people. I know for sure it has worked great so far for one other person.
I am now able to run my 5900x with XMP enabled and 3800 RAM and 1900 Fabric. My performance is pretty good, if slightly less than review benchmarks. I can hit 4300Mhz all core and 4700Mhz single core, slightly less than promised. But this is at least better than having to disable PBO and getting $100 budget CPU performance.
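To make the "3800 RAM and 1900 Fabric" pairing explicit: DDR memory transfers data twice per clock, so DDR4-3800 runs at a real memory clock of 1900 MHz, which matches a 1900 MHz fabric clock 1:1. A minimal sketch of that arithmetic follows; the function names are only illustrative.

```python
def memory_clock_mhz(ddr_rate_mts: float) -> float:
    """Real memory clock for a DDR kit: half the rated transfer speed."""
    return ddr_rate_mts / 2

def runs_one_to_one(ddr_rate_mts: float, fclk_mhz: float) -> bool:
    """True when the memory clock and the fabric clock (FCLK) are equal."""
    return memory_clock_mhz(ddr_rate_mts) == fclk_mhz

print(memory_clock_mhz(3800))        # memory clock for DDR4-3800
print(runs_one_to_one(3800, 1900))   # the setup described above
print(runs_one_to_one(3600, 1900))   # a 3600 kit would pair with 1800 instead
```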
I see this as a temporary fix, since I still don't have a stable system at default BIOS settings which is not acceptable long term. However, I think it's helpful to know if this works for others and would show that we are all having the same problem (or not).
Note: This fix relies on the new BIOS feature in PBO2 called Curve Optimizer. Only recent BIOS releases in the last month will have this, and it may not be out for all motherboards. It's supposed to be a "smart" overclock that allows you to bump up or down voltage, but based on power needs. It's easier to use than figuring out specific voltage settings to use, as you set a relative offset number compared to default, and not an absolute value.
Fix:
1. Set all overclock features to default.
2. Google curve optimizer for your motherboard to see how to enable it in your BIOS. For me, I had to go into AMD Overclocking and enable "advanced" PBO mode.
3. Set curve to apply to all cores
4. Start at a conservative value, like positive 2, and go up from there. The scale goes up to +30. Each step up in interval (AMD calls it magnitude) is equivalent to around 3-7mV at the lower end, but this scale range changes the higher you go up. That's why AMD calls this a "magnitude" step and doesn't give voltage values.
I was able to have a mostly stable system at +6. It did crash after a day. At +8, it has been stable for almost a week now without one crash in a variety of tasks (gaming, idle, web browser, all core benchmark, single core benchmark). I also set it to +10, which was also stable but you want to stay at the minimum voltage that gives you a stable system. If you don't have ANY improvements at all in stability as you go higher from +2 to +8, then you may be having a different problem than me and I wouldn't go higher.
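As a rough feel for what these offsets mean, the 3-7 mV per step estimate above can be turned into a voltage range. This is only a back-of-the-envelope sketch: it assumes the per-step size stays constant, which the post itself says is not true at the higher end of the scale, and the function name and defaults are made up for illustration.

```python
def offset_range_mv(magnitude: int, low_mv_per_step: float = 3.0,
                    high_mv_per_step: float = 7.0) -> tuple:
    """Estimate the (min, max) voltage offset in mV for a positive curve offset.

    Assumes a constant 3-7 mV per step, which is only the rough estimate
    quoted for the low end of the scale.
    """
    if not 0 <= magnitude <= 30:
        raise ValueError("magnitude must be between 0 and 30")
    return magnitude * low_mv_per_step, magnitude * high_mv_per_step

for steps in (6, 8, 10):
    low, high = offset_range_mv(steps)
    print(f"+{steps}: roughly {low:.0f}-{high:.0f} mV above default")
```

Under that assumption, the jump from +6 to +8 is only a handful of millivolts, which fits the idea of using the smallest offset that stays stable.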
It is probably CPU frequency boost and variation causing enough noise to cause DRAM errors and PC Crash.
I find it better and more stable to run my PCs with a fixed manual CPU overclock and BIOS voltages adjusted.
I run the DRAM at the fastest rated speed possible for the motherboard.
I turn all of that AMD "CPU Boost" nonsense off, because of the crashing it caused.
That is my experience with it all.
For ASUS X570 MB owners, they just released a new non BETA BIOS (version 3001). This could be the solution.
Unfortunately, I had to return the CPU out of frustration and the hours lost before I could try this. So far, a few users on Reddit have had some positive results after the update.
Unfortunately, the 3001 BIOS still gives me random restarts, 5 times in 1 hour (XMP on).
And after updating to 3001, I can't roll back to the (stable) 2607 BIOS.
The 3001 BIOS says "it's not a proper BIOS file" for the stable 2607 BIOS (I tried others, 2812/2816, with the same result).
I have mailed ASUS.
I think we have to wait for that.
Asus TUF X570
3900X
4*8 (3600)
Now I am on default BIOS settings,
XMP disabled (:
Thank you for your info on the new Asus BIOS.
Are you saying that you can run CPB but it's XMP that causes the reboots? I feel that's a different issue, and I realise that you have a 3900X, not a 5900X.
I think we'll have to wait for Asus to integrate Patch D into the BIOS before we see some results.
Add me to this list. Very annoying/frustrating. 5900x with a B550 Gaming Plus on v151 Beta Bios. I have set everything to default, then manually disabled PBO, disabled C state controls, set soc voltage to 1.1, manually set vddp and vddg voltages. Ran about 14 hours without an issue, then went afk in a game for a bit and as soon as i came back in the PC restarted again. Always come back to a WHEA 18 error in event viewer on core 0. After this most recent restart, I went back into Bios and increased DRAM voltage by 0.05V from what it was default at. Up to 1.38V now. This is frustrating. Feel like I got a bad CPU.
I have this random restart / Kernel-Power problem with a 3900X.
It happens with BIOS 2802/12/16 and the latest BIOS, 3001.
All of those BIOS versions have the random restart problem with these settings: XMP on, everything else default.
My BIOS is now 3001 and I cannot go back to the stable 2607 BIOS,
because 3001 doesn't allow it.
Now:
3001 BIOS (default settings)
XMP disabled (4*8 at 2666)
and poor performance.
My PC:
3900X
Asus TUF X570 Gaming
4*8 Corsair 3600
latest chipset driver
latest drivers for everything
On at least Asus boards, new 3001 bioses have been released (or at least started), and the good news is that this appears at least for me to largely fix the issues. I am also able to enable XMP/DOCP as well as PBO. I've had one blue screen, although I think that was driver related, and definitely not cache hierarchy related as with the other issues.
So far so good!
I have flashed back to Patch B non beta BIOS.
Windows still reboots.
Did an erase via BIOS. Now running Windows installer. The issue is so bad that I get reboots right after entering the installer :(
I tried running at both the default 2133 and at 3600 RAM speed. I don't even remember it being this bad right after I built this PC. Has it degraded somehow? Should I connect the optional 4-pin CPU power?
I don't think the 4pin CPU power is optional :-)
What was the base voltage before you gave it a .05v increase? mine is .9v to 1.450v, should I make it 1.5?
Hey,
I set my Vcore offset in the BIOS to 0.0125 V and (fingers crossed) haven't had any random lockups/restarts for over 24 hours.
I am however running a 5950x so not sure if that value would be too much or not for yours.
https://hothardware.com/news/custom-pc-builder-ryzen-5-5000-zen-3-cpus-high-failure-rates
Jinxed it, after 28hrs of stability. Got 2 WHEA errors and freeze + reboot. Back to the drawing board ;\
Same issue, already on my second Zen 3 chip (the problem appeared within the first 4-5 weeks).
Now my new 5900X has started with the same reboots and crashes.
But what shocks me the most is that people would rather accept running a CPU way below stock and/or even disabling cores to be stable.
THIS IS NOT ACCEPTABLE AT ALL!
I just ordered a 10600KF and a cheap Z490 board to troubleshoot everything else except my two B550 boards and now three Zen 3 chips.
I just bought a 6900XT and I am quite happy with it, but Zen 3 should be voluntarily recalled and refunded for everyone that has the slightest issue.
Guys, all Ryzen series have had issues like these. I've had them with a 2600X which I bought at end of 2020, so you'd expect CPU, motherboard (B450, a 2018 model), and BIOS to be at least more stable and reliable than a Zen 3 which has been on the market for way less time.
That's what made me find out this topic, and reading just a few posts, including (and especially) those from fanboys who don't deny the issues but talk about them like they're something normal/expected with Ryzen, is what made me stop wasting my time.
Returned CPU and motherboard, bought Intel, everything works out of the box like it's supposed to.
You are not going to fix your problems with a BIOS update or a setting in the BIOS, AMD doesn't even address the issues, you think a miracle is going to happen anytime soon?
The day I start paying with "maybe you'll get all my money for your product, or maybe you'll get only part of it, or maybe you get them but they don't always work when you try to cash them"... then I will still not accept receiving a product that doesn't work.
Between recent stable bios (3405) and the helpful input of my fellow "AMD HARDWARE BETA TESTERS"
Greetings fellow 5900X commiserators!
I purchased a 5900X a couple weeks ago to upgrade my system, and I've been running into nothing but problems since then. I only wish that maybe I had found this thread sooner.
Until recently, I had been using a 3700X inside of my PC with these specs:
My PC has been very stable in this configuration for the last 8 months or so (GPU was a more recent addition, naturally) - I've never had any problems with any stress tests that I've thrown at it.
Everything changed once I swapped the 3700X for a 5900X. Now keeping the PC stable basically seems impossible without completely neutering performance (i.e. disabling CPB and PBO, disabling XMP). I've been running into the WHEA_UNCORRECTABLE_ERROR / WHEA-Logger ID 18 very frequently since then. I've tried pretty much every troubleshooting step I can think of:
At this point, I'm at a complete loss. Using any of the hardware configurations above along with the 3700X and everything works perfectly and I never run into any issues. But as soon as I put the 5900X back inside, it's a crapshoot.
I think what's possibly most frustrating is that these errors seem to be more likely to occur when the computer is idle rather than when it's busy. If I'm stressing the CPU, everything seems fine - but when the CPU switches back to idle or suddenly jumps from idle to busy, that's when the WHEA error seems to occur. Sometimes all it takes is to open a web browser after a reboot. I think this lends some credibility to the idea that some of these CPUs are not handling C-State transitions correctly - maybe a voltage change that's either too low or too high?
In any event, I submitted my RMA request for the 5900X to AMD yesterday. Reading through this thread, it seems like some people have had success with tweaking various voltage, current, and CPB/PBO settings. Realistically, all I want is a CPU that will work with default settings and XMP enabled. I don't really have a desire to mess around with voltages to maybe end up with something stable - and even if it seemed stable, would I still trust that stability if I knew that it would still fail with the manufacturer-provided defaults?
Sorry that this turned into a rant, but I guess I needed somewhere to rant.
For now, I've submitted an RMA request to AMD - we'll see where this goes.
Same here. I've tried everything. I'm disappointed that the switch to AMD, after being an Intel user for 10 years, turned into a sour experience. I had issues with AMD CPUs and stability back in the days before I switched to Intel. I've had my Intel i7 2600K for something like 10 years without any crashes... like, ever. Not due to the CPU at least, and the issues were always solvable.
I was thinking that now is the time to go AMD again, as I really liked the 5900X, at least on paper. My purchase has turned out to be a very bad experience. I've spent more time trying to solve this issue than I've spent actually gaming or using the computer.
I'm sorry AMD. You screwed up. I'm going back to intel.
"I'm sorry AMD. You screwed up. I'm going back to intel."
Same.. ordered a 10600KF for 199€ and a Z590 Strix E until Rocket Lake comes out.
i'll keep my 6900XT tho..
Hey, not sure if you're running HWInfo64(OCCT uses HWInfo) in the background, but please check this out.
In my case I'm running a 6900XT + 5950X combo. Testing now with no HWiNFO running.
My PC would also reboot when it was in an idle state with the WHEA 18 error.
HWiNFO and OCCT are 2 separate programs, so no and yes if you have both running.
From the descriptions, it almost sounds like a C-state or power-saving issue related to sleep/hibernation. In theory one should be able to use such features, but I haven't in years: I disable C-states and Cool'n'Quiet and use a high-performance power plan. If you run an SSD, it should be set to never turn off, even if you use a sleep/hibernation setting. SSDs don't like to be powered down and then restarted. They are much different from regular HDDs and do not require a sleep or low-power state, since the firmware handles that already.
C-states have always been disabled, along with all power-saving options in the BIOS.
No special power plans for SSDs either; I never had them.
I found out OCCT uses HWiNFO to grab statistics, so that's no longer running either.
So I have the beta version that fixes the GPU issue.
Cache hierarchy errors are still happening (just like before, and even without running HWiNFO).
Maybe it fixes this specific problem.
The broken CPU/microcode/BIOS/AGESA, or whatever it may be, still exists.
I am aware of the HWiNFO issue...
But the reboots existed long before they even announced the RDNA2 cards.
Alright, time for an update on my original post from earlier in the week.
I had originally requested a warranty replacement (RMA) from AMD and described the problem rather thoroughly, but the request was redirected to their technical support department instead. The CSR provided a list of eight suggestions to try, and to be fair to them, I have gone through them all and recently wrote back with my results. There were honestly no suggestions in the list that I hadn't tried before, other than explicitly setting the "Power Supply Idle Control" option to "Typical". Unsurprisingly, the suggestions haven't made any difference - WHEA_UNCORRECTABLE_ERROR randomly occurs anywhere between 30 seconds and two hours after booting. Stress-testing the computer seems to reduce the chance of it happening - letting the computer idle for a while and then launching some programs seems to trigger it (and even then, not reliably).
Since that time, I've tried buying another motherboard and moving my components onto that motherboard instead. I purchased an Asus ROG Strix B550 Gaming-F Wifi motherboard - I explicitly chose something that was a different manufacturer and different chipset than my regular MSI MAG X570 Tomahawk motherboard. After all was said and done: the WHEA errors returned. Everything seemed fine at first, but they always come back. I'll also mention that I tried this setup with a completely fresh install of Windows 10, as well as in Linux too - the error in Linux is a Machine Check Exception (if I get any error at all and not just "immediately reboot and black screen"), which seems to be a reasonably close analogue of the Windows WHEA error.
At this point, I don't know how I can conclude that the source of the problem is anything other than the 5900X itself. If I've tried two different PSUs, two different motherboards, two different RAM kits, two different GPUs, two different hard drives, and two different operating systems, but only one CPU causes the problem (keep in mind that my 3700X works fine in all of these configurations), then how can it be anything but something wrong with the 5900X itself? (For the curious, my production batch number is BG 2051PGS.)
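The elimination argument above can be written down as a small set operation: a part is a suspect only if it was in every build that crashed and in none of the builds that stayed stable. The sketch below is just an illustration of that reasoning; the build lists use placeholder labels such as "PSU A" and "RAM B" rather than the actual parts, and `suspect_components` is a made-up helper name.

```python
def suspect_components(trials):
    """Return parts present in every crashing build and absent from every stable build.

    `trials` is a list of (parts, crashed) tuples, where `parts` is any
    iterable of component names and `crashed` is a bool.
    """
    failing = [set(parts) for parts, crashed in trials if crashed]
    passing = [set(parts) for parts, crashed in trials if not crashed]
    if not failing:
        return set()
    suspects = set.intersection(*failing)
    for parts in passing:
        suspects -= parts  # anything that also ran stable is cleared
    return suspects

# Placeholder builds modelled on the swaps described above.
TRIALS = [
    ({"5900X", "X570 Tomahawk", "PSU A", "RAM A", "GPU A"}, True),
    ({"5900X", "B550-F", "PSU B", "RAM B", "GPU B"}, True),
    ({"3700X", "X570 Tomahawk", "PSU A", "RAM A", "GPU A"}, False),
    ({"3700X", "B550-F", "PSU B", "RAM B", "GPU B"}, False),
]

print(suspect_components(TRIALS))
```

This only narrows things down to the part itself; it cannot tell a defective sample apart from a firmware issue that affects that CPU model in general.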
I realize that there might be ways to force the 5900X into working by tweaking RAM and/or SOC voltage, disabling XMP, disabling PBO, disabling CPB, but honestly: at the end of the day, should we even need to do that? If a CPU doesn't work correctly with all of the default settings, then something is wrong, no?
In any event, I responded to AMD's technical support with my findings. I'm hoping they'll agree that an RMA is a reasonable way to go forward.
Hey there,
not over yet.
This is my first AMD build. I had always had Intel until now.
Actually, I built an AMD system for my father 3 years ago, but that was a low-budget PC.
I got my Ryzen 5900X CPU about 1 month ago. At the start I had default settings in the BIOS; only XMP was enabled. In the first week I started noticing that my PC was restarting after it had been in power-saving mode. Event ID 18.
WHEA-Logger
Processor APIC ID: 0 (this one always shows)
Processor APIC ID: 11 or 18 or 24... this one is different each time
Crashes happen about once a day, or maybe once every 2 days.
A week ago I set undervolting in the BIOS with PBO disabled. Since then I have had only 2 restarts; it's better but not gone. The restarts happen when the PC has been idle for a longer time (2-5 hours maybe). It has never happened when the CPU was under load. Really strange...
I decided to wait a few weeks to see if AMD finds a solution for this, since I believe this might be software related (BIOS or chipset drivers) and not hardware... but only AMD can confirm this!
If not, I'll probably start an RMA, just to get a new CPU. And if I get a new CPU, I will just sell it and go back to Intel.
It's really annoying to pay so much money and then have so many problems, when you expect to have a top high-end rig working flawlessly.
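If you want to see whether the same APIC ID keeps showing up across crashes, as described above, a quick tally over the event messages helps. This is a minimal sketch that assumes the WHEA-Logger message text has been copied out of Event Viewer into plain strings; the sample lines are made up in the style quoted above, and `tally_apic_ids` is an illustrative name.

```python
import re
from collections import Counter

# Tolerant of spacing and capitalisation, e.g. "Processor Apic ID :0".
APIC_RE = re.compile(r"Processor\s+APIC\s+ID\s*:\s*(\d+)", re.IGNORECASE)

def tally_apic_ids(messages):
    """Count how often each processor APIC ID appears in the given messages."""
    counts = Counter()
    for message in messages:
        for apic_id in APIC_RE.findall(message):
            counts[int(apic_id)] += 1
    return counts

# Made-up sample lines: three crashes, each logging ID 0 plus one other ID.
SAMPLE_MESSAGES = [
    "Processor Apic ID :0",
    "Processor Apic ID :11",
    "Processor APIC ID: 0",
    "Processor APIC ID: 18",
    "Processor APIC ID: 0",
    "Processor APIC ID: 24",
]

for apic_id, count in tally_apic_ids(SAMPLE_MESSAGES).most_common():
    print(f"APIC ID {apic_id}: {count} event(s)")
```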
Hey again, update from me..
And now some new info about my cpu.
I didn't get any new WHEA-Logger events (happy about that), but I had 1 restart; it happened only once (5 days ago).
Error in logs: BugCheck 1001 event (The computer restarted after a serious error. Error Code was:0x000000a0...)
Maybe important: 2 days ago I decided to reinstall W10 (I had to make an image anyway, so why not :). The only difference from the previous install was that I disconnected from the internet while installing Windows, and connected the PC to the internet only after I had installed all the required drivers.
The first thing I noticed is better CPU temps in Balanced mode. Until now my temps started at 30 and after some load spiked to 60-65, and then stayed there even at idle. That was very strange to me, since this never happened to me with an Intel CPU. When gaming, CPU temps were between 70 and 80.
Now the CPU temps move between 30 (idle), 40-60 (under small load) and 60-70 (under load); at longer loads they may peak at 80 but quickly go back to 70. The main thing is that after I stop playing or doing heavy stuff and go back to idle, the CPU temps return to 30 degrees. I think this is a big difference that could also have an influence on the WHEA event restarts... any thoughts from you guys about temps?
I tried CB23 on stock settings (undervolted and XMP enabled, as I had it from the start):
MC score: 21508, SC score: 1611, 1631. I don't know if this is good, since many guys use different versions of CB; maybe some of you can tell me.
People need to stop directly comparing Intel and AMD cpus as if they are built the exact same way. Zen cpus are by DESIGN meant to run hot. Temps even up to 90-95C are considered within specs.
Hi Anthos,
I guess you didn't understand my comparison well. I am not trying to, and I didn't want to, compare temps in general, i.e. how hot an AMD or Intel CPU can get.
I am saying that the CPU temps at idle were too high. If you read my post carefully, you would understand what I was trying to explain about my situation.
This thread is about WHEA errors, so my reply is related to that, not to comparing! I am just trying to understand if temps can make a difference...
@B_Junior So would you say reinstalling Windows without connecting to the internet and manually installing drivers helps? It would be great to know, because I'm encountering the same issue with my 3600 and 5700 XT. I did everything to fix the issue, and reinstalling would be the only thing left before an RMA of the hardware. I have tried to fix the issue for 1 year(!) and no new driver has helped.