CrispyCrunch
Adept II

Ryzen 5900x: System constantly crashing/restarting WHEA-Logger ID 18 and critical error Kernel-Power

Mainboard: MSI x570 Unify
Mainboard-BIOS: 7C35vA82 (Beta version)
CPU: Ryzen 5900x
RAM: Crucial Ballistix BL2K32G36C16U4B 3600 MHz, 64GB (32GB x2)
Drive: M.2 Samsung 970 Evo+ 1TB SSD
Graphics: SAPPHIRE Nitro+ Radeon RX 5700 XT
PSU: be quiet! Straight Power 11 750W Platinum
OS: Win 10 Pro (64bit) - all updates installed
Chipset driver: 2.9.28.509 (released 2020-11-09)

I first assembled the PC a week ago with a Ryzen 3800X, because it was unclear if and when I would get the Ryzen 5900X I had ordered. It ran for one week with the included AMD Wraith Prism CPU cooler without any problems.

- Today I installed a Ryzen 5900x and a Scythe Fuma 2 CPU cooler.
- After 20 min the first crash/restart with the following entries in the Event Viewer: WHEA-Logger ID 18 and critical error Kernel-Power ID 41.
- Happens irregularly again and again, sometimes after minutes, sometimes longer: Windows freezes for a few seconds and then the PC reboots. It doesn't matter whether the system is under load or not.
- CPU temperature between 30 and 40 °C
- Updated to BIOS and chipset driver mentioned above: Problem still exists
- XMP Profile disabled (RAM on 2600 MHz): problem still exists
- CMOS Reset: Problem still exists

Either something is incompatible with the CPU, or the CPU is defective.
What should I do? This is really frustrating.

2 Solutions

I'm having a similar issue with an X570 Aorus and a 5600X. I get the same errors in Windows.

Disable CPB and PBO and run it at default settings (3.7 GHz and XMP on). That works for me.

I got a new angle on this. Deactivating PBO and CBS definitely works; the PC has now been running stable for a week. But you'll lose performance.

So I wrote to MSI support and AMD support.

MSI suggested increasing the DRAM voltage by 0.05 V, which I did. The system seems to be stable, with no crashes so far, neither at idle nor while gaming.

947 Replies

Solved my unstable 5900X on an MSI B450. Here are my settings.

Go into BIOS. Disable CPB, save, reboot. Go back into BIOS.
XMP ON Profile 1 ( I have 3200Mhz 16-18-18-38)
DRAM FREQUENCY to 3200MHZ
FCLK to 1600MHz
RAM Voltage 1.35

PRECISION BOOST OVERDRIVE [Enhanced Mode 3]

AMD CBS\
CORE PERFORMANCE BOOST [Auto]
Global C-State Control [Disabled]

AMD Overclocking\
ECO Mode [Disabled]

Precision Boost Overdrive [Advanced]
PBO Limits [Motherboard]
Precision Boost Overdrive Scalar [Auto]
Curve Optimizer [Disabled]
Max CPU Boost Clock Override [100MHz] (200 works too =slight increase of 50-80 pts in Cinebench r20 multi, but more W)
Platform Thermal Throttle Limit [Manual]
Platform Thermal Throttle Limit 255
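
The DRAM FREQUENCY and FCLK values in the list above form a 1:1 pairing: DDR4 transfers data twice per clock, so a 3200 MT/s kit has a 1600 MHz memory clock, and FCLK is set to match it. A minimal sketch of that arithmetic, handy for checking the pair before typing it into the BIOS (the function name `fclk_for_1to1` is made up for this example and is not part of any tool):

```python
def fclk_for_1to1(ddr_rate_mts):
    """Return the FCLK in MHz that matches a DDR4 transfer rate at a 1:1 ratio."""
    if ddr_rate_mts <= 0:
        raise ValueError("expected a positive DDR transfer rate")
    # DDR = double data rate: the memory clock is half the advertised rate.
    return ddr_rate_mts / 2


for rate in (3200, 3600):
    print(f"DDR4-{rate}: FCLK {fclk_for_1to1(rate):.0f} MHz")
```

Whether a given CPU holds the higher FCLK values stably is a separate question that this arithmetic does not answer.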

0 Likes

Hey! I just saw that the MSI X570 Tomahawk WiFi got a new beta BIOS: https://www.msi.com/Motherboard/support/MAG-X570-TOMAHAWK-WIFI Has anyone tested it already?

0 Likes

I'm testing it, but with the new CPU.

I haven't had much time to test, but for now it seems stable.

Before the BIOS update and CPU change, my system rebooted mainly in two scenarios: near the end of the Blender benchmark, and after a minute or two in Witcher 3.

Now I have run Blender a few times with no reboots.

I haven't had time for Witcher yet.

0 Likes

I've not got the same board as you, but the beta BIOS with 1.1.9.0 has meant I can disable the curve optimiser and run with PBO and XMP enabled without issue... or so I thought. The system runs hotter overall, so I took a quick look at negative numbers on the curve optimiser. At -6 it seems stable enough where it used to reboot (Cinebench button spamming), but then it reboots randomly when not provoked. I've now disabled the curve optimiser again and have since had a further reboot at complete idle. I'm just sending the CPU back now, as clearly it's trash.

I already opened a RMA case so I was pretty desperate too, have you tried all my settings posted? Like 

PRECISION BOOST OVERDRIVE [Enhanced Mode 3]

0 Likes

I've not. I did wonder what that did and had a quick google. On the old BIOS I can make it 100% stable with +2 on CCX0 and +6 on CCX1, but I'm not happy with it regardless, so back it goes.

0 Likes

Just had another random reboot whilst I was away from the computer eating my dinner. 

Had a quick look at the WHEA 18 errors for old times' sake and discovered that every single one since building the computer has happened with APIC ID 6 or 2 (mostly 6). Can I assume this corresponds to cores 2 and 6? If so, these are on CCX0, which has the fastest cores, and therefore I was using a lower positive offset there in the curve optimiser (assuming they were better than the slower cores on CCX1). Perhaps I didn't need to use any offset on CCX1.

Has anyone else noticed that it's constantly the same APIC ID being reported in the WHEA 18 event?
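
For anyone who wants to check the same thing on their own machine, here is a rough sketch that tallies APIC IDs across WHEA-Logger event 18 message texts. The sample messages are illustrative and not taken from a real log, and `tally_apic_ids` and `core_and_thread` are names invented for this example. The mapping from APIC ID to core is an assumption: it supposes SMT is enabled, so that the lowest bit of the APIC ID selects the thread and the remaining bits select the core. Under that assumption, APIC IDs 2 and 6 would sit on cores 1 and 3 rather than cores 2 and 6, so it is worth confirming against a tool that lists the CPU topology.

```python
import re
from collections import Counter

# Matches the "Processor APIC ID: N" line of a WHEA-Logger event 18 message.
APIC_RE = re.compile(r"Processor APIC ID:\s*(\d+)")


def tally_apic_ids(messages):
    """Count how often each APIC ID appears across event message texts."""
    counts = Counter()
    for text in messages:
        match = APIC_RE.search(text)
        if match:
            counts[int(match.group(1))] += 1
    return counts


def core_and_thread(apic_id):
    """ASSUMPTION: SMT on, bit 0 of the APIC ID is the thread, the rest is the core."""
    return apic_id // 2, apic_id % 2


# Illustrative message texts only, not real log entries.
sample = [
    "A fatal hardware error has occurred.\nProcessor APIC ID: 6",
    "A fatal hardware error has occurred.\nProcessor APIC ID: 2",
    "A fatal hardware error has occurred.\nProcessor APIC ID: 6",
]

for apic_id, hits in tally_apic_ids(sample).most_common():
    core, thread = core_and_thread(apic_id)
    print(f"APIC ID {apic_id}: {hits} event(s), core {core}, thread {thread} (assumed mapping)")
```

If the tally keeps pointing at the same one or two IDs, that supports the idea of one weak core rather than a general platform problem.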

0 Likes

Hey. I have had constant WHEA since building my pc a week before christmas. 

 

My specs are:

MSI X570 Ace (latest BIOS, AGESA 1.1.9.0)

5800X with MSI MAG 360R cooler

MSI 3070 Suprim X

1000 W Gold PSU

G.Skill Trident Z 4000 MHz CL18

I have looked far and wide for a fix, or should I say, a solution. What I have found is that what works for X might not work for Y and so on, but here's at least what worked for me:

My RAM is set manually to 3600 MHz. (I have not had any crashes with XMP enabled with these particular settings.)

Disabled Global C-States.

Changed PBO to Advanced and set the limits to Motherboard. (For me that's 500 PPT, 210 TDC and 220 EDC. I did this because I noticed the currents almost hitting their caps in Ryzen Master while stress testing, as the default limits are way lower than that.)

After that, basically all I did was set the curve optimizer to +10 on all cores.

Now, I know that setting per-core values in the curve optimizer might be a better alternative, but comparing my CPU (using CPU-Z) to a friend's 5800X (he uses all default settings and has had no issues at all), I see that mine slightly outperforms his, at better temps. What does that mean? Well, nothing, because there could be a million reasons why, but still, my CPU is not underperforming with these settings in any way. I have also monitored voltages for a bit, and I don't see a lot of "overvolting", as you'd presume with CO +10 on all cores.

Basically, all this means to me is that overclocking is going to be a nightmare of WHEAs, which is a bit of a bummer since I intended to do some overclocking. But at least I have a functioning system with no performance loss compared to the next guy, and for now I'm happy with that.

That said, this is the first and last time I'll ever buy a Ryzen CPU if this is not sorted in a future BIOS or something like that.

 

RooN

 

 

 


@JoltCola wrote:

Yep, AGESA 1.1.9.0, in most manufacturers' current beta BIOS releases, fixed WHEA on idle/near-idle for me and many others. Normally I would run away from a beta BIOS like it's on fire, but in this case I really do recommend it.


AGESA 1.1.9.0 BIOS, same crap. 

Lots of CPUs AMD sells are just s..t and BSOD at defaults with any AGESA, period.

For anyone starting an RMA for a 5950x I just got this from AMD: 

Please be informed as there is no stock for 100-100000059WOF inventory in our warehouse. The replacement part will be shipped once the stock arrives in the warehouse. I appreciate you cooperation in this regard.

No idea how long this will be.

I have a 5900X (BG 2045BGS) on a ROG Strix X570, and I have had 3 WHEA 18 errors with different BIOS versions. Today I installed the new 3602 BIOS and I'm testing it.

0 Likes

Keeping it updated, RMA was approved (not impressed I had to pay my own shipping for a 2 week old defective product), and has been shipped out.

Put in a 3600 as a temporary CPU while I wait, and all problems are gone. I was able to re-enable everything in the BIOS without any crashing, which means this $300 temporary CPU is running far better than the $800 one that was forced to run at 3.7 GHz.

Really? Set to 3600 with the other parameters at default, and no more WHEA errors?

thx

0 Likes

Yes, installed the 3600, reset bios to default, turned DOCP on, and system has been running with no WHEA errors or reboots ever since. So CPU was definitely the cause of the crashes in my case. 

0 Likes

Hi!

DOCP?

In my case the 5800X and the 5900X both give the same error with this mainboard. The 5800X works fine in a mainboard with the B550 chipset.

I will try your recommendation in a few hours.

Thx

 

0 Likes

It doesn't work at 3600 in my case -_-

0 Likes

Dear AMD - you have lost a customer for life.  Not only am I upset about my 5900X not running correctly but your RMA process is totally unacceptable for year 2021.

It took 2 weeks of back and forth with your Tech Support to get an RMA authorized. You then sent me a ground shipping label from West Coast to Florida, which will take 7 days in transit at minimum. You will then sit on your ass for 3 days inspecting this broken thing. You will then send me a replacement, which will take another week by ground shipping back to West Coast. I will have wasted more than 30 days on a brand new computer with a top tier CPU that I cannot use because of your terrible processor and customer service process.

I have loved AMD but this is unacceptable and with my latest experience I can never in good faith buy another one of your products again.

Team blue is the way to go even if they are behind in advancing tech.  I have never had these issues with them in the past and guess what they offer an Advanced RMA.  You should look into this, it's the year 2021.  Even other **bleep** companies like Seagate offer this.   

What will really piss me off is if the replacement CPU also has issues. I have no idea what I'll do at that point.

DOCP - Direct Over Clock Profile; it is the name ASUS uses on AMD boards for XMP.

0 Likes

No and I currently don't have access to 3600 RAM. I will probably upgrade my RAM soon, but it will probably be 3200 again, since I don't think the performance gains are really noticable.

0 Likes
AlexBCS
Journeyman III

I just made an account to write that I'm having the same random reboots/WHEA errors, and it has been a nightmare!!


I just got a new build with an MSI B550M Pro-VDH WiFi and an R5 5600X with a pair of 8 GB 3200 RAM sticks.
After installing Windows I kept getting these random reboots and I was so worried. I swapped my PSU thinking it was the problem, but no! The PC kept restarting on me.

After reading this whole thread I tried all the different workarounds from the different users, but they just don't work for me. I don't even know if I need to wait for a BIOS update to fix this. I tried the latest MSI beta BIOS, but it wouldn't even POST, and I ended up downgrading to the more "stable" version I had. This is really driving me crazy.

0 Likes

It's personal!

How crazy!
I tried an older AGESA, but nothing changed.

If I put in the RX 480, it works fine.
It started when I put in the MSI RX 5700 XT. And I'll say more: it is not a problem with the graphics card hardware. It is something in the BIOS of the mainboard I have. The reports are extremely similar.

[ AMD Ryzen 7 5700X (step B2) | CM MASTERLIQUID PL360 FLUX | MSI MPG B550 GAMING EDGE WIFI - 7C91v1G | 32GB DDR4 3600MHz XPG SPECTRIX | MSI GeForce RTX 4060 GAMING X 8G 8GB GDDR6 ]
0 Likes

!! FIXED !!
10 days with no more random restarts.
I just sold my Corsair 2x8 GB (16 GB) 3600 CL18 kit (CMK16GX4M2Z3600C18).
This kit caused random restarts in DOCP mode.
I bought a Crucial Ballistix BL2K16G32C16U4B 2x16 GB (32 GB) DDR4-3200 kit.
It works well in DOCP mode.
Official Corsair Support SAID!!
"in general AGESA is causing compatibility issues for kits above 3200MHz"
So TRY kits rated below 3600.

TUF X570 (BIOS 3001; I didn't update to the 3202 beta BIOS)
W10 64-bit 20H2
3900X

I checked this some time ago.

That change didn't help.

I returned my 5900X and bought a 5950X.

A new BIOS was released around the time I bought it.

Currently no crashes, but I don't know whether that is down to the BIOS or the CPU.

I'll tell you something. This problem is with AMD BIOS and drivers. It doesn't make any sense. I run a STRESS TEST here and nothing goes wrong. I'm trying to simulate the error. (Attached: TESTE STRESSS.png)

0 Likes

I can 99% force the issue using Cinebench R23. Load it up and click start / stop on the single-threaded benchmark repeatedly, sometimes alternating between multi-thread and single-thread. It appears that the sudden spikes in frequency and the parking of multiple cores into sleep cause the issue. There have only been 2 or 3 times where this hasn't worked, and I've then gone on to see a random reboot at idle hours later.


@artur_aragao wrote:

I'll tell you something. This problem is with AMD BIOS and drivers. It doesn't make any sense. I run STRESS TEST here and nothing goes wrong. I'm trying to simulate the error.


 

0 Likes


@willeywilson wrote:

I can 99% force the issue using Cinebench R23. Load it up and click start / stop on the single threaded benchmark repeatedly, Sometimes alternating between multi thread and single thread.  It appears that the sudden spikes in frequency and parking multiple cores into sleep cause the issue.  There have only been 2 or 3 times where this hasn't worked and I've then gone on to see a random reboot at idle hours later.



Hummmmmm!

So you think that this could really be a problem unique to AGESA in the BIOS. Am I wrong???

I definitely have no hardware problem. And I do not believe the problem is my secondary SSD or the new memory modules, since these errors were not present with the RX 480.

I just installed the newest graphics drivers, although these drivers have had unresolved problems since before version 20.11.2 WHQL. Now I have driver 20.12.1 WHQL.

0 Likes

I think it's a bit of both.  Some CPUs are working fine on the AGESA and even able to undervolt.  Some CPUs aren't working OK unless they are overvolted.  The latest AGESA has made mine 99% stable at stock voltages (with pbo & xmp) - now removed for RMA.

The CPUs that are "fixed" with a bios update will never overclock or perform quite as well as one that worked fine on older BIOS', therefore it is still a hardware issue.

0 Likes

That's why I decided to return the 5900X.

Over the years the standard was:

Bad silicon - low chance of a good OC

Good silicon - good OC

Now it's: good silicon - system works

Bad silicon - reboots and crashes

I'm not planning to OC, but this situation is unacceptable to me.

Also, just to add, my 5950x which is going back tomorrow is BG 2048SUS and judging by the serial number one of the first 100 made...

0 Likes

Yep, all I can say is AGESA 1.1.9.0 greatly improves the WHEA black screen reboots on idle. I can't definitively say it fixes it, but I haven't seen one in over a week.

0 Likes


@artur_aragao wrote:

I'll tell you something. This problem is with AMD BIOS and drivers. It doesn't make any sense. I run STRESS TEST here and nothing goes wrong. I'm trying to simulate the error.


Wrong. The problem is that there is NO stress test available which can even remotely detect an instability in a modern CPU.

A stress test just puts an all-core load on the CPU. In that case a modern CPU like the 5000 series runs at LOWER frequencies per core and there's no problem.

It's when the CPU is loaded on a single core or a few cores that it runs at maximum frequencies. This is where the problem occurs. Alternating loads (start-stops) help to switch the load between different cores and increase the probability of hitting a bad core at a bad time. This is achieved, for example, by start-stopping Cinebench in single- and multi-core modes, as mentioned above.

The stress test that is needed should be like Memtest: it should generate different load patterns, different numbers of cores, different durations and different types of instructions, all flip-flopping. Nobody has written such a test yet - to the great joy of crap makers.
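
To make the idea concrete, here is a minimal sketch of such a flip-flopping load pattern. It is not a validated stability test and makes no claim to reproduce the crashes discussed in this thread. It only alternates idle gaps, single-worker bursts, few-worker bursts and all-worker loads of random length. The names `make_schedule`, `run_schedule` and `_burn` are invented for this example, the workers are not pinned to specific cores (the OS scheduler decides where they run), and the arithmetic loop covers only one kind of instruction mix.

```python
import multiprocessing as mp
import random
import time


def make_schedule(steps, max_workers, seed=None, min_s=0.5, max_s=5.0):
    """Return a list of (worker_count, seconds) phases with mixed load shapes."""
    rng = random.Random(seed)
    schedule = []
    for _ in range(steps):
        kind = rng.choice(("idle", "single", "few", "all"))
        if kind == "idle":
            workers = 0
        elif kind == "single":
            workers = 1
        elif kind == "few":
            workers = rng.randint(1, max(1, max_workers // 2))
        else:
            workers = max_workers
        schedule.append((workers, round(rng.uniform(min_s, max_s), 2)))
    return schedule


def _burn(seconds):
    """Keep one process busy with float and integer arithmetic for `seconds`."""
    deadline = time.monotonic() + seconds
    x, n = 1.0001, 0
    while time.monotonic() < deadline:
        for _ in range(10_000):
            x = (x * 1.0000001) % 97.0
            n = (n * 31 + 7) % 1_000_003


def run_schedule(schedule):
    """Run each phase in turn: sleep when idle, otherwise start the busy workers."""
    for workers, seconds in schedule:
        if workers == 0:
            time.sleep(seconds)
            continue
        procs = [mp.Process(target=_burn, args=(seconds,)) for _ in range(workers)]
        for proc in procs:
            proc.start()
        for proc in procs:
            proc.join()
```

To try it, call `run_schedule(make_schedule(steps=200, max_workers=12, seed=1))` from under an `if __name__ == "__main__":` guard, which `multiprocessing` needs on Windows. Passing a fixed seed makes a run repeatable, so a pattern that triggers a reboot can be replayed.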

0 Likes

Friends,

It's hard. In addition to updating the Adrenalin drivers to version 20.12.1 WHQL, I decided to return to BIOS 7C91v153. While reconfiguring the BIOS I came across a parameter that I remembered had caused me huge problems in the past when left enabled: the Spread Spectrum feature. I kept everything else as I usually configure it, but I disabled this feature. Let's see what happens.

I had no crashing problems with version 7C91v13, but I want to check the Spread Spectrum parameter, because it is enabled by default and it should not be, as it negatively affects overclocking. I'm not a fan of overclocking, but when the CPU boosts, we are using automatic overclocking. And we still have the XMP of the memory.

I will bring news soon.

0 Likes


@AlexBCS wrote:



After reading all this thread i tried all the different workarounds from the different users but it just don't work for me,



Try two things (separately):

1. Disable Core Performance Boost.

2. Raise SOC Voltage to 1.15V.

All this with XMP (DOCP) disabled and Memory voltage = 1.35v if so written on the memory modules.

If neither works then your problem is probably not CPU related.

0 Likes

Those are things that I had already tried.

I tried them again anyway just to give it a go; beforehand I cleared CMOS so that everything was at default.

I tried disabling CPB, and as would (not) normally happen, I was able to use the PC for about 15 minutes before it suddenly rebooted.

Then I cleared CMOS again and tried adjusting the SOC voltage. My system wouldn't even POST and just boot-looped, and even in the BIOS settings it would suddenly reboot.

I have already tested every component and ran a memory test with no errors. I'm on the latest (non-beta) AGESA version for my BIOS. I tried the beta with the 1.1.9 version, but it wouldn't even POST and boot-looped. I swapped the PSU (both of them brand new 600 W Gold); the longest I was able to use the PC was around 40 minutes.

Someone on Reddit suggested swapping the motherboard (he had the same problems and the same components); he did just that and suddenly everything worked perfectly.

I could buy a new mobo, but I'm in the process of returning my CPU. If the problem persists, I will try getting another mobo. This is just insane.

 

 

0 Likes

I usually don't recommend it, but in your case I think it's better to take your PC to a computer workshop where they have a stock of spare hardware, so they can find the culprit by swapping parts.
In this case it can be a problem with other hardware and not the CPU. Or even an assembly problem: some other guy had the similar symptoms and the problem was the GPU not fully inserted into a Gigabyte MB slot ( Gigabyte MBs have a double lock system there and the GPU should be inserted like a DDR module, very firmly at both ends.)
Your situation has its advantage: the problem is so obvious and easy to replicate that a few minutes of testing is enough to say “yes or no” after each swap.

0 Likes

Hello, friends!

A bit more stats from my side.

In December I got a 5950X. There were random black-screen restarts, usually under idle workload (without a WHEA event), with or without CPB/PBO. For example, if I left the PC on overnight, I got ~2 such restarts. I spent about a week trying to solve this using different combinations. No luck.

I changed the processor to a 5900X and it started to work even worse. With the 5900X I got this WHEA error and a BSOD immediately during a simple benchmark run in Shadow of the Tomb Raider. So I disabled CPB and PBO again, but the same black-screen restarts under low load were reproduced. So I'm changing the CPU again (this time I sent the CPU, motherboard and memory back to the shop to be retested together in the shop's service center). I think there is a very small chance that the issue is in the motherboard (Gigabyte B550 Aorus Master with the latest firmware), but I'm 99% sure it's the CPU again. That's really frustrating. I did not expect such poor quality control for CPUs.

0 Likes

I did not read all of the 33 pages of replies.
I just wanted to say that I also have a 5900X and a 6900 XT graphics card, and I also experience a black screen followed by the computer turning off. It helped a little when I disconnected my VR equipment; the computer became more stable. But it still crashes in non-VR games such as Mount & Blade II after 30 min to 2 hours of gameplay, no matter what.
It's not a heat issue, and I can run all kinds of stress tests and benchmarks without problems.

I have come to the conclusion that the drivers or BIOS are bad and have bugs, and that this is the main reason this happens.
(Personally, I have used 2 different BIOS versions since mid-December 2020 and I had crashes with both.)

0 Likes

I stopped getting WHEA 18 after a Windows update some time between December and now, so that might have been it. For the crashing with a 6000 series GPU, try this:

1. Go into "tuning" in the Adrenalin software.
2. Select "automatic" for the overclock and accept the warning; there is nothing to damage here.
3. Select "undervolt GPU" and write the number down.
4. Select "manual" overclock mode and enable all the disabled settings.
5. Set your min GPU clock to within 100 MHz of the max clock.
6. Set the voltage to the one you wrote down.
7. Set the "power limit" to max; this allows the GPU to draw more power under load if it needs to.
8. Disable "zero RPM" fan mode. The stock fan curve should be OK, but you can increase it.
9. Hit "apply" at the top right, and save the profile by clicking the 3 dots at the top right.

What we did is stop the huge voltage swing when the card goes from around 500 MHz to 2500 MHz for gaming, and the card will run smoother as well. A bonus is less heat due to less voltage, with more performance, as the card won't be fluctuating in frequency so drastically. It's not a full overclock or anything, just fine-tuning at that point. I have an RX 6800, and doing this stabilized my card enough to overclock it as well. 12.2.1 drivers. (Attached: RX 6800 Settings.png)

FYI, none of your issues will be triggered by a stress test, probably because they run in "borderless windowed" mode even if they look "full screen". Try 3DMark Time Spy; that will usually trip a bad GPU setting.

"It worked before you broke it!"
0 Likes
majkiel69
Journeyman III

I had the same issues as you describe: random WHEA errors and Kernel-Power crashes in Windows.

My hardware:

- CPU: Ryzen 5900X
- MOBO: MSI MEG X570 Ace
- GPU: RTX 3080 Aorus Master
- RAM: 32 GB G.Skill Trident Z RGB 3600 MHz CL 16/19/19/39 (4x8 GB sticks)
- PSU: Corsair RM850i Gold
- AIO: NZXT 240 mm
- Case: Corsair 4000D Airflow with 4 fans
- Drive: Samsung 970 Evo Plus

I installed the most recent BIOS, chipset drivers etc. from the MSI site.
I had the latest Windows updates, all drivers etc.
My system was crashing randomly in Windows, mostly in normal usage like the internet browser, not under stress tests.

I read a ton of posts on Reddit and here and watched some YT videos, and in my case this helped. SET IN BIOS:
- Global C-state Control - Disabled
- Power Supply Idle Control - set to Typical Current Idle

I have PBO / CPB on and XMP1 on, and so far (one day of testing) no crashes. If something changes I will let you know.
My thoughts: C-states put the CPU cores, while idle, at voltages that on some chips are too low, so the cores can't come back from sleep when they are needed and the system goes down.