CrispyCrunch
Adept II

Ryzen 5900x: System constantly crashing/restarting WHEA-Logger ID 18 and critical error Kernel-Power

Mainboard: MSI x570 Unify
Mainboard-BIOS: 7C35vA82 (Beta version)
CPU: Ryzen 5900x
RAM: Crucial Ballistix BL2K32G36C16U4B 3600 MHz, 64GB (32GB x2)
Drive: M.2 Samsung 970 Evo+ 1TB SSD
Graphics: SAPPHIRE Nitro+ Radeon RX 5700 XT
PSU: be quiet straight power 11 750w Platinum
OS: Win 10 Pro (64bit) - all updates installed
Chipset driver: 2.9.28.509 (released 2020-11-09)

I first assembled the PC with a Ryzen 3800x a week ago because it was unclear if and when I would get the Ryzen 5900x I ordered. It ran with the included AMD Wraith Prism CPU cooler for one week without any problems.

- Today I installed a Ryzen 5900x and a Scythe Fuma 2 CPU cooler.
- After 20 min the first crash/restart with the following entries in the Event Viewer: WHEA-Logger ID 18 and critical error Kernel-Power ID 41.
- Happens irregularly again and again, sometimes after minutes, sometimes longer: Windows freezes for a few seconds and then the PC reboots. It doesn't matter whether the system is under load or not.
- CPU temperature between 30 and 40 °C
- Updated to BIOS and chipset driver mentioned above: Problem still exists
- XMP Profile disabled (RAM on 2600 MHz): problem still exists
- CMOS Reset: Problem still exists
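
For anyone who wants to pull the same two Event Viewer entries from a script, here is a minimal sketch. It only assumes the built-in Windows `wevtutil` tool and the provider names as they usually appear in the System log; the helper names (`build_xpath`, `build_command`, `recent_events`) are made up for illustration.

```python
import subprocess

# Provider names as they usually appear in the Windows System log (assumption).
WHEA_PROVIDER = "Microsoft-Windows-WHEA-Logger"
KERNEL_POWER_PROVIDER = "Microsoft-Windows-Kernel-Power"


def build_xpath(provider: str, event_id: int) -> str:
    """Return an XPath filter matching one provider / event ID pair."""
    return f"*[System[Provider[@Name='{provider}'] and (EventID={event_id})]]"


def build_command(provider: str, event_id: int, count: int = 10) -> list[str]:
    """Build a wevtutil call that prints the newest `count` matching events."""
    return [
        "wevtutil", "qe", "System",
        f"/q:{build_xpath(provider, event_id)}",
        f"/c:{count}",   # limit the number of events returned
        "/rd:true",      # reverse direction: newest first
        "/f:text",       # plain-text output instead of XML
    ]


def recent_events(provider: str, event_id: int, count: int = 10) -> str:
    """Run the query (Windows only) and return the raw text output."""
    result = subprocess.run(
        build_command(provider, event_id, count),
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

On the affected machine, `print(recent_events(WHEA_PROVIDER, 18))` and `print(recent_events(KERNEL_POWER_PROVIDER, 41))` should list the most recent entries of each kind, which makes it easier to compare timestamps across crashes.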

Either something is incompatible with the CPU, or the CPU is defective?
What to do? Really frustrating.

2 Solutions

I'm having a similar issue with an X570 Aorus and a 5600X. I get the same errors in Windows.

Disable CPB and PBO and run it at default settings (3.7 GHz and XMP on). That works for me.

I got a new angle on this. So deactivating PBO and CPB definitely works; the PC has been running stable for a week now. But you'll lose performance.

So I wrote to the MSI support and the AMD support.

MSI suggested increasing the DRAM voltage by 0.05 V, which I did. The system seems to be stable, with no crashes so far, neither at idle nor while gaming.

947 Replies

+1 actually. 
I freak out still reading advice to replace this/that/spend tons of dollars/disassemble walls in the house to check if aliens secretly installed a neutron emitter inside a brick to disturb your fine godlike AMD CPU. 
Months after it became obvious that the CPU is usually the problem. 

TO ANYONE HAVING THE PROBLEM:

1. Disable Core Performance Boost in the BIOS. This will limit your CPU to the base frequency and heavily cripple the performance. The problem “magically” disappeared? THE CPU IS BROKEN. 100%. Period. RMA it. 
2. Having difficulties with an RMA and want to live with this POS somehow?
- Turn CPB back on.
- Open Curve Optimizer and manually INCREASE the curve. Set the minimum offset at which your system becomes stable. It’s +8 to +10 for most POS AMD 5XXXx.

@tim716 wow don't let a PC issue bother you so much, haven't you already fixed yours by replacing your CPU?

There are many potential causes, and I've yet to see a statement from AMD about any known issues with the CPU. Even if there is an issue not yet disclosed, that doesn't mean everyone is affected by it. If you have evidence of this, please post a link.

There's always a failure rate with all electronics and some will need to be replaced, but without knowing how often this occurs you don't know if this is good or bad.

Also I wouldn't disable CPB or other performance settings, as you should be able to use those features. If the troubleshooting I posted on page 68 from AMD doesn't help you solve the issue then contact AMD or your local retailer and replace your CPU.

As for mine, I've solved it by setting Power Supply Idle Control to Typical. No performance degradation at all. It's likely caused by my old PSU, but if I rule that out I'll dig further.

There are many causes; I'm not going to jump to conclusions.

0 Likes

I’ve never said that any BSOD or reboot of any PC is always due to the CPU issue. But we’re talking about the specific issue mentioned in the name of the thread.

That’s why I wrote Paragraph 1. IF disabling CPB eliminates the problem - then the problem belongs to this thread and it IS 100% the CPU issue. 
IF not - then the culprit is probably something else, not the CPU. 
That’s called “differential diagnosis” in medicine.

As for me - it’s not so easy to RMA an OEM CPU in Russia, so I had to resort to #2.

Also I’d never have written with such confidence if I didn’t know that the advice above has helped dozens of people already.

0 Likes

@tim716 I can also solve mine by disabling CPB and other methods such as disabling C States. This doesn't mean I have the same cause.

This is a thread about a particular error, and I have that error. Based on this thread, it seems likely there are different causes and solutions; there's not enough data to point to a common cause.

This is why we're discussing it and trying different things in order to better understand the issue. This isn't medicine, it doesn't need to be that complicated.

The reason I posted AMD's troubleshooting on page 68 is that it matched the troubleshooting I would do and looked like it would help people. Everyone should start with AMD's troubleshooting and go from there; disabling CPB or C-states is merely a workaround.

Surely there has to be a replacement process in Russia? I can't imagine there not being one.

0 Likes

If you can solve your case by disabling CPB, then the next workaround should be sending that POS of a CPU back to the ones who made it. 
It might be a different cause, but it’s DEFINITELY the CPU cause because the CPB setting affects only the CPU and nothing else. And also because many people did these two steps, identified the issue, RMAd their CPUs, got the replacement and had no problems ever since.

These two steps might sound complicated, but this is much easier than spending hundreds and thousands of dollars on new PSUs, MBs, etc.

There is no AMD warranty for OEM CPUs, and the RMA process with a Russian seller looks like this: they hit the CPU with a hammer and then say “You have no warranty because you bent the CPU legs”

0 Likes

@tim716 I used to RMA OEM CPUs all the time. However, I never tried from the customer side; you might need to have your brick-and-mortar store do it for you, but they don't sound very helpful. I can't imagine the industry can survive like that.

I'm fairly experienced with PC troubleshooting and I'm not 100% sure it's the CPU in my case. It actually seems more likely to be the PSU, but I won't be able to rule it out without upgrading it. It's fine though, I wouldn't mind an upgrade and I have another system I can put my old one in, so it's not really going to cost me. Otherwise I'd borrow some parts or replace the easiest part. It's just a process of elimination.

As to why I don't RMA the CPU, with Power Supply Idle control set to Typical, the CPU actually runs really well without issue and performance is top notch. I don't really even need to replace my PSU except to confirm whether the issue was the PSU or not.

0 Likes

Another 5950x user here with crashes. Turned off performance boost (so basically throttling the CPU) and it's fixed the crashes. AMD need to put out a response about this. There seems to be loads of us with these issues. These aren't cheap CPUs. 

 

I'm running a ROG Strix Gaming E. My friend has an identical build and had the same issue. Turning off the performance boost in the BIOS fixed his too.

0 Likes

@joel232 follow the troubleshooting on page 68 from AMD; turning off CPB is only one workaround and not a fix. You may actually fix the issue with the steps on page 68. If you go to RMA you'll need to do those steps anyway.

Also don't assume there's some major issue, there could be, but we have no evidence to support that. The only info I can get is the failure rate is less than 2% which is well within the norm.

0 Likes

The evidence is this very thread 75 pages long and many others on Reddit and basically everywhere where they discuss CPUs.
I’ve never counted %s, but for me and any other victim here and in many other places it IS a major issue. I wasted $300 on an absolutely useless new PSU, and lots of people threw even more money away due to this issue. One fellow on Reddit spent over a THOUSAND USD replacing his whole PC part by part. 
Turning CPB off is neither a solution nor a workaround. It’s a TEST. Like a coronavirus swab. If it comes out positive (the issue disappears) - alas, you can skip all the steps on page 68 and any other steps. Your PC definitely has AMDosis, not any other hardware “disease”. 
For the record: I did ALL the steps from page 68 (and many other things too) before I even found out about CPB and Curve Optimizer. They were as useless as you’d expect. 
If AMD requires it you should do these Page 68 steps of course - they’re reasonable and there’s nothing difficult there. But if you have the “positive CPB” - be assured those steps won’t help.

0 Likes

Probably the essence of the issue should be explained. 

1. What is the problem?

Most of you have probably heard the terms “silicon lottery” and “binning”. They mean that chips aren’t born equal. Some of them are capable of running stably at higher frequencies, some less so. Some of the chips produced are inevitably unable to work even at the specified clocks. Such chips should normally be either thrown away or sold as cheaper, less capable products (with specs they are able to reach). 
Not with AMD. Many below-spec Ryzen 5xxx chips (I have no idea what percentage, but definitely not just 2%) have somehow reached the shelves. If you’re having the issue and your CPB-off test is positive, you’ve bought one of those below-spec CPUs. You’ve totally lost the lottery. Bad luck. Sorry. 
2. If so, why do the affected CPUs usually crash at idle and in low-load tasks, but remain rock stable under an all-core burn-in?

This is due to a peculiarity of how Core Performance Boost operates. 
The affected CPUs become unstable at the MAXIMUM frequency. This isn’t connected with TDP or other limits; the cores just crash because their frequency is too high for the given voltage. It is the same effect as when an overclocker undervolts his CPU too much. The difference is that your CPU has already been politely overclocked and undervolted for you by AMD. 
The thing is, the cores DO NOT work at maximum frequencies during heavy multi-core loads, since the limits kick in. So your CPU remains stable. 
But when your system is idle (with some background processes running) or under a low single-core load, the few active cores operate at the MAXIMUM frequency. This is where the crashes occur.

3. Why do Core Performance Boost off and Curve Optimizer “help”?

The first one is obvious. The cores remain at their base frequency with this setting. 
The Curve Optimizer is an overclocking tool, which reviewers use to undervolt their specially selected review CPUs to show the outstanding performance of new groundbreaking CPUs. 
You, a commoner, can use the same tool to OVERvolt your POS of a CPU to UNDERclock it and make it stable at the lesser performance this POS was born for. Yes, it works. 
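
The reasoning in point 2 can be put into a toy model. This is only a sketch of the argument, not a description of how the boost algorithm really works: every number in it is hypothetical, and the linear falloff of clock speed with the number of active cores is an assumption made for illustration.

```python
# Toy model only: every number below is hypothetical, not measured AMD data.

MAX_BOOST_MHZ = 4900   # clock with one active core (hypothetical)
ALL_CORE_MHZ = 4300    # clock once the power limits kick in (hypothetical)
TOTAL_CORES = 16


def boost_clock(active_cores: int) -> float:
    """Interpolate linearly from max boost (1 core) down to the all-core clock."""
    if not 1 <= active_cores <= TOTAL_CORES:
        raise ValueError("active_cores out of range")
    span = MAX_BOOST_MHZ - ALL_CORE_MHZ
    return MAX_BOOST_MHZ - span * (active_cores - 1) / (TOTAL_CORES - 1)


def is_stable(active_cores: int, highest_stable_mhz: float) -> bool:
    """Stable only if the requested clock does not exceed what the chip can hold."""
    return boost_clock(active_cores) <= highest_stable_mhz


# A chip that cannot quite hold its rated boost at the default voltage (hypothetical).
weak_chip_limit = 4800
for cores in (1, 2, 8, 16):
    print(cores, round(boost_clock(cores)), is_stable(cores, weak_chip_limit))
```

In this model the weak chip fails only with one or two active cores, which is the idle and light-load case, and passes an all-core burn-in because the clocks never get high enough to expose the problem.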

 

0 Likes

@tim716 I understand you're angry, but you really shouldn't assume so much. Your anger should be directed at the poor return system in Russia, not at the fact that a product fails from time to time, because they all do.

Have you even looked at the troubleshooting on page 68?

I know about the silicon lottery and everything else you mentioned, and much more; you don't need to talk about all that. You're focusing on only one potential issue, and while it might be the issue, there are several that can cause the same thing.

If you're not interested in trying to solve the problem then why are you here? Maybe just sell your system and buy Rocket Lake, you'll be much happier.

However, if you want help, I'll help if you ask. If you're dead set that it's your CPU, then just replace it and move on. This isn't worth stressing over; it's just a computer.

 

0 Likes

With all due respect, it's just a computer, but I have spent around $7,000 on this. If the CPU is faulty, which it is, AMD should instantly replace it. The fact that this thread is almost 80 pages long and counting shows it's a wider issue.

0 Likes

Yes, I obviously looked at the AMD steps on page 68, I mentioned it in the message above. 
As I said, I tried all these steps long ago. All this is very old news to me. Actually, most of the settings mentioned in the “steps” are already set on my PC as stated there; these are the correct settings. But they don’t help with the issue. 
I’ve been fighting the issue, conversing on forums and in chats, asking and learning since December. I tried TONS of things besides these page 68 steps, and only the things I’ve described above really work, both for me and MANY other people. I’m in no way the “author” of these, thanks again to the people who really discovered them. But I’m 100% sure of what I’m saying. 

0 Likes

Hello everyone,

I've been a long-term Intel user for 30 years, and when I saw all the amazing scores of the 5950X using PBO, it was time to move over to AMD.

I waited 3 weeks to finally get all the hardware, and I built a monster with a custom water loop (what a waste of expensive cooling, as it can't be used for 5950X OC) to keep it cold and get a stable OC using PBO, as that's where this CPU shines.

Asus Crosshair Vii Hero x570
G.Skill 64 GB 4x16 3600 14 15 15 15  
be quiet! straight power 11 platinum 850W

 

So I've read every single guide and video on how to do this properly. I started with a small negative curve offset to get the boost needed and slowly moved up. It ended up crashing at every single setting from -30 to +5 on the specific core that crashes. I restricted more and more settings until everything was on auto and at low speeds, and it still crashes no matter what. All those settings are a complete waste, as all they do is crash the computer without giving any OC through PBO. This is the common one:

A fatal hardware error has occurred.

Reported by component: Processor Core
Error Source: Machine Check Exception
Error Type: Bus/Interconnect Error
Processor APIC ID: 29   <---- this being the actual core that causes the crash; in my case it is 30 or 29 nine times out of ten. No matter what I set these cores to, from a manual -30 to +10 or auto, they still crash.
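
To turn the APIC ID from the event text above into a core number for Curve Optimizer, a small sketch like the following may help. It assumes SMT is enabled and that APIC IDs are contiguous with the lowest bit selecting the thread, which is the usual layout on a 5950X; other models may number their IDs differently, so check before trusting the result. The function names are made up for illustration.

```python
import re


def parse_whea_message(text: str) -> dict:
    """Extract 'Key: value' pairs from the text of a WHEA-Logger event."""
    fields = {}
    for line in text.splitlines():
        match = re.match(r"\s*([^:]+):\s*(.+?)\s*$", line)
        if match:
            fields[match.group(1).strip()] = match.group(2)
    return fields


def apic_to_core(apic_id: int, threads_per_core: int = 2) -> tuple[int, int]:
    """Map an APIC ID to (core index, thread index), assuming contiguous IDs."""
    return divmod(apic_id, threads_per_core)


sample = """A fatal hardware error has occurred.

Reported by component: Processor Core
Error Source: Machine Check Exception
Error Type: Bus/Interconnect Error
Processor APIC ID: 29
"""

fields = parse_whea_message(sample)
core, thread = apic_to_core(int(fields["Processor APIC ID"]))
print(fields["Error Type"], "-> core", core, "thread", thread)
```

Under these assumptions APIC ID 29 and APIC ID 30 belong to two different physical cores (zero-based cores 14 and 15), so both would need their own curve offset.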

 

I've tried every single suggestion in this 78-page thread, disabled C-state idle, and all kinds of other stuff.

I've tried downclocking the memory and loosening the timings.

My final conclusion, just like all you guys, is that PBO is broken on AMD. It's totally broken and sadly doesn't work. You are lucky if you can run for a few days, but sooner or later it blue-screens, with any of the cores killing the stability. 
And what's shocking is that it should be a SUPER EASY fix for AMD. It doesn't crash when I push it hard and stress it, which is when you'd expect too much OC to show; it crashes when I'm doing NOTHING, at idle, or watching a YouTube clip at most, just like many others say too. After hours of full stress it's rock solid, but as soon as I stop and it goes back to idle, it can crash within anything from 1 minute to 20 hours.

Outrageous. I wish I had found this thread before I bought AMD for the first time, as I see this has been an issue for a year!?

And no response from AMD.

 

So the only options are to run stock with PBO disabled and get half-decent performance, do a traditional all-core multiplier increase of a few hundred MHz depending on cooling, or go back to Intel.

 

I have to say I'm very disappointed. This was the first time in 30 years I moved from Intel to AMD, and I heavily regret it, as their selling point of awesome performance doesn't work and seems to have been broken since release. And even sadder, not a single word from AMD on this problem, which affects everyone who wants to use PBO.

 

0 Likes

Does the system work with PBO turned off and RAM set to the default speed and timings?

0 Likes

Yes, it works fine with stock CPU settings, with either stock RAM or XMP settings, and PBO disabled.

As soon as I enable PBO it gets the WHEA ID 18 core crashes within minutes to hours, no matter what setting, and I've tried hundreds over several days.

 

 

0 Likes

The question is whether anyone has a 100% stable PBO setup. From my research on this topic, I doubt it. From what I've seen, PBO users are fine with semi-stability and seem to be OK with either a crash every now and then or disabling things back to stock to make it stable.

I've read a lot of guides and introductions to PBO from AMD users and AMD themselves, and they end up saying that the tricky part is testing stability, as it's unstable at idle, and close by saying AMD should improve the algorithm to improve stability. This tells me it's not fully stable, never has been, and won't be until the PBO algorithm is fixed. AMD should have had a whole team on this since day one, since it seems to be pulling them straight down from their lead over Intel, and more and more people are going back to Intel or, like in my case, got hurt by a bad first experience with AMD. 

I'm on stock settings now and everything is rock solid. I built a super cooling rig to try to get some good performance using PBO, which is sadly broken.

 

0 Likes

I have a setup that is 100% stable using PBO and a 5950X.

 

What I did with my setup was enable PBO and then set the PPT/TDC/EDC parameters manually.  Normally these are set to 142W/95A/140A for a 105W processor when PBO is disabled.  When you turn on PBO, the only thing that changes is that those values are set to the motherboard levels.  Often, the motherboard limits are something ridiculous that you will never hit due to voltage constraints or temps anyway.  We have even had users who reported ASUS motherboards where the motherboard max TDC was higher than the EDC.  

If you aren't familiar, the TDC is maximum sustained amperage that is allowed through the VRMs, while the EDC is the transient boost amperage.  Having a sustained amperage higher than the boost amperage makes no sense, and can lead to all kinds of problems.
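
The consistency rule described here (sustained TDC should not exceed burst EDC) can be written down as a small check. This is just a sketch; the `PowerLimits` class and `check_limits` function are made-up names, and the only values taken from the post are the 142/95/140 stock limits.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PowerLimits:
    ppt_w: int   # package power limit, watts
    tdc_a: int   # sustained current limit, amps
    edc_a: int   # short-burst current limit, amps


STOCK_105W = PowerLimits(ppt_w=142, tdc_a=95, edc_a=140)  # values quoted in the post


def check_limits(limits: PowerLimits) -> list[str]:
    """Return warnings; an empty list means nothing looks inconsistent."""
    warnings = []
    if min(limits.ppt_w, limits.tdc_a, limits.edc_a) <= 0:
        warnings.append("all limits must be positive")
    if limits.tdc_a > limits.edc_a:
        warnings.append("TDC (sustained) is higher than EDC (burst)")
    return warnings
```

For example, `check_limits(STOCK_105W)` returns an empty list, while a combination such as `PowerLimits(200, 200, 150)` gets flagged because its sustained limit is above its burst limit.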

So what I did was install Ryzen Master on my PC with the latest chipset driver.  Turn on PBO and boot the system.  In Ryzen Master you can then see what your PPT/TDC/EDC limits are.  These are your board limits and the limits you should not exceed.  

 

Back in UEFI, set the PPT/TDC/EDC to 142/95/140, scalar 1X, no clock boost.  This setting should perform identically to PBO off.  Now slowly raise those settings until you are happy with the temps under load, the voltages, or start to see errors again.  I wound up at 215W/140A/160A on my 5950X.  I didn't encounter any errors here, but the chip was at 1.3V on an all core load and around 70C.  Raising the limits higher just rapidly increased the temps without any real boost in performance.  
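
The "start at stock and raise slowly" approach can be planned out ahead of time. The sketch below only lists candidate settings between the stock limits and the board limits; the board limits used here are hypothetical placeholders for whatever Ryzen Master reports on your system, and the number of steps is an arbitrary choice. You still have to test each step and stop once temps, voltages or errors become a problem.

```python
def ramp_steps(stock, board_max, steps=4):
    """Yield (PPT, TDC, EDC) tuples evenly spaced from stock to the board limits."""
    if steps < 1:
        raise ValueError("steps must be at least 1")
    for i in range(steps + 1):
        yield tuple(round(s + (b - s) * i / steps) for s, b in zip(stock, board_max))


STOCK = (142, 95, 140)        # from the post: PBO-off limits for a 105W part
BOARD_MAX = (300, 200, 220)   # hypothetical: read yours from Ryzen Master

for ppt, tdc, edc in ramp_steps(STOCK, BOARD_MAX):
    print(f"PPT={ppt} W  TDC={tdc} A  EDC={edc} A")
```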

After I had the limits set, then I added a core overclock of +100 and played around with the curve optimizer.  But I really got my wattage/amperage limits set first.

 

Give it a shot.  I feel just allowing the motherboard limits can be problematic sometimes. 

0 Likes

Hi!

In my case, so far I have replaced the mobo and PSU and get the same issue with the 5900X, but I think it is a GPU problem. The issue is similar, with a reboot and a WHEA error. I had this crash with the 5800X too, but that CPU works fine in another mobo. The 5900X passes all tests at the moment. I just need a few days to check the GPU in another system.

0 Likes

I did this manually and started at the default 142/95/140. Then I tried everything between that and the 200/200/150 that many suggest on forums, plus various other combinations. I've tried 0-150 MHz boost, and I've tried from -30 to +5 on the demanding (best) cores and everything in between, an insane amount of testing in steps of 5 at a time.

I've tried auto Vcore and I've tried 1.20-1.35.

I've tried all the power-saving settings, idle C-states, etc.

I've tried every single setting; it still crashes as soon as PBO is enabled, no matter what settings.

For scalar I've tried auto and 1-10.

I've used Ryzen Master to check the numbers and I've tried setting it manually there as well, but it doesn't control Curve Optimizer, so I can't do that there. That's what I'm doing in the BIOS.

I also tried to follow this proper guide for PBO: https://www.youtube.com/watch?v=dU5qLJqTSAc&t=1297s. I used their settings and tried to get it stable on any setting, without success. 
Does anyone following this guide get anywhere near stability with such a PBO boost? Because I sure don't, even with an insane cooling system.

Every guide I've followed is around 200/200/150.

I've also tried lowering the RAM to 3200 and the stock 3600 with loose timings, with lower and higher RAM voltage.

I've reset the BIOS, tried different versions and redone the settings.

 

I simply can't gain any performance on this chip, like everyone else in this long thread. I've gone through every page to see if someone has found a way to make PBO stable with some boost in performance, but I've only seen everyone come back saying that, sadly, it crashed with the new setting too. It looks like the CPU is already maxed out and there is no real room for OC in any form other than minor.

0 Likes

I'm not sure why they would be advising 200/200/150.

 

First off a PPT of 200W will be your limiting factor.  That is about what I hit with a TDC of 140.  Furthermore, setting EDC lower than TDC makes no sense at all.  Why would the short term boost amperage be lower than the sustained amperage?  200/150/200 would make more sense as that isn't too far off the PPT=215W, TDC=140A and EDC=160 amps that I tried.

 

"I've tried every single setting, it still crashes as soon as PBO is enabled no matter what settings."

So even if PBO is at 142/95/140 it crashes?  Make sure the PPT is 142, the TDC is 95 and the EDC is 140.  If it crashes there and not with PBO disabled that is interesting as the two should be exactly the same.

0 Likes

What do you set your negative curve to?

And do you manually set Vcore to 1.3 for the whole CPU?

What CPU-Z score do you get with 215/140/160? (Or Cinebench R23 or Geekbench 5, to get a score measurement.) 

What do you set the boost override CPU MHz to?

I just tried this setup and got a worse score than with stock settings. It hasn't crashed yet, but it's odd that both the multi and single scores are lower than stock for me using your setup.

0 Likes

I didn't set the Vcore manually, as doing that disables all boosting.  If you set Vcore or BCLK manually, it completely disables boosting and just runs in manual mode.

 

At 215/140/160 I observed my Vcore was at 1.3V when running Cinebench R23.  I don't really want the voltage going over 1.3V on an all-core load; with fewer threads it is okay to see 1.45V or so.  That's why I didn't keep raising the TDC, even though I probably had a little bit more thermal headroom.

 

Boost override is set to +100 MHz.

 

My Curve Optimizer is set to -5 for the two best cores on CCD0, and -10 for the remainder on CCD0.  Everything on CCD1 is set to -15.  Getting around 29000 in multicore with Cinebench R23.

[Screenshot attachment: ajlueke_0-1621602785047.png]

 

 

0 Likes

@joel232 It's just a computer, there's a lot worse that can happen in life. AMD will replace your CPU if you ask them just contact them or your local retailer whatever is easier.

@tim716 I'm glad you went through those steps (edit: sorry, going through old notes today I noticed we'd spoken about this before; it's been too long and I haven't had enough time to devote to this, sorry I forgot). It is easy to miss something, but if you're confident then you should replace your CPU too. Surely you can contact AMD and RMA with them at least, or the place where you got it from?

2% of 1 Million is 20,000 people and they've sold a lot more than that, how can you be so confident?

This doesn't bother me much because I know it's normal to get faulty parts sometimes when building PCs and I'm well used to the process. My PC is actually running at full spec although using a bit more power at idle. I will replace the CPU though if a new PSU doesn't let it powersave as well as it's supposed to. Also I've seen a lot worse in life, recently.

0 Likes

“AMD will replace your CPU if you ask them just contact them or your local retailer whatever is easier.”

https://www.amd.com/en/support/kb/warranty-information/oem

AMD doesn’t agree with you, strange eh?

And resellers are, well, different. In my case “Replace your CPU” means “Throw your CPU away and buy a new one, reward AMD for their crap”. Thanks for the advice.

As for “2%”, where did you get it from? Why not 20% or 0.02%? I don’t state any percentage since I don’t know. If you know - can I see the source please?

 

@tim716 I'm sorry you have such shady retailers over there

Will the retailer not even start the RMA process on your behalf?

The less than 2% I got from Hardware Unboxed when they were asking Australian retailers. I saw other figures that were lower but if you Google it the industry average is 1 to 3% and 2% was in the middle so sounds reasonable.

It'll take time for more accurate figures to come out, but often once it hits 5% or 6% it becomes newsworthy.

Just Google 'ryzen 5000 failure rate' and you'll get a bunch of results all in similar ranges.

0 Likes

That guy Zin... I think he is paid by AMD to be here.  I suspect he's the same guy from four months ago, and he changed his name.  Yeah. I said it. I think he's a paid employee.

Maybe he is, maybe not. Yet all he is saying is to just RMA it if the troubleshooting posted here did not help, which is actually what anyone should do.

@rumple the world is not flat.

Don't be stupid, I have not changed my name. Everything I posted is good troubleshooting steps you should have already done yourself. I posted AMD's list because it was easier than writing it up myself and better than what I would do. Anyone who has worked in the industry building PCs would know this stuff.

I thought people were here to troubleshoot and try to find the problem, but have they all left?

Just use some common sense and don't jump to conclusions.

0 Likes

Funnily enough, disabling CPB on my Ryzen 3950X has made it way more unstable, to the point that it only works better with CPB enabled. I don't get it: I ran 3DMark and some games with no problems. It just restarts randomly, very randomly, with no blue screen since the BIOS update (I had a CRITICAL STRUCTURE CORRUPTION error before I updated the BIOS). Now I am doing my 3D texturing work with no issues so far. I am just confused. I gave Ryzen and Gigabyte one chance and they spat right in my face... 

0 Likes

@NewAMDGuy I'm not a fan of Gigabyte due to having issues with them in the past, but I was just unlucky; my friends have no issues.

I've built hundreds of PCs over the years. Sometimes you encounter issues, you see them from all companies, and sometimes you get bad batches.

I like this stuff and I like getting to the bottom of it, but it's not for everyone. I could have replaced my CPU months ago, but that may not fix my issue. However, that is one thing ruled out if you want to go that way.

I'd skip the CPB advice and start with the AMD steps on page 68. I'd already completed pretty much all of them in my own troubleshooting, and they're pretty decent things to check. If that doesn't help, you're pretty much down to hardware issues, which means replacing parts or the entire system.

You shouldn't need to mess with how the system auto boosts, that should just work on default settings.

Depending on your circumstances it might be better to have a professional solve it for you. Can you have the store you bought it from sort it for you?

0 Likes

Do you have any records in this folder?
"C:\Windows\LiveKernelReports\"

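A quick way to answer this without browsing the folder by hand is to list any dump files under it. This is a minimal sketch; `list_dumps` is a made-up helper name, and it simply assumes that the reports are stored as `.dmp` files somewhere below that folder.

```python
from pathlib import Path

LIVE_KERNEL_REPORTS = Path(r"C:\Windows\LiveKernelReports")


def list_dumps(folder: Path = LIVE_KERNEL_REPORTS) -> list[tuple[str, int]]:
    """Return (relative path, size in bytes) for every .dmp file under `folder`."""
    if not folder.is_dir():
        return []
    return sorted(
        (str(p.relative_to(folder)), p.stat().st_size)
        for p in folder.rglob("*.dmp")
    )
```

Calling `list_dumps()` on the affected PC returns an empty list if there are no records; otherwise the subfolder names in the relative paths show which component filed each report.
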
[ AMD Ryzen 7 5700X (step B2) | CM MASTERLIQUID PL360 FLUX | MSI MPG B550 GAMING EDGE WIFI - 7C91v1D | 32GB DDR4 3600MHz XPG SPECTRIX | ASUS Dual GeForce RTX 3060 OC Ed 12GB GDDR6 ]
0 Likes

Something I have seen a lot on forums, since November 2020, are several users with 3600MHz memories or more experiencing this problem. Some adjust the speed down or up and the problem goes away.

Take a look here, a post I picked up on adjustments.

5950x + x570 godlike, WHEA CPU Bus/interconnect errors when FCLK > 1600 | MSI Global English Forum -...

0 Likes

"Something I have seen a lot on forums, since November 2020, are several users with 3600MHz memories or more experiencing this problem. Some adjust the speed down or up and the problem goes away.

Take a look here, a post I picked up on adjustments.

5950x + x570 godlike, WHEA CPU Bus/interconnect errors when FCLK > 1600 | MSI Global English Forum -..."

 

I am not sure that you fully understand these errors. The WHEA bus interconnect errors (19) are a completely different thing to the WHEA 18 cache hierarchy error.

0 Likes

Hi Guys, I just wanted to share my experience with this WHEA-Logger ID 18 error and how I solved it without losing performance.

Like most of the people here, I was suffering the crash with everything at stock settings. My PC components are:

Ryzen 5800x

Asus Rog Strix B550-F Gaming (latest beta bios 1801)

DDR4 Crucial Ballistix 2x8GB 3600MHz (running FCLK at 1800MHz 1:1)

The crash was occurring only when playing a game. In Hitman 3, for instance, I never got to complete a mission without a restart or black screen (I had to restart manually when that happened); the crash always occurred within 30-45 minutes of gaming. It never crashed on the Windows desktop or when opening programs, only when gaming, and not even when running benchmarks such as 3DMark or Prime95.

I started troubleshooting following some tips given in this thread:

- Disabling PBO

- Disabling Core Performance Boost

This worked, no more restarts or black screens, BUT the big downside is that the 5800X will be running at its base frequency, which is 3.8 GHz, so I wanted a solution that balanced performance and stability.

I started by:

- First run: Disabling PBO and Core Performance Boost. This worked, but performance is impacted because of running at 3.8 GHz.

- Second run: Disabling PBO only. Didn't work.

- Third run: Disabling PBO + disabling Global C-states + Power Supply Idle Control: Typical + increasing EDC to 200A. Didn't work.

- Fourth run: Disabling PBO + increasing CPU Load Line Calibration to Level 5, which is the max value in the ASUS BIOS + giving the RAM 1.36 V (default is 1.35 V). These settings worked: last night I ran Hitman and finished a complete mission without any restarts or black screens, about 2 hours. I know that just disabling PBO didn't work in the second run, but maybe that setting plus the increased CPU load line calibration added the voltage stability that the CPU needs.

Possibly my CPU requires a certain voltage to hold its default clocks, and because of vdroop I was getting instability at specific frequencies, which was causing those restarts or black screens. The BIOS Auto setting for CPU load line calibration was not helping to meet those voltage requirements.

I don't believe it was the 1.36 V given to the RAM, as I ran 3 hours of the Karhu RAM Test (I had to buy a license) and it was stable at 1.35 V in Windows. I've heard that this RAM benchmark is very good for testing RAM stability.

Now I need to see which setting or combination made the difference so I can apply just that one, but I believe the CPU Load Line Calibration at Level 5 (max value) is the main solution.

Try those settings and see if they work for you; I think this can apply to any Zen 3 CPU.

3Dfx, your solution did not help in my case

 

Try the benchmark in Shadow of the Tomb Raider:

1. Start the benchmark 

2. Let it run for about 20 seconds

3. Press Esc 

4. Wait until the benchmark results appear

5. Wait ~5-10 seconds

6. If no crash, go to 1

That was the fastest and most reliable way to reproduce the crash on my build.

"Increasing CPU Load Line calibration to Level 5" - this did not help me. Only PBO +5 for all cores fixed the issue for me, but then I got 1.3 V instead of 1.1 V under multi-core load and +20 degrees in temperature...

The issue usually occurs after switching from heavy load to light load. See also 
https://www.overclock.net/threads/replaced-3950x-with-5950x-whea-and-reboots.1774627/page-45#post-28...

0 Likes

I've been struggling with this 5800X issue for almost a month now. Yesterday I found that limiting the clock to 4400 and the Vcore to 1.35 helps; it never crashed for 5 hours. Then I searched further and found this thread, and I'm shocked that tons of people are experiencing this, even with the latest BIOS, which I have flashed on my X570 Carbon WiFi. Your load line calibration tip is a big help. I've been stable for the first hours of using it. This profile needs more testing, but thanks a lot. BTW, this chip was not supposed to behave like this. It should be stable at stock; I paid around $500 here in my country just to get this CPU.

0 Likes

Same problem here. I'm waiting for a fix from AMD; otherwise I will switch to the competition. It is not acceptable, after paying a lot of money for this chip, to not be able to work with it. After the blue screen, I lost all my work. 

0 Likes

Any word from AMD on all this? Or are they going to let us keep juggling with something that is supposed to work without problems from the start... 

0 Likes

Are all you guys with WHEA reboots running BIOS with AGESA 1.1.9.0 or 1.2.0.0? They fixed the problem for me.

0 Likes