CrispyCrunch
Adept II

Ryzen 5900x: System constantly crashing/restarting WHEA-Logger ID 18 and critical error Kernel-Power

Mainboard: MSI x570 Unify
Mainboard-BIOS: 7C35vA82 (Beta version)
CPU: Ryzen 5900x
RAM: Crucial Ballistix BL2K32G36C16U4B 3600 MHz, 64GB (32GB x2)
Drive: M.2 Samsung 970 Evo+ 1TB SSD
Graphics: SAPPHIRE Nitro+ Radeon RX 5700 XT
PSU: be quiet! Straight Power 11 750W Platinum
OS: Win 10 Pro (64bit) - all updates installed
Chipset driver: 2.9.28.509 (released 2020-11-09)

I first assembled the PC with a Ryzen 3800x a week ago because it was unclear if and when I would get the Ryzen 5900x I ordered. It ran with the included AMD Wraith Prism CPU cooler for one week without any problems.

- Today I installed a Ryzen 5900x and a Scythe Fuma 2 CPU cooler.
- After 20 min the first crash/restart with the following entries in the Event Viewer: WHEA-Logger ID 18 and critical error Kernel-Power ID 41.
- Happens irregularly again and again, sometimes after minutes, sometimes longer: Windows freezes for a few seconds and then the PC reboots. It doesn't matter whether the system is under load or not.
- CPU temperature between 30 and 40 °C
- Updated to BIOS and chipset driver mentioned above: Problem still exists
- XMP Profile disabled (RAM on 2600 MHz): problem still exists
- CMOS Reset: Problem still exists

Is there a compatibility problem between some component and the CPU, or is the CPU defective?
What should I do? Really frustrating.

2 Solutions

I'm having a similar issue, x570 aorus and 5600x. I have the same errors in Windows.

Disable CPB and PBO and run it at default settings (3.7 GHz and XMP on). That works for me.

I got a new angle on this. So deactivating PBO and CPB definitely works; the PC has been running stable for a week now. But you'll lose performance.

So I wrote to the MSI support and the AMD support.

MSI suggested increasing the DRAM voltage by 0.05 V, which I did. The system seems to be stable, with no crashes so far, neither at idle nor while gaming.

947 Replies

I had tons of WHEA errors and reboots on BIOS with AGESA <1.1.9.0 without HWinfo running. That may be a contributing factor but not the sole one.

Just to try something else, I swapped the PSU yesterday for a known-working one from another PC, and I got another WHEA 5 minutes ago... I can't believe this. I have tried everything. I have not tried to RMA the CPU yet because I cannot guarantee that it is the problem, and the store has no more units. I cannot be sure of that or anything. This is the most frustrating thing that has happened to me in more than 30 years of building my own computers (all of them Intel).

Therein lies the severity of this predicament. Parts are scarce, the pandemic has complicated everything. Most of us have no way to quickly remedy our situation for our computers that we rely on for work and communication. I feel like most people in our shoes are probably nervously waiting for a bios and/or driver update to solve their problems.

5900x | Asus TUF X570-PRO | 32GB GSkill DDR4 | EVGA 3090 FTW

I just got a WHEA error. I deliberately left the system idle, and when it tried to wake up, POW! WHEA.

But analyzing it, I identified that in my case it may be happening because of the NIC. Strange.

Log Name: System
Source: Microsoft-Windows-Kernel-Power
Date: 01/02/2021 22:20:14
Event ID: 172
Task Category: (203)
Level: Information
Keywords: (1024), (4)
User: SYSTEM
Computer: AV...
Description:
Connectivity state in standby: Disconnected, Reason: NIC compliance
Event Xml:
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
<System>
<Provider Name="Microsoft-Windows-Kernel-Power" Guid="{331c3b3a-2005-44c2-ac5e-77220c37d6b4}" />
<EventID>172</EventID>
<Version>0</Version>
<Level>4</Level>
<Task>203</Task>
<Opcode>0</Opcode>
<Keywords>0x8000000000000404</Keywords>
<TimeCreated SystemTime="2021-02-02T01:20:14.2964689Z" />
<EventRecordID>25434</EventRecordID>
<Correlation />
<Execution ProcessID="4" ThreadID="600" />
<Channel>System</Channel>
<Computer>AVALON</Computer>
<Security UserID="S-1-5-18" />
</System>
<EventData>
<Data Name="State">2</Data>
<Data Name="Reason">6</Data>
</EventData>
</Event>

I will work on what I have achieved now by analyzing the logs. I also suggest analyzing your event logs for information right after the Critical event, as I found out exactly that the error occurred with my NIC. And I see that there is a new driver update that Dragon Center has not yet identified in the MSI driver database.
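For anyone who wants to compare the events logged around a crash, here is a minimal Python sketch that pulls the key fields out of one event's XML. It assumes you have copied the XML text of an event out of Event Viewer; the `summarize_event` helper and the trimmed `SAMPLE` string are illustrative names made up for this example, and the sample only reuses values from the event posted above.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the event XML posted above.
NS = {"e": "http://schemas.microsoft.com/win/2004/08/events/event"}

# Trimmed copy of the Kernel-Power event from this post (illustration only).
SAMPLE = """<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <System>
    <Provider Name="Microsoft-Windows-Kernel-Power" />
    <EventID>172</EventID>
    <TimeCreated SystemTime="2021-02-02T01:20:14.2964689Z" />
  </System>
  <EventData>
    <Data Name="State">2</Data>
    <Data Name="Reason">6</Data>
  </EventData>
</Event>"""


def summarize_event(xml_text: str) -> dict:
    """Return provider, event ID, timestamp and named data fields of one event."""
    root = ET.fromstring(xml_text)
    system = root.find("e:System", NS)
    # Collect every <Data Name="..."> entry under <EventData>.
    data = {
        d.get("Name"): (d.text or "").strip()
        for d in root.findall("e:EventData/e:Data", NS)
    }
    return {
        "provider": system.find("e:Provider", NS).get("Name"),
        "event_id": int(system.find("e:EventID", NS).text),
        "time": system.find("e:TimeCreated", NS).get("SystemTime"),
        "data": data,
    }


if __name__ == "__main__":
    print(summarize_event(SAMPLE))
```

Running it over the events just before and after a Kernel-Power 41 or WHEA-Logger 18 entry makes it easier to see which provider was active right before the crash.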

New Adrenalin 21.2.1 drivers are out. I will also test them. This problem identification caught my attention: "AMD is currently investigating end user reports that Radeon Software may sometimes have higher than expected CPU utilization, even when a system is at idle. Users who are experiencing this issue are encouraged to file a bug report in Radeon Software.".

[ AMD Ryzen 9 5900X | XPG LEVANTE 240 | MSI MPG B550 GAMING EDGE WIFI (MS-7C91) - 7C91v16 | 32GB DDR4 3600MHz XPG SPECTRIX | EVGA GEFORCE GTX 1650 SUPER 4GB | XPG CORE REACTOR 850W | SSD 970 EVO NVMe M.2 250GB ]

"I still cannot boot with any FCLK setting over 1866. Anything 1900 and up I get WHEA errors, bus/interconnect.   I've tried everything"

You are lucky - my CPU has these errors on 1600.

"5900x in an MSI MEG ACE X570, running all BIOS including AGESA 1.2.0.0 beta.  I still cannot boot with any FCLK setting over 1866. Anything 1900 and up I get WHEA errors, bus/interconnect.   I've tried everything.  The CPUs are not performing up to my expectations.   To put this in perspective:  I have over 25 years as a paid IT professional.  People like the US Treasury paid me to run their servers, in the rooms and basements that don't exist.   I've built over 4000 computers, and countless servers.   I've never met a problem I couldn't fix, until now.  So I've initiated an RMA claim.  Again, to put this in perspective... 12 years ago, I built an Intel I7 920 stepping D0 computer, it overclocked by 55%, ON AIR, and it's still running at almost 4ghz today, not the original 2.66ghz at which it shipped."

 

@rumple Do you realize OC isn't supported or guaranteed by manufacturers?

The official memory (and IF) frequency spec for Ryzen 5xxx is up to 3200 (1600).

Try not using an MSI board. I have used Gigabyte boards for over 20 years with no hassles. My first ASRock experience was very bad, but I still use it now that all the bugs are worked out. Here's what you can expect from that 5900X: AMD Ryzen 9 5900X Review | PCMag

For this explainer, FCLK will be referred to as "IF". It's extremely hard to find a Ryzen 5000 series chip that can run at 1900 or 2000 IF, even with the latest BIOS patches. Perhaps, sadly, 1800 IF is the legitimate ceiling for the entire Ryzen series. Mine is a 3600X, and 1800 IF is its limit, even though AMD advertises 3733MHz RAM as the "sweet spot", which means 1866MHz IF. Going that high, mine runs fine for a while, then crashes start, with error 18 being the most frequent. I dropped back to 3600MHz RAM with custom timings/sub-timings and set the IF to 1800, and that was the end of the issues.

Now we know about the servers in the basement of the US Treasury Dept. Great job on national security! LMAO!! Comparing those servers running a totally different OS, ECC RAM, multiple CPU boards, is apples to eggs. They run special Intel CPU's with the "ME" removed (now anyway) and everything is held to a much higher spec than what you're building here.

Why you can't "fix" this issue may well boil down to your many years in the Intel field (no pun intended) and little AMD experience. My background is 20+ years building PCs for myself, mainstream users, Fortune 500 corporate offices, and a few government contractor companies. Most were AMD builds; in fact only 5 were not. I've built in the hundreds, not the 4000+ area, but the concept becomes the same over time. Anyway, AMD is a unique "experience" for those that like to tinker. For the most part, AMD is not click-n-go.

Your expectations are a bit high and exceed what AMD officially supports. Some will hit 1900 on the IF or 2000, not many. If you can get to 1866 IF and 3733 RAM, you're more than golden. Run some tests at that speed, especially if you can get the RAM to run at CL 16 and a Trfc of around 288 or 298. The scores will likely beat running at the higher IF you're chasing.

I also assume you have DDR4 4000 or 3800 to even try booting at over 1866. Because with AMD the IF must be half the RAM speed to work properly. Hypothetically trying to run DDR4 3200 and booting to 1866 IF should result in a non boot. Going with DDR4 4000 and 2000 IF is technically possible but every chip is not equal and I would consider that to be the "silicon lottery" of chips. DDR4 3800 and 1900 IF is another "unicorn" but might be easier to achieve with some other CPU/voltage tweaks.

Lastly, running faster RAM to slower IF results in decoupling from the nice 1:1 ratio we all are after and will lower single core test results. However, if you run fast enough RAM or your application needs that higher frequency and doesn't use much in the single thread area, you can have some performance gains. 
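To make the "IF must be half the RAM speed" rule concrete, here is a small Python sketch. The function names are made up for illustration, and it only encodes the arithmetic described in this post, not any official AMD tool or limit.

```python
def fclk_for_1to1(ddr_rate: int) -> int:
    """Infinity Fabric clock (MHz) that pairs 1:1 with a DDR4 data rate."""
    # DDR memory transfers data twice per clock, so the real memory clock
    # is half the advertised rate; 1:1 mode wants the IF at that clock.
    return ddr_rate // 2


def is_coupled(ddr_rate: int, fclk: int) -> bool:
    """True when the IF clock matches the memory clock (the 1:1 ratio)."""
    return fclk == fclk_for_1to1(ddr_rate)


if __name__ == "__main__":
    for rate in (3200, 3600, 3733, 3800, 4000):
        print(f"DDR4-{rate} -> IF {fclk_for_1to1(rate)} MHz for 1:1")
    # DDR4-4000 with the IF held at 1800 is the decoupled case.
    print(is_coupled(4000, 1800))
```

So DDR4 3600 pairs with 1800 IF and DDR4 3733 with 1866 IF, while DDR4 4000 needs 2000 IF to stay coupled, which is the part that few chips manage.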

At the end of the day, if you can run DDR4 3600 at CL 16 with decent custom timings, your IF at 1800, PBO +200 on, scalar 10X, SOC set to whatever works on the 5000 series (1.10v on the 3600X), and you're running error free, you've won! If you drop the SAM support and can go back to a non-beta BIOS like 7C35v1A, you might be able to get to the higher IF. Might... Otherwise, RMA over and over, and good luck finding that 1900-2000 IF. 1866 should be doable with RAM at 3733; results will vary. As someone stated, AMD supports RAM of 3200MHz for the 5900X, so 1600 IF is your "guaranteed" max that AMD feels is "ok".

Hopefully some of this info helps and there's no mysterious black van out front of your house after the big reveal, lol.

"It worked before you broke it!"

Got new chipset drivers 2.11.26.106 from Asus for my TUF x570 Pro today. Nothing new posted on AMD website as I post this. Typically I have several WHEA crashes a day on my system. Going to leave everything the same and see if these new drivers affect the stability at all.

5900x | Asus TUF X570-PRO | 32GB GSkill DDR4 | EVGA 3090 FTW

That didn't take long. Got a WHEA bluescreen while watching YouTube.

5900x | Asus TUF X570-PRO | 32GB GSkill DDR4 | EVGA 3090 FTW

Dear all,

Follow this link. I'm there too.
https://forum-en.msi.com/index.php?threads/amd-ryzen-memory-support.283351/page-49#post-2040613

They found a fix that solved two cases. It is still being analyzed, and I have applied it here as well: I set SOC Voltage to 1.150v. Let's see how it goes from now on. I'll give it a few days and then report back. If it works, that must have been the cause. I have two tickets open with MSI, and this could shed some light for them. I have already shared this thread with them, along with another one from AMD.

[ AMD Ryzen 9 5900X | XPG LEVANTE 240 | MSI MPG B550 GAMING EDGE WIFI (MS-7C91) - 7C91v16 | 32GB DDR4 3600MHz XPG SPECTRIX | EVGA GEFORCE GTX 1650 SUPER 4GB | XPG CORE REACTOR 850W | SSD 970 EVO NVMe M.2 250GB ]

Well, it didn't work for me.
Looped.

I am getting a new power supply. I suspect it could be the culprit. Stay tuned!

[ AMD Ryzen 9 5900X | XPG LEVANTE 240 | MSI MPG B550 GAMING EDGE WIFI (MS-7C91) - 7C91v16 | 32GB DDR4 3600MHz XPG SPECTRIX | EVGA GEFORCE GTX 1650 SUPER 4GB | XPG CORE REACTOR 850W | SSD 970 EVO NVMe M.2 250GB ]

Unfortunately SOC Voltage 1.150v did not work for me.
It seems to be something with the power supply itself.
I'm getting an XPG CORE REACTOR 850W.
Waiting for it to arrive.

[ AMD Ryzen 9 5900X | XPG LEVANTE 240 | MSI MPG B550 GAMING EDGE WIFI (MS-7C91) - 7C91v16 | 32GB DDR4 3600MHz XPG SPECTRIX | EVGA GEFORCE GTX 1650 SUPER 4GB | XPG CORE REACTOR 850W | SSD 970 EVO NVMe M.2 250GB ]

Your sig shows you running a 3700X. That is best at 1.10v SOC. I'd test the ADATA RAM thoroughly to 800% coverage with: MemTest Manual (hcidesign.com) 

Also try running just 2 RAM sticks, one each in A2 and B2. Ryzens tend to not like over 3200MHz with 4 slots filled on certain boards. It has to do with board topology: "daisy chain" (which most MSI boards are) or "T", which tends to work best with all 4 slots full at any speed, since the slots do not share RAM lanes going to the CPU. If you're still having issues with a 5900X and error 18 specifically, RMA the CPU. Many have needed to RMA the 5900X for this reason and you may be fighting a lost cause.

"It worked before you broke it!"

Am I out of the loop? Has AMD stated somewhere that the 18 errors are only fixable by replacing the CPU? I was trying to wait until a bios update but should I just start an RMA?

5900x | Asus TUF X570-PRO | 32GB GSkill DDR4 | EVGA 3090 FTW

If your RAM is under 4000MHz, you're not trying to get over 1800MHz on the IF in BIOS, your cooler is properly seated, and you're at your wits' end, then yes, start the RMA for the CPU.

It depends a little on the board and your RAM config too. If you're populating all 4 slots, try just A2/B2 and see if it works; that eliminates a board or RAM issue. If you have a 4000MHz or higher RAM kit, try manually setting it to 3600MHz and manually set the IF to 1800MHz. It's a bit of a unicorn to get the 5000 series to run 4000MHz RAM, or 3800MHz for that matter. Running 4 RAM sticks at more than 3200MHz seems to be a persistent issue on some boards.

My 3600X had an error 18 problem that I seemed to fix by lowering my RAM/IF speed from 3733MHz/1866MHz to 3600MHz/1800MHz. The problem took a few months to develop for whatever reason. I know mine is a 3000 series, but the same problems it had seem to be continuing into the 5000 series. AMD claiming that the 5900X can run 2000MHz IF, for example, is total nonsense. You're lucky to get past 1866MHz, and attempting to get there will most likely trigger the error 18.

There's no clear answer in some cases but RMA has worked for some. Like I said, if you tried everything it's probably time to get the RMA going or return it to vendor for an exchange.

"It worked before you broke it!"

When I first built my system I tried the 4000/2000 configuration and it wouldn't even boot. I immediately set it to 3600/1800 and was able to run and install Windows. I had a lot of random WHEAs with those settings, but after troubleshooting and following recommendations I have been running at 3200/1600 with DOCP off and manually set timings. It's been much more stable, but I still get a WHEA crash from time to time.

5900x | Asus TUF X570-PRO | 32GB GSkill DDR4 | EVGA 3090 FTW

@Raziels_Lament 

Did you try using DRAM Calc to find the best timings? You might want to just get a set of DDR4 3600 (or 3200, since with 4 sticks it might work best at 1600 IF), set the IF to 1800, and call it a day. My TeamForce Extreeme Gaming DDR4 3733 was able to downclock to CL 16 at 3600MHz with much lower sub-timings than out of the box, but not all memory is equal. Mine is also Samsung B-die. You can try turning Gear Down off or on as well to see if that helps. Wrong tRFC settings will trip error 18 as well. Here are the equations to figure out the tRFC and tRC:

tRFC = tRC x 6 or 8 (6 is usually best, 8 is looser), tRFC2 = tRFC / 1.346, tRFC4 = tRFC / 1.625, tRC = tRAS + tRP, tWR = CL (or less)

Those can help dial in the RAM. CL 16 is pretty solid for a latency for most RAM. Basically the first 4 numbers can be 16, unless your RAM starts out at less than 16. DRAM Calc is the best starting point really. DRAM Calculator for Ryzen (v1.7.3) Download | TechPowerUp
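The rule-of-thumb equations above can be worked through with a short Python sketch. The function name, the example tRAS/tRP values, and the rounding to the nearest whole number are assumptions added for illustration; the divisors and multipliers are the ones given in this post, not values from DRAM Calc or a datasheet.

```python
def derive_timings(tras: int, trp: int, multiplier: int = 6) -> dict:
    """Apply the rule-of-thumb equations from this post to tRAS and tRP."""
    trc = tras + trp                # tRC = tRAS + tRP
    trfc = trc * multiplier         # tRFC = tRC x 6 (or 8 for looser)
    return {
        "tRC": trc,
        "tRFC": trfc,
        "tRFC2": round(trfc / 1.346),  # rounding is an assumption here
        "tRFC4": round(trfc / 1.625),
    }


if __name__ == "__main__":
    # Hypothetical starting point: tRAS 36 and tRP 16.
    print(derive_timings(36, 16))
    # Looser variant using the x8 multiplier.
    print(derive_timings(36, 16, multiplier=8))
```

Treat the results as a starting point to enter in BIOS and then verify with a memory test, since not every kit will tolerate the tighter x6 values.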

Then test to 800% coverage with this: MemTest Manual (hcidesign.com) I bought the Deluxe to run it off DVD prior to boot and it unlocks the entire test for $14 vs. $44 for Memtest86

I've had GSkill sticks go bad or fail to test out fully, and found them unstable in enough builds that I no longer use them. Most of the time they use cheap Hynix chips; sometimes they use "ok" Micron, but nothing "stellar". Especially for AMD, I've found brands like TeamForce/Group, Crucial Ballistix, and Corsair to be more reliable/compatible.

There are some guides for tuning the voltages for the 5900X as well. VDDG is one setting that can improve stability. Much of what made the 3000 series work still applies to the 5000 series. The voltage setting the OP mentions is OK because the CPU will limit its voltage intake from the socket. Usually, leave Vcore alone.

So try some of that before RMA of the CPU because of that DDR4 4000 issue, a different RAM kit might fix it if this RAM can't scale down properly. 4 sticks for 32GB might be causing an issue with frequency too high and RAM timing will be much different using 4 sticks as well. Some info on 4 Vs. 2 stick thing: AMD Ryzen: 4 vs. 2 Sticks of RAM on R5 5600X for Up to 10% Better Performance - YouTube 

"It worked before you broke it!"

I will try using the DRAM Calculator next time I get a chance. My GSkill modules use Samsung chips, and yes, I have 4 sticks of single-rank memory, so I should be good if I understood Steve. I started the RMA process because, well, I kinda just ran out of patience, but I will try these suggestions in the meantime.

5900x | Asus TUF X570-PRO | 32GB GSkill DDR4 | EVGA 3090 FTW

dram.png

Steve's a smart guy most of the time. He admits to sucking at memory settings and hates doing them. I hated it until I learned it; now it can be time-consuming, but I don't BSOD or fail to boot very often while adjusting RAM. Single rank is faster and generally more stable. There are other versions of Samsung chips as well; not all are B-die. Also, in DRAM Calc, use the A0/B0 PCB setting to calculate timings. See the photo; that's what the base screen looks like. You input details like the CPU socket (AM4) and RAM speed (desired or actual), set the PCB to A0/B0, and calculate "safe". "Fast" is OK if it works. Use the lower Trfc it gives you to calculate your other 2 Trfc's in BIOS.

"It worked before you broke it!"

OK. So I used the calculator to get timings for 3200 using the safe setting. I input all the recommended settings in the BIOS and used the formula you previously posted to get the other two Trfc's. I will run these settings against my normal day-to-day activity and see what kind of stability they offer me. I still feel that the average system builder should be able to use QVL RAM and expect auto or DOCP settings to work without causing system instability.

5900x | Asus TUF X570-PRO | 32GB GSkill DDR4 | EVGA 3090 FTW

Yes, we all do. It seems AMD likes to make "enigmas". I've been building PC's for over 20 years, had my own business, mostly AMD builds too. Always a learning curve but then Ryzen came along....

"It worked before you broke it!"

I have tried everything suggested here, and nothing made a difference. I'm just waiting for the power supply to arrive before I take a position on this. This is ridiculous.

[ AMD Ryzen 9 5900X | XPG LEVANTE 240 | MSI MPG B550 GAMING EDGE WIFI (MS-7C91) - 7C91v16 | 32GB DDR4 3600MHz XPG SPECTRIX | EVGA GEFORCE GTX 1650 SUPER 4GB | XPG CORE REACTOR 850W | SSD 970 EVO NVMe M.2 250GB ]

mtavel's post is basically my experience with a 2600X, just change their "30" to my "20" years of PC building, and the error that I'm getting, which is different, but in the end it seems that it is an underlying issue with Ryzen in general.

I've had around 36 hours of apparent stability, couldn't manage to get it to give me the same error again, until it just started again, for no apparent reason at all, and now it's been hours without being able to reproduce it.

I don't really believe that a BIOS update could fix issues that, in a way or another, seem to be there since the beginning.

There's plenty of similar issues marked as solved within hours of receiving a replacement CPU or RAM kit, we don't even know if these people got errors again and just gave up at that point.

We also shouldn't forget that there's millions of people who don't ever run stability tests, let alone for many hours, and who don't really care if they get a BSOD or software crashes every now and then, and blame them on Windows or the application that crashed.

And here we're talking about problems that don't even show up with 40+ hours of Prime, 20+ hours of Aida, and 3 passes of MemTest (twice), in my case.

Right, that's what's really annoying, I had WHEA black-screen reboots when my PC was idle or near-idle, which made them very difficult to debug as it could take a day or two to occur.

"Right, that's what's really annoying, I had WHEA black-screen reboots when my PC was idle or near-idle, which made them very difficult to debug as it could take a day or two to occur."

That is the main reason AMD should make a statement about such faults: so we can speed up the replacement process and get a working CPU faster. But AMD is still silent, as if everything is fine and these critical issues don't exist at all.

Just wanted to share an update from my end. It seems the issue was fixed with the 1.1.9.0 BIOS update on my MSI Tomahawk X570. I haven't had any more restarts since then; it's been a bit more than two weeks, so I think it's solved...

I have an MSI MAG X570 TOMAHAWK WIFI mobo and still get WHEA 18 crashes on default settings with the following BIOS versions:

7C84v14 (Updated AMD AGESA ComboAm4v2PI 1.1.0.0 Patch C)
7C84v153 (beta version with AGESA 1.1.9.0)
and now I've tried 7C84v15 (Update to ComboAM4PIV2 1.2.0.0) - same error on default settings

Crash was never reproduced during tests in Ryzen Master, OCCT, CPU-Z bench or Cinebench.

Crash was not reproduced in Cyberpunk 2077.

It takes up to 10 minutes to reproduce it in RDR2 game.

Easiest way for me - 1-2 minutes of playing Quake Champions, just custom game with bots - easy WHEA 18 error with reboot.
Maybe it's because of specific CPU usage, but the crash is very reproducible!

The solution that fixed the WHEA 18 crash for me was the same one I used on the previous BIOS version (https://community.amd.com/t5/processors/ryzen-5800x-system-crashing-into-reboot-while-under-gaming-l...), following tim716's advice:

  1. Global C-State Control: switched from [Auto] to [Disabled]
  2. Power Supply Idle Control: switched from [Auto] to [Typical Current Idle]
  3. Precision Boost Overdrive (PBO): switched from [Auto] to [Manual] with the following parameters:
    1. PPT 200W
    2. EDC 333A
    3. TDC 333A

I don't know if there is a more stable way, or one with fewer changed parameters, to get rid of the crash (apart from replacing the CPU).

5900x | MSI MAG X570 TOMAHAWK WIFI | 32GB Crucial Ballistix RGB 32 GB (2 x 16 GB) DDR4-3600 CL16 | EVGA RTX 3080 10 GB XC3 ULTRA GAMING | Fractal Design Ion+ 860 W

UPDATE:
DISREGARD. I GOT ANOTHER WHEA.
I am testing another solution until the power supply arrives.

I'm still not really sure about this, but after I disabled the features marked in the image below, I no longer had WHEA errors. I was very suspicious of these features, which I have always enabled when configuring the mainboard. I used the same configuration with the MSI X370 XPOWER GAMING TITANIUM.

I disabled them and the WHEA errors stopped. My suspicion was split between this section and ErP. The problem is very clear: the error occurs at a certain point in system boot, before login, and when the system enters an idle state. I remembered that I always enable these features, so on my last attempt I disabled them.

I cannot guarantee the solution yet. I need to do more tests. I would really like others here to test it too. It may not apply in all situations.

MSI_SnapShot.png

[ AMD Ryzen 9 5900X | XPG LEVANTE 240 | MSI MPG B550 GAMING EDGE WIFI (MS-7C91) - 7C91v16 | 32GB DDR4 3600MHz XPG SPECTRIX | EVGA GEFORCE GTX 1650 SUPER 4GB | XPG CORE REACTOR 850W | SSD 970 EVO NVMe M.2 250GB ]

I am trying these parameters right now.

CPU loadline Calibration Control: Mode 3
CPU NB Loadline Calibration control: Mode 5

VDDP Voltage: 1.100
VDDG CCD Voltage: 1.100
VDDG IOD Voltage: 1.100

SOC voltage: 1.150

[ AMD Ryzen 9 5900X | XPG LEVANTE 240 | MSI MPG B550 GAMING EDGE WIFI (MS-7C91) - 7C91v16 | 32GB DDR4 3600MHz XPG SPECTRIX | EVGA GEFORCE GTX 1650 SUPER 4GB | XPG CORE REACTOR 850W | SSD 970 EVO NVMe M.2 250GB ]

___jedi___, it looks like we have similar way to reproduce the WHEA crash - see https://community.amd.com/t5/processors/the-most-stable-way-to-reproduce-whea-error-on-5900x-or-5950...

 

Try to disable Core Performance Boost only - this should help

Ivan, if I disable Core Performance Boost, that will bring the CPU down to stock frequencies, and that is not how it is supposed to work by default (I hope). Otherwise, why did AMD introduce this performance boost and overclocking out of the box?

So I hope that this can be fixed with proper power management in the BIOS; at least some settings show great improvements in stability.

5900x | MSI MAG X570 TOMAHAWK WIFI | 32GB Crucial Ballistix RGB 32 GB (2 x 16 GB) DDR4-3600 CL16 | EVGA RTX 3080 10 GB XC3 ULTRA GAMING | Fractal Design Ion+ 860 W

Yay, another WHEA 18 crash. Starting to get some buyer's remorse.

5900x | Asus TUF X570-PRO | 32GB GSkill DDR4 | EVGA 3090 FTW

___jedi___, yep, the CPU should work at stock settings. If it does not, return it. I do not like the idea of playing with the BIOS, since that means adding more power just to make the CPU stable. I prefer to simply get a working sample at stock settings.

If Core Performance Boost = off helped in your case, this means that there is an issue with the CPU in stock mode. Why not exchange it for a working one?

So after spending time tweaking the RAM to stock settings and then soon after getting 2 more random WHEA 18 crashes, I decided to test my RAM again just to make sure it's still good. I ran memtest for several hours (I know longer is better, and all night is best) just to give me some confidence that my RAM is indeed OK, and the test yielded zero errors. The CPU has to be the culprit. Or the BIOS? Chipset drivers? What's it gonna take to run stable? I have the chipset drivers that were released on the 4th. I have the latest BIOS (3402) from Asus. It seems all I can do at this point is wait for a stable BIOS or proceed with the RMA.

5900x | Asus TUF X570-PRO | 32GB GSkill DDR4 | EVGA 3090 FTW

Look!
MSI needs to URGENTLY improve the quality of the applications in Dragon Center and stop messing around. I have definitely found my problem. I figured out how to force a WHEA here and then tracked it down by elimination. It was a driver!!! Some driver that MSI installs to further control the PC's sleep state. I have S2 / S3 / S5 enabled in the BIOS. S5 is a new state and already accepted by Windows 10. With Dragon Center, my computer goes almost completely dormant, and this is where the WHEA problem occurs. Something in Dragon Center is not working well with this and generates these errors. I removed Dragon Center and everything returned to normal. I no longer have the deeper S5 sleep state, which I thought was VERY cool, but at least I have a reliable computer again.

My suggestion to everyone: if you have a support app like Dragon Center for your mainboard, remove it. The most current version at least manages to uninstall itself completely. The most current version is 2.0.100.0. They need to review this urgently.

I have no control over the LEDs now.

I'm sharing what I went through because it may actually be a driver on your system that is causing these anomalies. In some cases the problem is located in the BIOS. I find WHEA errors maddening because they point us at false positives, making it impossible to find the problem quickly. In short, my Corsair RM-1000 power supply had no problems, and I ended up buying another one without any need.

NOTE: Booting and the system in general are snappier without Dragon Center.

Thank you MSI!

[ AMD Ryzen 9 5900X | XPG LEVANTE 240 | MSI MPG B550 GAMING EDGE WIFI (MS-7C91) - 7C91v16 | 32GB DDR4 3600MHz XPG SPECTRIX | EVGA GEFORCE GTX 1650 SUPER 4GB | XPG CORE REACTOR 850W | SSD 970 EVO NVMe M.2 250GB ]

Dear all,

Final test carried out, and the problem is in fact the RX 5700 XT graphics card.

I tested the graphics card in another computer and had the same problems. When I left only the standard driver downloaded from Microsoft installed, it behaved a little better. However, when I tried to uninstall the standard driver, the PC crashed. I tried another 4 times in a row and the same bizarre crash behavior occurred. I put the MSI RX5700XT back in my PC and tried the same thing.

I took out the MSI RX5700XT and put in an ASRock RX550. Everything works normally, with Dragon Center included.

The problem does appear to be the graphics card. After this swap, everything returned to normal. The WHEA-Logger ID 18 events are gone.
I am requesting an RMA for the MSI RX5700XT.

[ AMD Ryzen 9 5900X | XPG LEVANTE 240 | MSI MPG B550 GAMING EDGE WIFI (MS-7C91) - 7C91v16 | 32GB DDR4 3600MHz XPG SPECTRIX | EVGA GEFORCE GTX 1650 SUPER 4GB | XPG CORE REACTOR 850W | SSD 970 EVO NVMe M.2 250GB ]

This topic has almost 50 pages. People have different mobos, GPUs and RAM.

Usually the CPU is the same.

I returned my 5900x and bought a 5950x.

Since then, not a single crash.

I use Linux.

This thread has gotten a little quieter, just wondering if people are just dealing with the random WHEA's and just waiting for the next bios or have most of you started RMA? I just wanted to mention, today was my first whole day without a WHEA crash. I don't expect it to stay that way out of the blue. I just hate feeling like I'm walking on egg shells the whole time I use my computer, Just waiting for that crash to hit when you least expect it. Also, anyone with an Asus board heard any news about a new round of bios updates anywhere?

5900x | Asus TUF X570-PRO | 32GB GSkill DDR4 | EVGA 3090 FTW

In my case, I have a stable way to reproduce WHEA error - see https://community.amd.com/t5/processors/the-most-stable-way-to-reproduce-whea-error-on-5900x-or-5950...

I use a BIOS with AGESA 1.1.9.0, and the same errors were confirmed by others on a BIOS with AGESA 1.2.0.0. So for me there is no point in waiting for an update, which is why I sent the CPU back to the shop's service.

I wanted to voice that I have also experienced these WHEA reboots.

 

My build is

motherboard: asus dark hero viii
processor: 5900x
memory: Crucial Ballistix 32GB Kit (2 x 16GB) DDR4-3200 BL16G32C16U4B.M16FE (running xmp)
psu: seasonic gold 750w
gpu: xfx 6800 xt

I'm currently running experiments. I'm currently running optimized defaults with XMP to see if I will crash. Just catching up with this thread.