Mainboard: MSI x570 Unify
Mainboard-BIOS: 7C35vA82 (Beta version)
CPU: Ryzen 5900x
RAM: Crucial Ballistix BL2K32G36C16U4B 3600 MHz, 64GB (32GB x2)
Drive: M.2 Samsung 970 Evo+ 1TB SSD
Graphics: SAPPHIRE Nitro+ Radeon RX 5700 XT
PSU: be quiet straight power 11 750w Platinum
OS: Win 10 Pro (64bit) - all updates installed
Chipset driver: 2.9.28.509 (released 2020-11-09)
I first assembled the PC with a Ryzen 3800x a week ago because it was unclear if and when I would get the Ryzen 5900x I ordered. It worked with the included AMD Wraith Prism CPU cooler for one week without any problems.
- Today I installed a Ryzen 5900x and a Scythe Fuma 2 CPU cooler.
- After 20 min the first crash/restart with the following entries in the Event Viewer: WHEA-Logger ID 18 and critical error Kernel-Power ID 41.
- Happens irregularly again and again, sometimes after minutes, sometimes longer: Windows freezes for a few seconds and then the PC reboots. Doesn't matter if load or not.
- CPU temperature between 30 and 40 °C
- Updated to BIOS and chipset driver mentioned above: Problem still exists
- XMP Profile disabled (RAM on 2600 MHz): problem still exists
- CMOS Reset: Problem still exists
Either there is a compatibility problem of something with the CPU, or the CPU is defective?
What to do? Really frustrating.
I'm having a similar issue: X570 Aorus and 5600X. I get the same errors in Windows.
Disable CPB and PBO and run it at default settings (3.7 GHz and XMP on). That works for me.
I got a new angle on this. Deactivating PBO and CPB definitely works; the PC has been running stable for a week now. But you'll lose performance.
So I wrote to MSI support and AMD support.
MSI suggested increasing the DRAM voltage by 0.05 V, which I did. The system seems to be stable, with no crashes so far, neither at idle nor while gaming.
I found this when I was doing some research again last night: [Resolved] Kernel Power 41 Critical Error on Windows 10 (thegeekpage.com)
It's not exact about how to change your disk settings to never sleep, but I found it nonetheless. I changed mine to Never and was able to play about an hour of Fortnite with no crashes and 30 minutes of Cyberpunk with no crashes. I'll keep persevering with this **bleep** CPU
This is the info for those whose system could be stabilized by setting the Curve optimizer to a positive number.
I could stabilize my 5900x system with an all-core Curve of +8, but I wanted to minimise the overvoltage and the overall harm to performance. An expert suggested that I try setting the Curve on a per-core basis.
So I started with checking the Core Performance order with the HWINFO64 utility. It looked like this:
Core Performance Order: 2, 3, 6, 4, 5, 1, 11, 8, 9, 10, 7, 12
All the first chiplet cores go first in the order, so I assumed that the 1st chiplet is better than the 2nd.
So I set the curve per core as follows: cores 1-6 (0-5 in the BIOS) = 0, cores 7-12 (6-11) = +6. The system has been stable so far. I could get back some performance and remove unneeded overvoltage on the 1st chiplet.
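The reasoning above (the first chiplet's cores all rank ahead of the second's, so only the second chiplet gets the positive offset) can be sketched in a few lines of Python. This is only an illustration and does not read anything from HWiNFO64 or the BIOS. The function names and the average-rank rule are assumptions made for the example; the performance order, the six-cores-per-chiplet layout and the +6 value come from the post.

```python
from typing import Dict, List

CORES_PER_CHIPLET = 6  # 5900X layout assumed in the post: cores 1-6 and 7-12

# Core Performance Order as reported by HWiNFO64 (1-based core numbers, best first).
PERFORMANCE_ORDER: List[int] = [2, 3, 6, 4, 5, 1, 11, 8, 9, 10, 7, 12]


def chiplet_of(core: int) -> int:
    """Return the chiplet (1 or 2) that a 1-based core number belongs to."""
    return (core - 1) // CORES_PER_CHIPLET + 1


def average_rank(order: List[int], chiplet: int) -> float:
    """Average position (0 = best) of a chiplet's cores in the performance order."""
    ranks = [rank for rank, core in enumerate(order) if chiplet_of(core) == chiplet]
    return sum(ranks) / len(ranks)


def curve_by_bios_index(order: List[int], weak_offset: int = 6) -> Dict[int, int]:
    """Give the better-ranked chiplet 0 and the other chiplet weak_offset.

    Keys are 0-based core indices, as shown in the BIOS.
    The "better chiplet = lower average rank" rule is an assumption of this sketch.
    """
    chiplets = sorted({chiplet_of(core) for core in order})
    best = min(chiplets, key=lambda c: average_rank(order, c))
    return {
        core - 1: (0 if chiplet_of(core) == best else weak_offset)
        for core in sorted(order)
    }


if __name__ == "__main__":
    for bios_index, offset in curve_by_bios_index(PERFORMANCE_ORDER).items():
        print(f"core {bios_index}: curve {offset:+d}")
```

With the order from the post, BIOS cores 0-5 get 0 and cores 6-11 get +6, which matches the manual setting described above.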
I'd like to track the problem down further to the specific bad cores. The expert suggested a dichotomy (bisection) method: set the curve to 0 on half of the remaining cores and see whether the system stays stable. This can be problematic, though, as more than one core can be faulty.
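One way to keep the dichotomy idea working when more than one core is bad is to recurse into both halves instead of discarding one. The Python sketch below is a minimal illustration under assumptions: `is_stable_at_zero` is a hypothetical callback standing in for the manual step (set the given cores to curve 0, leave the rest at the known-good positive value, reboot and watch for crashes), and the weak cores in the demo are invented, not measured.

```python
from typing import Callable, List, Sequence, Set

# Hypothetical callback: True if the system stays stable with these cores at curve 0.
StabilityTest = Callable[[Sequence[int]], bool]


def find_weak_cores(cores: Sequence[int], is_stable_at_zero: StabilityTest) -> List[int]:
    """Return every core in `cores` that cannot run at curve 0.

    If the whole group is stable at 0, none of its cores is weak.
    Otherwise split the group in two and examine both halves, so that
    several weak cores in different halves are all found.
    """
    if not cores or is_stable_at_zero(cores):
        return []
    if len(cores) == 1:
        return [cores[0]]
    mid = len(cores) // 2
    return (find_weak_cores(cores[:mid], is_stable_at_zero)
            + find_weak_cores(cores[mid:], is_stable_at_zero))


def make_simulated_test(weak: Set[int]) -> StabilityTest:
    """Build a fake stability test for demonstration: unstable if any weak core is at 0."""
    def is_stable_at_zero(group: Sequence[int]) -> bool:
        return not (set(group) & weak)
    return is_stable_at_zero


if __name__ == "__main__":
    simulated = make_simulated_test({7, 11})  # invented example, not real data
    print(find_weak_cores(list(range(12)), simulated))  # prints [7, 11]
```

In practice each "test" is a long manual soak, and because the crashes are irregular a stable verdict is never certain, so a group that passes may still deserve a second look.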
I checked the info on the multiple BSODs I had before the fix (about two dozen of them). The APIC ID numbers in the BSODs were: 0, 8, 9, 10, 24.
So I have a question: is it possible to work out which cores caused the errors from these APIC IDs? I know that the APIC ID is a logical core number set by Windows, but it doesn't directly correspond to a core number. Is there a rule or a utility for this?
Thanks for this information!!!
I have a 5950x which is exhibiting the exact same behavior.
I had managed to get it semi-stable by simply using the Windows power plan to set the maximum processor state to around 85%. However, after still getting the odd restart when a single thread would spike (HWiNFO), I dove into the BIOS and set the curve optimizer to +4, then +6, and finally got it stable (touch wood) at +8.
During Cinebench R23 testing, where I can pretty much force the restart by spamming the start/stop buttons between multi-core and single-core workloads, +8 seems to be fairly stable, and I have not had a single restart since (maybe 1, but I'm unsure as it was rebooting at the time anyway). During Cinebench I'm getting around 4.4 GHz all-core and 4.8 GHz single-core.
Following the latest post, I found that HWiNFO shows my preferred cores are on Chiplet 1, then Chiplet 2, in the order 1, 3, 7, 5, 0, 4, 6, 2 & 9, 8, 12, 10, 14, 15, 11, 13.
Looking at the maximum clock speeds of each core
Chiplet 1:
1 & 3 are hitting 5025 MHz,
7 is 4900 MHz,
5, 0, 4, 6 & 2 are hitting 4825 MHz.
Chiplet 2:
9 & 8 4750 MHz
12, 10, 14, 15 & 11 4725 MHz
13 4700 MHz
So my question is how to interpret this data (i.e. what is the +8 Scalar actually doing). Is the reboot caused by a core going over its predetermined clock speed, or is it because it's not getting enough power to hit its desired clock speed and crapping out?
If the latter, can I assume Core 13 is a dud core and is potentially the only one requiring the +8 Scalar, whilst 12, 10, 14, 15 & 11 could go with +6, 9 & 8 with +4, and chiplet 1 with +0 or even a negative number?
I appreciate I will ultimately need to just go ahead and test these for myself, but being completely green to all this I wanted to understand if my fundamental understanding of the issue is correct before chasing my tail and going in the wrong direction.
Thanks in advance
Wilson
@willeywilson wrote:So my question is how to interpret this data (i.e. what is the +8 Scalar actually doing).
AMD did a good video of how curve optimizer works on PBO2:
https://www.youtube.com/watch?v=2Jo2ck6xzDM
Each unit of the scalar is a range of approximately 3-5 mV. The reason there is a range is that the CPU determines the best over/under voltage to use depending on where you are in the voltage/power curve.
The scale goes up to +/- 30 (90-150 mV). If you are at +8, the scale is overvolting the core by a range of 24-40 mV.
AMD explains how it works when you undervolt: there will be less undervolting when load is higher and more undervolting when load is lower. So, in our case, using it to overvolt, there will be more overvolting when load is higher and less when load is lower.
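The arithmetic in this post is easy to check with a small Python helper. It is only a sketch: the 3-5 mV per unit figure and the +/- 30 limit are the approximate values quoted above, and the function name is made up for the example.

```python
from typing import Tuple

MV_PER_COUNT_MIN = 3  # approximate lower bound per curve unit, as quoted in the post
MV_PER_COUNT_MAX = 5  # approximate upper bound per curve unit, as quoted in the post
MAX_COUNTS = 30       # the curve setting ranges from -30 to +30


def curve_offset_range_mv(counts: int) -> Tuple[int, int]:
    """Return the (lowest, highest) approximate voltage offset in mV for a curve setting.

    Positive counts overvolt, negative counts undervolt.
    """
    if abs(counts) > MAX_COUNTS:
        raise ValueError(f"curve setting must be within +/-{MAX_COUNTS}")
    a = counts * MV_PER_COUNT_MIN
    b = counts * MV_PER_COUNT_MAX
    return (min(a, b), max(a, b))


if __name__ == "__main__":
    for setting in (4, 6, 8, 30, -30):
        low, high = curve_offset_range_mv(setting)
        print(f"curve {setting:+d}: {low:+d} to {high:+d} mV")
```

For +8 this gives 24 to 40 mV, and for +30 it gives 90 to 150 mV, the same figures as in the post.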
With my MSI X570 Unify and Ryzen 5950X I've used the A82 BIOS with PBO and CPB disabled to get the system stable. I've had exactly the same problems as TS: spontaneous reboots at idle.
I've now been running the A84 BIOS at stock settings with CPB and PBO enabled for 12 hours without reboots, something that was impossible with older versions, so that seems promising. You can download it (at your own risk of course) here: https://we.tl/t-AHQ55LAvfs
Where is that A84 MSI BIOS coming from? It's not even on the official sites.
I've got it from someone with connections from another community that helped me in the past with an A81 Bios (that didn't solve my problems). But I understand your reluctance. Here's my validation on CPU-Z: AMD Ryzen 9 5950X @ 4423.97 MHz - CPU-Z VALIDATOR (x86.fr)
May I know what's the AGESA differences between A82 and A84?
Also, looking at the CPU-Z info, it seems that your CPU is overclocked with a x44.25 multiplier on all cores, which means that you will not get the single-core 4.9 GHz (PBO disabled).
Correct me if I'm wrong. I'm trying to make sense of this data. My CPU-Z shows a x36 multiplier at idle (it's supposed to be x37, but for some strange reason it's lower...). It does boost to x48 on a single core.
@imraneo wrote:May I know what's the AGESA differences between A82 and A84?
I don't know the differences. The only thing I can determine is that both are using AGESA 1.1.8.0
My CPU is not overclocked; the only changes I made from the stock BIOS are enabling PBO and XMP and configuring a custom fan curve. CPU-Z puts a multi-threaded load on the CPU while validating, which is why you see the all-core 4423.97 MHz frequency. I hit 4.9-5.0 GHz single-core in Cinebench R20 with the same settings.
@thunk_stuff wrote:
@willeywilson wrote:So my question is how to interpret this data (i.e. what is the +8 Scalar actually doing).
AMD did a good video of how curve optimizer works on PBO2:
Thanks @thunk_stuff for the link. I had a play around this morning. The PBO Scalar appears to do nothing for me overall, auto sets it to x7, I have tried x8 & x10 and seen no difference in Cinebench R23 scoring or clock speeds.
In the end I think I have found a stable workaround with the following:-
Core Max Speed Curve
0 4825 +2
1 5050 +0
2 4825 +2
3 5050 +0
4 4825 +2
5 4825 +2
6 4825 +2
7 4925 +2
8 4750 +4
9 4775 +2
10 4750 +4
11 4725 +6
12 4750 +4
13 4700 +6
14 4750 +4
15 4725 +4
I was hoping to achieve higher speeds on multicore R23 testing (27808 achieved at 4.325Ghz) and more cores on chiplet 1 to hit max speed on single core testing (1556), but these scores seem about right for the processor and will allow me to get some work done whilst waiting for the next bios update.
One other thing to note is that PBO in Advanced AMD Overclock settings on ASUS Bios is overwritten by the PBO setting on the AI Tweaker area of the BIOS.
Fingers crossed there is a new BIOS on the Horizon for my ASUS board as per the MSI un-official one mentioned above.
@tim716 wrote:This is the info for those whose system could be stabilized by setting the Curve optimizer to a positive number.
I could stabilize my 5900x system with all-cores Curve +8, but I tried to find a way to minimise the overvoltage and the overall harm to the performance. Some expert suggested me to try setting the Curve on a per-core basis.
So I started with checking the Core Performance order with the HWINFO64 utility. It looked like this:
Core Performance Order: 2, 3, 6, 4, 5, 1, 11, 8, 9, 10, 7, 12
Looks like you are a bit luckier than me. I tried setting one CCX to 0 and the other to +8 (and vice versa), and it WHEA crashed both times within a couple of minutes of loading Windows. The only stable setting was all cores at +8 (stable for 2 weeks). This was before I knew you could go into HWINFO64 and get the core performance order. Nice find.
My chip is currently in Georgia and on its way to Miami for RMA. Hope to get a new chip and see if that solves the WHEA crashes. Will report back on how it goes.
@Podersen wrote:Returned my 5900x because of me having all the same issues as aforementioned no whea error 18. Just kernel 41.
Bought a 3700x and I don't know how but that chip is displaying the same behaviour. This is on a new X570 unify and gskill 3600 cl16. I don't think there is anything wrong with power. So am I just so incredibly unlucky or is every amd chip like this? I just came over from Intel.
I had the same error on my 3700x.
Background: As builders, we are taught to install the latest BIOS for our motherboard, along with the latest chipset and device drivers. We do this without thinking. IMO, the WHEA (18) error reported in this thread is not likely to be hardware related. After all, AMD does not fabricate its own chips. AMD's fabrication is done by TSMC, the best in the business. The likelihood of faulty silicon is almost nil. Back to the BIOS, which contains the APIC (Advanced Programmable Interrupt Controller) microcode. WHEA errors can also be caused by faulty microcode. More importantly, over the last few months, since the BIOS and chipset drivers were updated to be compatible with the 5000 series CPUs, AMD and the MB vendors have introduced faulty microcode into the environment. In other words, by updating to the latest BIOS as a matter of course, we, as systems builders, are introducing errors that typically do not exist for older, tried and true, CPU solutions like the 3700x.
Solution: Roll back the BIOS to an earlier stable version that supports the 3700x, or try a newer BIOS that contains AGESA 1.1.0.0 D. In my case, I rolled forward to Gigabyte BIOS F31o (AGESA 1.1.0.0 D). I also use the AMD Ryzen High Performance power profile with SLEEP set to NEVER, which may prevent the Kernel-Power (41) error.
Good luck!
=MacNZ=
AMD design the chips and send tapeout data to TSMC to fabricate.
A design problem could occur.
Or a fabrication problem could happen.
Alright man! I am going to be rolling back to a bios before the new chips. Will give update, because right now I am having issues getting into windows because of freezes on 3700x with latest drivers.
Wish me even more luck🤞
Alright, it's me again (sorry for the spam). I have now tested 3 different BIOSes, from the newest to about a year old. The problem is the same on my 3700x as it was on the 5900x: it keeps shutting itself off or freezing. I have tried everything, and the only thing that helped (in both instances) was turning VCore down to 1.1 V.
It still crashes eventually with both CPUs, but it helps. I am starting to smell a motherboard (X570 Unify) RMA. Do you guys smell it too?
Do you guys think this is the sole perp?
I'm not convinced the Mother Board can be the issue given the amount of us having issues on brand new as well as previously working boards.
Your case is a little bit different, mind. How confident are you that the BIOS was successfully rolled back and every trace of the latest AGESA was wiped? 🤷
@j96j wrote:
@imraneo wrote:OK guys. I've had some progress.
Im using Asus Strix X570-F BIOS 3001. Stock settings, didn't solve my reboots. So I finally narrowed down to these 2 settings which worked:
1) Disable Global C-state control
2) Vcore at 1.1V
All this while I've been over-volting. In fact I should be under-volting. In auto settings, the default voltage is 1.44V.
I have narrowed down the above 2 settings to be relevant for this fix. I've had other changes slowly removed as they're not needed.
More info:
- Idle temp: 44 deg (3.6Ghz)
- Single core boost: ~51 deg (max boost 4.82Ghz)
- All core boost: ~71 deg (max boost 4.52Ghz)
- XMP 3600 turned ON
Idled for about 10hrs straight (this is where reboots happen almost immediately, so this gives me huge confidence). Also ran some CineBench stably.
Still keeping my fingers crossed tightly. I think I'm getting what I paid for. Please share if this helped.
Also, is there any issue if I disable C-state control? I read this is a power saving feature, but I'm pretty new to this. Cheers.
Using a 5800x and MSI X570 Tomahawk (latest chipset driver and BIOS). At stock settings, the PC reboots at idle or when applications are running. I tried disabling the RAM's XMP, CPB, PBO and global C-state control. However, the PC still crashed with the same error. My only temporary fix is to set my CPU core voltage to 1.3 V. I tried 1.25 V and could finish RDR2 without any crashes, but when I play Cyberpunk, it crashes. So I use 1.3 V.
However, with this CPU core voltage setting (offset mode 1.3 V), my CPU clock speed stays at base clock (3800 MHz).
Is there any way to set the CPU core voltage to 1.3 V and leave the CPU clock speed on auto?
Did you try just disabling C states and playing around with Vcore? Try it.
With CPB off, your CPU is going to remain at base clock. I believe the Vcore is something that needs to be experimented with using various values, and my guess is that you should go lower than stock. My stock at Auto was 1.44 V and I'm running at 1.1 V for stability.
Disabling C-states will prevent it from "sleeping" and going into ultra-low clocks. This helps too, but I'm guessing I'm using more power when using the PC.
As for XMP, it never really made a difference in stability for me. So I keep it on.
Update to my issue:
I decided to RMA my 5800X 4 weeks ago. New one just came in.
The new 5800X has been FAR more stable than the last one.
At stock settings, the old 5800X could reboot during Windows installation and at idle. The new 5800X with XMP enabled (3600c18 RAM) has been running stable so far. Finally I can continue my work. The old 5800X wasted 1-2 months of my time, between tinkering with BIOS settings and waiting for BIOS updates.
Will test more with this new CPU later.
Hey j96j,
did you update Windows before installing the new CPU?
I think W10 19042.844 fixed lots of WHEA errors.
The new Asus BIOS 3405 and the new AMD chipset driver 2.13.27.501 fixed my random restarts 1 month ago.
Just wanted to check in with those still suffering WHEAs. I got my replacement CPU yesterday and rebuilt my system. I had no crashes or errors. I did not have to set any of the BIOS "fixes" commonly used on this forum to improve stability. So far this replacement CPU feels like what we should all have gotten from the first go. It's too bad there are so many faulty chips out there. On a curious note, the replacement I got was from batch BG 2104PGS. Does anyone else have a chip from this batch? And what is your experience with it so far?
Thanks for the reply.
(A small tip for everybody who has constant crashes: check whether your CPU cooler mounting bracket is touching any components on the motherboard. I found that my castle 365 v2 AIO water cooler could touch the capacitors and the top M.2 NVMe drive and somehow create system instability.
Once I removed that CPU cooler from the system and mounted another, there were no more blue screens or crashes.
However, I believe my CPU is defective regardless: I have manually entered 1.4 V for the CPU Vcore instead of Auto, or else the PC will still crash. At least the BSODs and crashes in games are gone, along with all the Windows errors. But it's a difficult problem to pin down: is it the RAM speed that affects the CPU crashes, or is the CPU defective? Well, hmm, so many BIOS settings to try.)
Yes, I already tried Windows Update. I re-installed Windows and updated it too. Tried ALL BIOS versions. I troubleshot the CPU for 2 months, so I have tried almost all suggestions on all forums.
I'm pretty sure I had the lowest-binned chip though, because even with PBO and CPB disabled, my PC still rebooted.
Now my new 5800X is still running stable, even with the RAM's XMP enabled at 3600 MHz. Note that the only part I swapped was the old 5800x for the new 5800x.
Yes there is. You have to go into the Advanced AMD Overclocking menu and change your P-State. It’s under the CPS menu, the Performance tab. Set the P0 State to Custom. Then you can choose the frequency “3700” and the VID. For a VID of 1.3 V you have to enter 28; “28” is the BIOS VID hex value. Be sure to leave everything else alone and on Auto. I think the minimum voltage is a value of 3A, or 1.1875 V; I tried 3C but the computer turned it on me.
Using a 5800x and MSI X570 Tomahawk, I had the same issue. All I did to fix it was set PSS to disabled. It's now running stable, even undervolted with single-core boost +200.
@Lukes, I've done the same thing on my ASUS X570-F and it seems to have fixed it. No WHEA/BSOD for the last couple of days. Not had a chance to fire-up some games but hope to do so this weekend. Will post again if the issue comes back when gaming.
Happy to help. Weird that I discovered it by chance and haven't found any additional info related to this PSS topic.
@Lukes @ace50k Disabling PSS would be akin to disabling C-States. It might work around the issue, but your machine is meant to work with this enabled; you shouldn't have to disable a feature to get it to work.
If it's a new system you might need to replace something, if it's old maybe a firmware roll back or a future firmware might help.
Have you tried AMD's troubleshooting that I posted a few pages back? It probably won't make your system work any better than it does now, and it's fine to leave PSS disabled, but you might gain insight into whether you need to roll back a BIOS or swap a component.
Thank you, I will try that. However, I didn't mention that my system is perfectly stable at stock settings. I probably just pushed undervolting + overclocking too far. I am getting much better benchmark results and all is stable, but I get random restarts at idle twice a week. Disabling PSS seems to have stabilized my system. I haven't seen any side effects like rising temps.
@Lukes thanks for letting us know it's not occurring during stock settings. I assumed everyone was using stock settings.
If it's due to undervolting there's some curves you can adjust but it's probably easier just sticking with what you have. If this doesn't work for you try disabling C-States entirely or tweaking your undervolting.
Thanks for mentioning the 1.1 V Vcore. It helped me on a 5900x and X570 Unify. Do you have any other tricks you could share? Also, do you run XMP while downclocking your CPU?
I have the same issue. A completely new PC, literally every component. I only have the Kernel Power 41 issue that reboots the computer at seemingly random. It seems to be mostly when the PC is under low load. I have done literally everything from reinstalling windows, selectively installing and uninstalling drivers, unbuilt and rebuilt my PC, tried new power supply EVERYTHING. Now I changed the processor to a 3000-series 3800 and everything is smooth sailing.
Consensus from investigation: My 5950X (and seemingly many 5900 users as well) does not work as advertised at no overclocking. I did not tweak any overclocking setting. I am going to return my processor and hopefully have better luck with the next one I get. Really disappointing and my new PC glow is entirely gone.
Question: Has anyone had luck, or heard anyone with luck with the 5950X or 5900X? This thread is very populated and many might have the same issues.
I have this issue too. It seems extremely widespread: looking around various overclocking forums around the world, you'll see plenty of people having WHEA errors and random reboots. To me it's baffling that none of the reviewers seem to have had these issues, yet it's a massive, widespread problem now that the product is released.
To me that makes it clear: reviewers get perfectly constructed silicon, handpicked by AMD and heavily tested before shipping, and the rest of us get scrap metal, whatever is left over, mass-produced garbage where you pay 1000$ for your shot at the untested silicon lottery. Absolutely shameless.
My problems with WHEA started at random, my system was fully stable for over 5 days, i did all kinds of stresstesting to make sure it was stable. Suddenly my event viewer out of nowhere started flagging whea errors, as far as i can tell they seem to be coming and going at random. No setting that i've changed seem to be helping, stock setting, overclocking, underclocking, nothing really helps at all, the WHEA errors are there.
The only positive thing i have to say is that my WHEA errors seem to not lead to any actual crashing, which is great! My WHEA error IDs are 10 and 11 which is different from most people i've read about. Anyway, i hope whenever the AMD ryzen 5000 beta is over this will all have settled.
Same issue here... Brand new build, Asus Crosshair VIII Hero 128GB 3200 RAM, 5950x. WHEA Cache Hierarchy errors, random reboots when doing simple things like opening Chrome. RAM is even running at 2400 mhz currently to rule that out, but seems like it doesn't matter.
I have a 5800x as well that did not have the issue for the three weeks up until my 5950x arrived.
Solved should honestly be removed from the post title. Turning off Core Performance Boost indeed seemingly solves the reboots, but we should be able to use the CPU as designed by AMD.
My guess is that the yield on these cpus is so horribly low that they are sending out whatever they have to get them into peoples hand, quality control ignored.
Do you have Windows on an M.2 drive? I'm wondering if this is causing the issue too. I went back to an older BIOS for my X570 Taichi. I'm using the 5900X CPU and my M.2 is the Adata XPG SX8200.
When I originally built my PC I used an older SanDisk SATA SSD, booted off that, and then did an install to my M.2. It didn't work and had a few issues. I was originally using, I think, the patch C BIOS, and kept getting these kernel issues. So I went back to an official BIOS, reformatted my M.2 and did a clean install of Windows. My PC was running perfectly for a week. All BIOS settings were left on Auto and I didn't adjust anything. I had no Kernel-Power shutdowns and no WHEA-Logger entries. I thought I had fixed it. Then bang, it's back again, out of nowhere. I tried to play Fortnite and had two shutdowns with the kernel error. I finally got back in and played for a good few hours with no issues, but tonight the errors and shutdowns are back again at random. It's infuriating.
The reason I ask about the M.2 is that mine is not showing as the first option for the boot drive...
So now I've checked the BIOS again after my issues came back, and for some reason my boot options are set to my Samsung EVO SSD, which is my gaming drive, showing as a Windows Boot Manager... wtf, I've never added a boot manager or installed anything Windows-related on this drive before. If I change it back to my M.2 as the first boot option it won't load whatsoever; it says insert drive and hit a key... I changed it back to Windows Boot Manager and now I can get back in...
I've never owned a m.2 drive before - is this normal? its installed in slot one on my board.
I'm just confused. I didn't want to try the new BIOS, but it looks like I'm going to have to after all, since the computer has gone back to being unstable. There was a recent Windows update, and the errors came back after it. I only just installed the December 8, 2020 update KB4592438 (OS Builds 19041.685 and 19042.685), described as: "This security update includes quality improvements."
It installed on the 11th, so this could also be a reason why the issue is back. The first issues returned on the 12th for me.
A bit more info for my system...
I have a 1080ti, and windows on a pci-e 4.0 m.2 (Sabrent Rocket)
started having random shutdowns on a x570/3900x system that worked fine before. i posted about it in another thread, but it might be relevant here as well
https://community.amd.com/t5/processors/new-ryzen-3900x-x570-random-restarts-whea-logger-error-id17-...
Well, increasing the DRAM voltage did indeed stabilize the system for me, but the error is not completely gone. I had 3 random crashes and reboots with the WHEA ID 18 error after 1 week, all within 2 hours and all at idle, and after that it's been stable again. Very strange.
I contacted AMD customer support but all they answered was to send them a few system files and haven't heard back yet on this issue. They suggested trying a different graphics card, but according to this thread that won't help. Haven't tried it yet.
MSI customer support told me it will take a while for them to release a BIOS update.
Did you try increasing cpu voltage via the curve optimizer in your bios? Seems to solve the issue for some users here.
I replaced my MSI RTX 3080 GAMING X TRIO with a KFA2 GTX 1070 (my old one), and have had no problems since (running for 2 days now).
So I have a simple question for other users facing the same issue: do you also have an RTX 3080? Could it be graphics card related?
Note that the RTX 3080 ran perfectly fine in another computer for more than a month with no crashes at all, so I don't think the graphics card has a hardware issue. But it seems to have an impact on the reboots.