
PC Processors

CrispyCrunch
Adept II

Ryzen 5900x: System constantly crashing/restarting WHEA-Logger ID 18 and critical error Kernel-Power

Mainboard: MSI x570 Unify
Mainboard-BIOS: 7C35vA82 (Beta version)
CPU: Ryzen 5900x
RAM: Crucial Ballistix BL2K32G36C16U4B 3600 MHz, 64GB (32GB x2)
Drive: M.2 Samsung 970 Evo+ 1TB SSD
Graphics: SAPPHIRE Nitro+ Radeon RX 5700 XT
PSU: be quiet straight power 11 750w Platinum
OS: Win 10 Pro (64bit) - all updates installed
Chipset driver: 2.9.28.509 (released 2020-11-09)

I first assembled the PC with a Ryzen 3800x a week ago because it was unclear if and when I would get the Ryzen 5900x I had ordered. It ran with the included AMD Wraith Prism CPU cooler for one week without any problems.

- Today I installed a Ryzen 5900x and a Scythe Fuma 2 CPU cooler.
- After 20 min the first crash/restart with the following entries in the Event Viewer: WHEA-Logger ID 18 and critical error Kernel-Power ID 41.
- Happens irregularly again and again, sometimes after minutes, sometimes longer: Windows freezes for a few seconds and then the PC reboots. It doesn't matter whether the system is under load or not.
- CPU temperature between 30 and 40 °C
- Updated to BIOS and chipset driver mentioned above: Problem still exists
- XMP Profile disabled (RAM on 2600 MHz): problem still exists
- CMOS Reset: Problem still exists

Either there is a compatibility problem of something with the CPU, or the CPU is defective?
What to do? Really frustrating.
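
For anyone who wants to pull these two entries out of the System log without clicking through Event Viewer, here is a minimal sketch in Python. It only builds and runs a `wevtutil` query; the provider names are the ones Windows uses for WHEA-Logger and Kernel-Power entries, while the function names and the default count of 10 are just my own choices for illustration.

```python
import subprocess

# Provider name -> event ID for the two entries mentioned in the post above.
EVENTS = {
    "Microsoft-Windows-WHEA-Logger": 18,
    "Microsoft-Windows-Kernel-Power": 41,
}


def build_query(provider: str, event_id: int, count: int = 10) -> list[str]:
    """Return a wevtutil command that lists the newest matching System events."""
    xpath = f"*[System[Provider[@Name='{provider}'] and (EventID={event_id})]]"
    return [
        "wevtutil", "qe", "System",
        f"/q:{xpath}",
        f"/c:{count}",  # return at most `count` events
        "/rd:true",     # newest events first
        "/f:text",      # human-readable output instead of XML
    ]


def run_query(provider: str, event_id: int, count: int = 10) -> str:
    """Run the query and return its text output (Windows only)."""
    result = subprocess.run(
        build_query(provider, event_id, count),
        capture_output=True,
        text=True,
    )
    return result.stdout
```

Calling `run_query("Microsoft-Windows-WHEA-Logger", 18)` on the affected machine should print the most recent WHEA-Logger ID 18 entries, which makes it easier to compare timestamps with the Kernel-Power 41 entries.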

2 Solutions

I'm having a similar issue with an X570 Aorus and a 5600X. I get the same errors in Windows.

Disable CPB and PBO and run it at default settings (3.7 GHz and XMP on). That works for me.


I got a new angle on this. Deactivating PBO and CPB definitely works; the PC has been running stable for a week now. But you'll lose performance.

So I wrote to MSI support and AMD support.

MSI suggested increasing the DRAM voltage by 0.05 V, which I did. The system seems to be stable, with no crashes so far, neither at idle nor while gaming.


947 Replies

Hi!

It's mostly not a CPU problem; it's a problem with the mix of CPU, chipset and motherboard manufacturing.

Sometimes it's fixed by changing the CPU, other times by changing the motherboard/chipset.

Not all X570 motherboards have this WHEA/reboot/BSOD problem.

0 Likes

That's what I'm starting to think, since the issues have appeared/disappeared with various AGESA versions. Probably a combination of AGESA and motherboard-specific firmware, in my opinion. If it was a widespread CPU issue I would have expected numerous tech sites to be reporting it. Doesn't make it any less annoying for those suffering issues, though, and my previous Intel systems have been rock-solid. Mind you, my 2700X and 3900X systems (the 3900X installed on the same ASUS board) have not missed a beat except when setting DOCP with 4 DIMMs.

0 Likes

Oh yes! Intel is rock-solid, but these issues, spontaneous reboots (WHEA errors) and BSODs, give a bad image.

0 Likes

**bleep**. I'm having the same issues! RMA'd my 5950X due to WHEA errors, and now I'm getting BSODs due to a bug check with the replacement CPU. Multiple reinstallations of Win 10 did not help. Come on, AMD!

The same was happening to me with a 5900X after RMA. The original processor had WHEA errors and the RMA processor had Bug Check and Kernel Power Down errors. I re-installed Windows and was still having issues. Some guy on the Microsoft forums suggested I run the "Windows Memory Diagnostic" tool that is installed with Windows 10. When I restarted the computer and the tool started running, it immediately flagged that I had a bad memory stick, potentially 2. I replaced the memory and the computer has been rock solid for a few days now. Hopefully this helps you out as it helped me. The previous memory was G.Skill 3600 MHz; the new memory is 3200 MHz from Corsair.

0 Likes

what is your mobo?

0 Likes

My motherboard is Asus Tuf X570-PRO (WIFI) on latest beta bios.  (3603)

0 Likes

For some unknown reason the high-end motherboards are more reliable and mostly don't show this error. Many WHEA errors and reboots show up on mid-range/low-end motherboards. I have an MSI X570 Carbon Pro WIFI and it works well, but with a 5600X.

I'm so tired of this issue. I can work (reboots are occasional) but cannot play (reboot after a few minutes).

"I'm so tired of this issue. I can work (reboots are occasional) but cannot play (reboot after a few minutes)."

Try turning off Core Performance Boost.
If it helps, you can RMA the CPU immediately (that's the best approach)
or use this script:
https://www.overclock.net/threads/single-core-prime95-test-script-for-zen-3-curve-offset-tuning.1777...
to set a positive voltage offset for the cores to make them stable (and end up with a CPU that is slower than the default one).

0 Likes

hi!

The test is stable for now. Trying for longer with default parameters.

0 Likes

Hi!

System is stable at 3200 (RAM) running the recommended test and y-cruncher. Now trying 3600 (RAM).

Could there be some conflict with the 5600XT GPU? I read in the forum about this possibility with the WHEA error.

 

Thx

0 Likes

Here is what I did last time:

  • bios reset twice
  • enabled virtualization
  • enabled support for DDR4 @ 3600
  • uninstalled Gigabyte monitoring software
  • uninstalled HWI_642


And the results:

[Image: azomiss_3-1617730857919.png]

 

Have a great one, me happy!

0 Likes

roopy
Journeyman III

Experiencing the great WHEA errors bsods and black screen reboots... purchased the CPU back in November 2020 and only just installed it this month May 2021. Crashes only while idle and has not crashed while playing games (everything stock). Already submitted my RMA request.

BNITRO
Adept III

I would check the motherboard. I love MSI, don't get me wrong, but I have a B450 that could have died because the VRM heatsinks were loose...

MSI B550 Nitro owner.

0 Likes

Mine died again after a week running AGESA 1.2.0.3a. Yeah it's far more stable than in the past but it's still not stable.

0 Likes

Hi there!

I have a similar problem with "Asus ROG STRIX B550-F Gaming Wi-Fi" and Ryzen 5 5600X:

PC is running fine for many hours, even with "heavy" games, except:

* Can't start Elite Dangerous
* Can't start Star Citizen
* DISK WRITE ERROR in Steam
* INVALID DEPOT CONFIGURATION error in Steam
* Unable to extract 7z files with password (with 7zip)

All issues are suddenly gone when I disable Core Performance Boost (CPB) in my BIOS settings ...

But CPU stays at 3.7GHz then of course.

Playing around with enabled CPB, but disabled PBO (Precision Boost Overdrive) didn't help either.

0 Likes

@Hurt check page 68, I posted some basic troubleshooting from AMD. I'd already done most of it myself but it was written up better than I would write it. It's a good place to start but it didn't fix it for me.

If that doesn't work then you're looking at a hardware issue or a BIOS or driver fix that doesn't exist yet.

You'll then need to swap out your hardware and test it separate to determine the cause. If you can reproduce the issue quickly it shouldn't take long to narrow it down.
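
The swap-and-test approach can be written down as a small loop. This is only a sketch of the logic: `crashes_with_swapped` is a hypothetical stand-in for the manual work of replacing one part with a known-good spare and then running the system long enough to normally reproduce the fault.

```python
from typing import Callable, Optional, Sequence


def find_faulty_part(
    parts: Sequence[str],
    crashes_with_swapped: Callable[[str], bool],
) -> Optional[str]:
    """Swap one part at a time and return the first swap that stops the crashes.

    `crashes_with_swapped(part)` should replace only `part` with a known-good
    spare, test the system, and report whether it still crashed. All other
    parts stay as they were, so each test changes exactly one variable.
    """
    for part in parts:
        if not crashes_with_swapped(part):
            return part  # the fault went away with only this part replaced
    return None  # no single swap helped: a combination, firmware or software


# Example order: cheapest and easiest swaps first (my assumption, adjust freely).
order = ["RAM", "GPU", "PSU", "CPU", "motherboard"]
```

The point of changing one part per test is that a stable run then points at exactly one component. If you change two things at once you can't tell which one fixed it.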

0 Likes

I already wrote about my issue in another topic but no one responded, and in the meantime I have further updates that could be useful for the community, so I'm posting here as well. Last week I experienced random reboots triggered by WHEA-Logger 18 errors as well (preceded by a WHEA-Logger 19 warning, corrected hardware error).
I remember I first got this error when I changed CPU: I had an R5 3600 and got a 5600X. After disabling PBO and setting a manual CPU multiplier at stock boost frequency with auto voltages, my CPU ran fine (not exceeding 1.28 V) for some months, even when I changed GPU (from a 5700 XT to a 6700 XT) and PSU (from a Corsair 600W 80+ Bronze to a fully modular Seasonic FX 750W). Then the new GPU driver 21.5.2 was released, I installed it, and one day the PC fell into the Windows automatic repair loop. After restoring my system I only lost programs on C:, which I quickly reinstalled, and everything seemed to run fine; I restored the BIOS settings as well (including the RAM and CPU ones). After some days I again had WHEA-Logger 18 crashes preceded by WHEA 19 warnings (with the same RAM + CPU config that was running fine before the Windows restore). After reading ALL of the comments on Reddit and on this forum I did the following:

-Uninstalled and reinstalled latest Chipset drivers

-Updated graphic drivers

-Undervolted GPU

-Uninstalled Sound Blaster drivers (I sometimes hear a crackling sound in my headphones when the system is about to randomly reboot)

-Uninstalled Afterburner, HWiNFO and RivaTuner (I noticed it sometimes crashed when I tried to launch these programs while gaming)

-Changed Windows power plan

-Tried again PBO with curve optimizer negative 20.

-disabled spread spectrum feature in BIOS

-Increased memory and CPU voltage by a few mV

-Set typical current idle in BIOS

-Disabled Resizable BAR and Above 4G Decoding (SAM)

-Set PCIe to Gen3

....

I still had WHEA warnings and then WHEA 18 errors restarting the PC.

So I turned PBO off again and changed the RAM settings (this one thing had some impact). It was running at 3800 MHz CL16 with FCLK 1800; now it runs at 3733 MHz with the same timings (even subtimings) and the same voltages.
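
To put numbers on the RAM settings above, here is a small calculation sketch. It relies on two general facts: DDR memory transfers data twice per clock, so the memory clock is half the advertised rate, and CAS latency in nanoseconds is CL cycles at that clock. The function names are my own. Going by the numbers as posted, DDR4-3800 corresponds to a 1900 MHz memory clock, so an FCLK of 1800 would not be a 1:1 ratio; whether that was the actual setting or a typo I can't tell.

```python
def memory_clock_mhz(data_rate_mts: float) -> float:
    """DDR memory transfers twice per clock, so the clock is half the data rate."""
    return data_rate_mts / 2


def cas_latency_ns(cl: int, data_rate_mts: float) -> float:
    """CAS latency in nanoseconds: CL cycles at the memory clock."""
    return cl * 2000 / data_rate_mts


def fclk_matches(data_rate_mts: float, fclk_mhz: float) -> bool:
    """True when FCLK equals the memory clock (the 1:1 case)."""
    return memory_clock_mhz(data_rate_mts) == fclk_mhz


# Compare the two settings from the post, both at CL16.
for rate in (3800, 3733):
    print(rate, memory_clock_mhz(rate), round(cas_latency_ns(16, rate), 2))
```

Going from 3800 to 3733 at the same CL is a small change in both clock and latency, so if it really made the WHEA 19 warnings disappear, the memory/fabric side was probably right at its stability limit.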

What happened is that the WHEA 19 warnings COMPLETELY disappeared, but I still have WHEA 18 errors coming randomly and causing the PC to restart.

At this point the only things left to do are disabling CPB (now on Auto) and maybe another clean install of Windows. Maybe I can also swap the CPU power cable on the motherboard in the process (just to be sure).

I really don't know what else to do at this time.

Any help?
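
One way to check whether WHEA 18 crashes are still announced by a WHEA 19 warning is to compare timestamps. The sketch below assumes you have copied the events out of Event Viewer into a list of `(timestamp, event_id)` tuples; the function name and the 10 minute window are arbitrary assumptions on my part.

```python
from datetime import datetime, timedelta


def fatal_events_with_warning(events, window=timedelta(minutes=10)):
    """Split WHEA 18 timestamps by whether a WHEA 19 occurred shortly before.

    `events` is a list of (timestamp, event_id) tuples. Returns two lists:
    crashes preceded by a warning within `window`, and crashes without one.
    """
    warnings = sorted(t for t, eid in events if eid == 19)
    preceded, unannounced = [], []
    for t, eid in sorted(events):
        if eid != 18:
            continue
        # A warning counts only if it came before the crash, inside the window.
        if any(timedelta(0) <= t - w <= window for w in warnings):
            preceded.append(t)
        else:
            unannounced.append(t)
    return preceded, unannounced
```

If most crashes end up in the second list after a settings change, the change removed the corrected errors but not the fatal ones, which matches what is described above.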

0 Likes

@Fastbreak I saw your response before in a previous thread. Same advice as above.

I gave my CPU back to the store today. The person I spoke to thought they knew a lot more than they did and wrote down everything I said completely backwards. They thought I was completely daft for saying it happens more at idle. Well, I hope they still manage to reproduce it.

Anyway, this is probably the most difficult issue I've had to narrow down in over 20 years, mainly because it got far more stable after a BIOS update and became hard to reproduce. If you can reproduce it, then you just have to swap each piece of hardware out one at a time and test it. It's likely the CPU but it could be several other things.

What's strange is that the same GPU/CPU/RAM/BIOS/chipset/graphics driver settings that gave me stability for MONTHS reproduce this error after I restored Windows 10 (I had to do that because of the automatic repair loop, which I wasn't able to fix with troubleshooting and the command prompt). I was thinking about getting a new CPU and a new motherboard. I am sure that my CPU and my motherboard would work fine in another build :D. It's just that I am so f**king unlucky.

0 Likes

Back up your OS with Macrium Reflect or something, or just use a spare drive, and try a fresh install of Windows.

I found it seems more unstable when Windows has updates available. However, I was getting the issue even when I wasn't in Windows.

I found the issue was so random that I didn't see it for a few months, so you just might not have encountered it for a while.

0 Likes

@Cmdr-ZiN 

About your guide from page 68:

* Update the system BIOS to latest version available from motherboard manufacturer (refer to motherboard user manual for instructions on updating the BIOS).
-> Already done before (latest non-beta BIOS)

* Set the BIOS to use factory default settings / optimized default settings (refer to motherboard user manual for instructions on restoring BIOS default settings).
-> Already done before (amongst trying other settings)

* In the BIOS, locate the Power Supply Idle Control option and set it to Typical (this option should be available in the Advanced section of the BIOS).
-> Not done before, set it to that value now.

* Update Windows to the latest version and build via Windows Update. For instructions, refer to article.
-> Already done before

* Update to latest chipset driver from AMD. For instructions, refer to article.
-> Already done before

* In Windows Control Panel, select Power Options and choose the Balanced (recommended) power plan. In Windows Settings, select Power & sleep and set the Performance and Energy slider to the middle.
-> Not done before, set it to that value now.

* Disable non-Microsoft services and startup items using the System Configuration Tool.
-> Already done before ("Clean Boot")

* Reseat CPU, RAM, and all PSU power connections (end-to-end for modular PSUs). For more instructions, refer the product’s user manual.
-> Not done so far

* Verify RAM sticks are installed in the correct DIMM slots (for socket AM4 motherboards with 4 DIMM slots, use A2 & B2).
-> Already done before

Issues stay the same - only when I disable CPB, everything works fine.
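
Side note on the power plan step in the list above: you can verify the active plan from a command prompt with `powercfg /getactivescheme`. A minimal sketch for checking its output in Python follows; the GUID is the one Windows uses for the built-in Balanced plan, and the function name is my own. It matches on the GUID rather than the plan name because the name is localized.

```python
import re

# GUID of the built-in Balanced power plan on Windows.
BALANCED_GUID = "381b4222-f694-41f0-9685-ff5bb260df2e"

_GUID_RE = re.compile(r"[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}")


def is_balanced(getactivescheme_output: str) -> bool:
    """Check whether `powercfg /getactivescheme` output names the Balanced plan."""
    match = _GUID_RE.search(getactivescheme_output)
    return bool(match) and match.group(0).lower() == BALANCED_GUID
```

Paste the line that `powercfg /getactivescheme` prints into `is_balanced(...)`; a custom or vendor-installed plan will have a different GUID and return `False`.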

0 Likes

@Hurt not my guide, it was AMD's, although similar to what I had done. If you've done all that and it's still not working, then you have a hardware fault that's not fixed by any BIOS or drivers yet.

(Definitely try the beta BIOS and reseat the parts)

Time to isolate which part is the issue by swapping them out and testing and get it replaced.

Hi, back again.

After the latest Windows 10 update, 20H2, I got the WHEA-Logger Event 19 entries again. After waiting several months with the WHEA errors, I updated to the second beta BIOS (AMD AM4 AGESA V2 PI 1.2.0.3 Patch A), but even so the WHEA errors did not go away.

I checked some forums that suggested the same things as always, but I found some comments saying to raise the CPU voltage, and others the CPU and SOC voltage. Not knowing any better, I hadn't tried this before with the stable BIOS.
So I raised the voltage to +1, that is:

- SOC: 1.025 V, because it goes up by 0.0625, I think.

- CPU: 1.025 V, because it goes up by 0.0625, I think.

- RAM: I changed it to 3000 MHz (worth mentioning that I did this first, but nothing changed) with the voltage at 1.25 V.

Before this everything was at factory defaults, except virtualization and 'non-RAID' for the drives.

I rebooted and the WHEA-Logger entries stopped showing up; so far so good.

My PC

Motherboard: Asus TUF B550M - PLUS

CPU: Ryzen 5 3600X

RAM: 16 GB 3200 MHz Corsair Vengeance

GPU: Nvidia GT 730

PSU: 600 Watts

0 Likes

@FirefoxNS I'm not raising my voltages; if it doesn't run at stock I'm replacing something. Sure, I could make it more stable, but maybe not perfectly stable, and I'd rather fix the dodgy part than reduce the life of a product unnecessarily.

It's quite probable your issue may return, it's become more stable with BIOS updates but still occurs for me. I guess if I had a more stable chip it probably wouldn't occur at all, some people have far less stable chips than me. Assuming it's the CPU and not something else.

0 Likes

I replaced my 5800X with a new one. I ran it for 17 days straight without any other changes from before. I had no issues during this time when before it would always show issues within 3 to 7 days.

I have also noticed my 3600Mhz memory has been taken off the QVL list. So a possible contributing factor.

I'm going to call my issue solved with a CPU replacement. If the issue does somehow return I'll definitely let you guys know in this thread but I think it's very unlikely I'll see this error again.

0 Likes


@Cmdr-ZiN wrote:

I replaced my 5800X with a new one. I ran it for 17 days straight without any other changes from before. I had no issues during this time when before it would always show issues within 3 to 7 days.

I have also noticed my 3600Mhz memory has been taken off the QVL list. So a possible contributing factor.

I'm going to call my issue solved with a CPU replacement. If the issue does somehow return I'll definitely let you guys know in this thread but I think it's very unlikely I'll see this error again.


I was in the same boat, replaced the CPU, and it was good... For about 3-4 weeks.  Then it started doing the same thing again (WHEA errors).  A few days after that, my 3080 died.  So... It's very likely it was the GPU all along even though the WHEA error indicates CPU.  The chances of getting 2 bum CPUs seems fairly low.  I'm waiting on a warranty replacement for the GPU and we'll see what happens from there.

If I still end up with problems, then it's likely the MOBO or PSU frying stuff, but we'll cross that bridge when we get there I guess.

0 Likes

I'm really not convinced that this would have anything to do with PSU, Mobo, or GPU. 

We all have different hardware but the thing we all have in common is new gen Ryzen CPU. 

Either manufacturers weren't able to properly configure their hardware to match AMD CPUs requirements, or the CPU just fails to be stable when boosted, which it is by default.

Ultimately, both options point to the CPU being too wild, unpredictable, and unstable in most setups. 

Basically, AMD is to blame for these issues as they probably tried way too hard to create an overclocked CPU that is unstable by default and it would be up to the others to adapt their hardware to make up for the instability... Which is nonsense.

I posted over 2 months ago about my issue and haven't had any BSOD since I disabled CPB & PBO. Clearly the CPU is unable to handle the boost, and no manufacturer in their right mind would try to adapt their hardware to this instability.

AMD tried way too hard to release a very performant CPU but it basically became a custom built "engine". 

We can try replacing every **bleep** parts in our PC to try to fix the issue, but in most cases I doubt anything can be done except get a different CPU or hope that a replacement Ryzen will be more stable. 

Big **bleep**in welp

0 Likes

I was running stable for about 3 weeks on a replacement 5900x before getting WHEA errors again. Then I was stable for about another week after disabling CPB & PBO. But then the GPU started turning itself off as soon as it had to render any 3D engines (DWM errors). I confirmed that the card was dead by running it in my old reliable i7 system.

I've read that a faulty GPU can give these same WHEA errors, but who knows.  You may be right as well.  At this point, I'm questioning every piece of this build.  I've never had this many issues before.

0 Likes

HI!

A faulty GPU can give these same WHEA errors. I replaced a 5600XT with a 6700XT and haven't had any WHEA errors or reboots since.

The 5600XT, tested in another system, shows the same error: reboot and black screen.

0 Likes

I don't believe the CPU to necessarily be the issue here.  In some cases it may be, but I believe it's more of a motherboard/bios/AGESA issue.  If I set my LLC to max on auto voltage or if I use a LLC that keeps my voltage steady or greater, I do not see these issues.  I've seen several people saying that setting their LLC higher fixed their issues as well.  My BIOS settings have had to change with each new AGESA release as well, making me feel stronger that this is the case.  5900x user here btw.  

If I set all of my settings to auto and enable PBO in any form or fashion, I will get this error within an hour of using my PC in any way. Sometimes it looks like the voltage dips too low and the CPU errors, while other times it looks like a short boost makes it error. The boost happens before the voltage can adjust on auto settings and it errors out because the voltage is too high. If I set my voltage to override + offset (1.35 V, offset -0.1125) with LLC at max and PBO settings at max, my CPU sits at around 1.26 V constant, boosts to 4.55 GHz+, and never ever errors. If I set everything to auto and enable PBO in any way, I get the errors. I would recommend playing around with your LLC and/or voltage to find a nice low voltage that is stable and does not droop much, if at all. I bet most people's issues will be solved. The ones this does not work for probably have another issue, such as some kind of bad hardware or settings. It could be the CPU, RAM, or some other vital piece of hardware; chipset or chipset voltages as well, RAM timings... all sorts of possible issues here. Sometimes even stock timings, especially if XMP is enabled, can be very wrong and make your system completely unstable.

0 Likes

Go back and read what I've previously written if you want more info on my case. I've been troubleshooting this for over 6 months. It does the same thing on an Nvidia GTX 780; I've ruled out everything but the CPU, motherboard and RAM combination.

I agree it's probably LLC related and it has improved a lot with the BIOS update releases but it was never solved until I replaced the CPU.

As for the issue returning possibly in the future, I ran it straight for 17 days, I've always been able to reproduce the issue in 3 to 7 days in the past. However when not running my old CPU straight it would take few months to reproduce, I'm going back to typical use now and I don't expect I'll see the issue again.

I can also see it tends to load the cores differently, more evenly. It used to not touch some cores for the most part. This is more a gut feel than any empirical data, but the new CPU seems better.

My current 5800X was made in Malaysia; the original was made in China. While I could have tweaked my motherboard settings, my old CPU was never 100% stable at stock settings. Bad batches can happen. Also, I'm sure a system with slower RAM or fewer PCIe Gen 4 components might put less risk of instability on the CPU. I'd say many people would be unaffected by these issues, but for me the only solution was to swap out the CPU for a more stable one.

BTW the same thing happened for Ryzen 3000 owners and eventually was fixed with BIOS patches but many fixed with CPU replacements.

This is just what happens sometimes when you're an early adopter.

0 Likes

Hey guys,

I've been struggling with this issue for 2 weeks straight after building my PC. I literally tried about 10 different solutions and none worked. Disabling CPB/PBO was the only thing that avoided crashes. Thing is, who wants to spend that much money on a CPU to have it only working at the minimum?

I usually had crashes when the CPU was coming back to idle after a heavy load.

Below is the fix that worked great for me. I hope you guys can replicate it somehow:

You have to have a "Curve optimiser" in your BIOS to do this. It's inside "Precision Boost Overdrive" section, you have to set it to Manual to show the settings and set them.

Set this:

EDC limit = 200A.

Curve optimizer = +4. 

Looks like it works for me. Of course your CPU might need more or less curve. You'd better start with like +4 - +6 and gradually raise it until the problem disappears (it fixed it at +8 for some).
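
The "start low and raise it gradually" procedure can be sketched as a simple search. Nothing here talks to the BIOS: `is_stable` is a hypothetical stand-in for setting the offset by hand, booting, and using the machine long enough to normally trigger the crash. The `stop` and `step` defaults are arbitrary assumptions, not values from AMD.

```python
from typing import Callable, Optional


def find_stable_offset(
    is_stable: Callable[[int], bool],
    start: int = 4,
    stop: int = 12,
    step: int = 2,
) -> Optional[int]:
    """Return the smallest positive Curve Optimizer offset that tests stable.

    Offsets are tried in increasing order so the first stable value is also
    the one that costs the least boost performance.
    """
    for offset in range(start, stop + 1, step):
        if is_stable(offset):
            return offset
    return None  # nothing in the range was stable; stop tweaking and consider RMA
```

With the defaults this tries +4, +6, +8, +10 and +12 in that order. Write down how long each value ran before crashing, since this kind of instability can take days to show up.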

If this works for many people, I can even give a conspiracy  theory, explaining this.

Looks like the AMD casino took the silicon lottery to a new level. 

The usual gamble used to be how well you could overclock your CPU, but the base specified performance was guaranteed. Not anymore. Now, to make Ryzen great again, the performance AMD specifies is the performance of an AVERAGE CPU. But of course that doesn't mean AMD is going to throw the half of the CPU yield that is below that average in the trash and lose profits. That means half of the buyers undervolt their CPUs to overclock them (the "awesome" new feature much advertised by AMD), and the other half OVERvolt their CPUs to UNDERclock them to make them work somehow. This thread is the home of that second half, the losers. And, miraculously, these attempts to make this crap work void the warranty, so AMD doesn't even have to take their crap back. Casinos never lose!

Of course this can be corrected by BIOS updates (and will be, when AMD is tired of RMAs) by just raising the default voltages and/or cutting the turbo boost (together with the performance).

It can also be easily explained why the systems mostly BSOD or reboot at idle or under plain low-load tasks, and remain stable under burn-in. The problem is not overheating; the problem is the inability of a given crappy CPU to work stably at a given frequency with a given voltage (just the same as if you undervolt it too much). The higher the frequency, the greater the chance of a BSOD. A CPU fully loaded on all cores works at LOWER frequencies to stay within the TDP. But when you stop your burn-in and start to watch a video, just one or two cores (pre-heated by the previous burn-in) work, but they work at the MAXIMUM frequencies. And then: say hi to a BSOD or reboot.

Thanks for the detailed post, @Anzu34 !

It'd be great to get your specs, in the thread, in case anyone is actually collating that stuff.

 

Mine has been at the shop for maybe a month now, and they've updated the BIOS to one of the more recent ones (hopefully the latest) and they've seen zero issues (apparently - and I have no reason to distrust them - running my spec of many drives, and timespy at 1440p 60fps but rendering at 4k) whilst stressing it.

So, hopefully AMD have been silently 'fixing' this issue without ever acknowledging that it was ever a thing.

It's frustrating, to be sure, and I *will not* accept a system that doesn't at least operate at stock specs ... which is, frankly, all I want and need from it. Without meaning to sound too assumptive, of course.

What @Cmdr-ZiN says keeps ringing in my head, though ... it doesn't matter how good things can look ... this issue *KEEPS* coming back. So ... hopefully my issues aren't WHEA related, but I dunno.

---

What I *will* be doing, though, is testing it on an older version of Win 10 than you're all likely running, to see if that is a factor in the issue. When I get this back, the main OS that I run for gaming is one I purposefully keep at a previous version of Windows because of very borked functions in recent updates. I still keep security and driver patches updated, of course, where required ... but in addition to the borking there are a number of other things I can't stand. For example, Windows keeps jettisoning the security changes I make regarding the various ways it phones home.

Anyway, I dye cress ... because it's nicer in rainbow colours.

I'll report back when the box is back in my hands, mateys.

@eliotcole for some reason I don't get notified of thread updates anymore, but I did see the mention thanks.

My issue has been solved by a CPU replacement for who knows how long now. I'm not even on the latest BIOS, as I kept everything the same for a month to see if the new CPU worked with no changes. With everything working I didn't feel the need to upgrade the BIOS.

For others it will be a GFX card, or PSU, could also be ram or mobo, these are the likely candidates. You might find the faulty component works fine in another system. We're talking about slight instability. However I can tell you a different CPU can run rock solid at stock. If your CPU doesn't run rock solid at stock, don't try to tweak BIOS settings as you should RMA that thing.

The WHEA error is for the GFX card, but when everything is failing at once it's hard to say what caused what until you test things one at a time. You'll find there are multiple errors.

I doubt it's a Windows issue, I'm on the latest Windows 10 version without issues and no issues on the previous one or 2 either. Also I believe you did a fresh install.

I really feel like most of these issues are hardware issues, probably different hardware but there is a known issue with Ryzen CPUs. BIOS updates will improve it but if the latest update doesn't fix it, I wouldn't play with BIOS settings I'd just replace it.

I'd rule out GFX card and PSU then just swap the CPU.

Good luck I hope you solve it.

Just an update on my WHEA issues, but the warranty replacement 3080 GPU has so far completely fixed the issue even though they originally appeared to be CPU related in the event log.  I'll be back here if it dies again, but just an FYI on where I'm at.

@eliotcole 
- CPU : AMD Ryzen 5900x 
- MB : MSI B550 Tomahawk
- GPU : MSI Geforce RTX 3080 Ti Trio Gaming
- AIO : NZXT Kraken X73 360mm RGB
- RAM : Crucial Ballistix 2x16 GB 3600 MHz CL16
- PSU : Seasonic Prime PX-850, 850W Plus Platinum 

@Cmdr-ZiN 

Are you 100% sure the WHEA_UNCORRECTABLE_ERROR comes from the GPU? I've seen a lot of different answers.
What is the RMA processing time?
For me, playing with the BIOS meant doing a little tweak with the Curve Optimizer, nothing too fancy, and so far it has resolved everything. See my previous post above.
My OC and my GPU are stable; I've run Cinebench R23 5 times in a row. I have also overclocked my GPU, and after 5 Kombustor stress tests there are still no blue screens or crashes. My CPU at 100% load peaks around 81-82 °C, the GPU at 100% around 65-66 °C. I don't see a reason to RMA.

@C64T Thanks for the update, glad you sorted it

@Anzu34 Yeah, WHEA-Logger error ID 18 is graphics, but in my case the graphics system was failing due to the CPU failing. When one thing goes, everything goes.

Your error is different; yours is a more general hardware failure, which puts you in the same boat as me and most of us. You just need to identify the offending part and replace it. See this link, the advice is reasonable: https://www.makeuseof.com/tag/fix-whea-uncorrectable-error-windows-10/amp/

If you're happy with it then that's fine but keep in mind, if it doesn't run stable at stock then something is wrong. A BIOS update might improve it in the future or it may just degrade further when out of warranty, however that's your call.

Seeing as you actually got a bluescreen, you might have a slightly different cause. I'd try a different GFX card first, then a driver reinstall, uninstalling MSI Afterburner and resetting all OC settings back to factory. It can still be any of the same things it was for all of us. Good luck.

@Cmdr-ZiN I'm not sure your issue is solved. I had no problems running a 3700X + a Vega 56. I upgraded to a 5950X and suddenly got WHEA 18 errors every few hours. The graphics card was affected to the point where Windows would disable it.

I turned off XMP (I also have 3600 ram) and ran it at 1200.  No errors.  Tried manual ram speed tuning and the best I could get was the errors went down to one every 2 days.   I sold off my Vega 56 (going to upgrade in a few months anyway) and threw in my old R9 390P, and turned XMP back on.   I get the WHEA 18 errors now once a week.

I feel like @eliotcole most likely is on the right track.  The graphics card is just a red herring.