I purchased a brand new Ryzen 7 3700X and successfully installed it in a new build. From the first day I have been getting BSODs with the code WHEA_UNCORRECTABLE_ERROR, and I have been checking every hardware part to find the cause. It turns out the processor was the problem: I swapped it for a Ryzen 5 and have had no errors since. Drivers have been checked. The hardware has been checked and looks brand new. Is this something covered by the warranty? Also, how long does it take to receive the replacement?
If you think it is defective, you should exercise your consumer rights. However, if I were you, I would first reset the BIOS to factory settings and set the CPU Core Ratio value to 35. This underclocks the CPU so you can test whether the processor is stable. If it is stable at that setting, the motherboard is not applying the correct core voltage, and this problem may happen to you again in the future.
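For reference, here is a rough sketch of what a core ratio of 35 works out to. It assumes the usual 100 MHz reference clock (BCLK); the function name is only for illustration, and the base and boost figures are AMD's published clocks for the 3700X.

```python
# Assumption: 100 MHz reference clock (BCLK), the usual default on AM4 boards.
BCLK_MHZ = 100.0

# Published Ryzen 7 3700X clocks, used here only for comparison.
BASE_CLOCK_MHZ = 3600.0
MAX_BOOST_MHZ = 4400.0


def core_clock_mhz(ratio: float, bclk_mhz: float = BCLK_MHZ) -> float:
    """Return the fixed core clock for a given CPU core ratio (multiplier)."""
    return ratio * bclk_mhz


test_clock = core_clock_mhz(35)
print(f"Ratio 35 -> {test_clock:.0f} MHz")
print(f"Below base clock by {BASE_CLOCK_MHZ - test_clock:.0f} MHz")
print(f"Below max boost by {MAX_BOOST_MHZ - test_clock:.0f} MHz")
```

So a ratio of 35 pins the cores at 3.5 GHz, which is 100 MHz under the base clock and 900 MHz under the maximum boost. That makes it a mild underclock for a stability test, not a permanent setting.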
You are not alone. Many have had this issue, and a good chunk of them find it is CPU related. If you are still in your return period, see if you can exchange it with the retailer.
I think a while back when I had a cpu rma'd with AMD it took about 4-6 weeks.
I would never tweak anything away from defaults to get a new CPU stable. If you have to do that then you have a board, cpu, memory or power issue. If it isn't running right at defaults I would question the longevity of the chip under any altered settings.
No reason to do that when you have the full right to get another CPU.
If you need the RMA form link here it is: https://www.amd.com/en/support/kb/warranty-information/rma-form
OP, feel free to follow whatever advice you like. I would suggest talking with the support department of your motherboard maker and taking their professional opinion. Don't get sucked down the rabbit hole of listening to users who would have you changing everything under the sun in an attempt to get it to work. Sure, if you cripple the performance of the CPU, you may get it more stable. That is possible. Sure, if you never use PBO or XMP it may be more stable. To me, I want my stuff to work at defaults as advertised, and when buying new there is no reason to settle for less than that.
You can also talk to AMD support about this. They VERY often suggest an RMA of the CPU for this issue, so no, it is not a slim chance. https://www.amd.com/en/support/contact-email-form
Do not attempt any RMA on your hardware without running tests first. There are also many users on these forum pages whose results did not change after a CPU RMA. You need to find the cause of the problem. In addition, the probability of a CPU leaving the factory faulty is extremely slim.
No, it is not slim, and it is the most likely culprit. This is not opinion; it is based on tons of threads at this point from users having this same issue. You, however, don't agree with any of them even when they tell you it is the CPU.
------------------------------------------------------------
That is from Intel docs...
How to fix it:
The error WHEA_UNCORRECTABLE_ERROR means that your system shut down to protect itself from data loss. The error happened due to a hardware issue.
To fix it, try the following:
Get the latest updates with Windows Update. Go to Settings --> Update & security --> Windows Update, and then select Check for updates. Reboot system.
If that does not work, try the following:
Set the BIOS to its default values. Reboot system.
Start Windows in Safe Mode, and check if there are any driver errors in the device manager.
Check if there are any hardware issues.
Try to boot with a minimal hardware configuration (for example, one DRAM module and fewer hardware components in your system).
Try the Windows 10 app called Windows Memory Diagnostic to check for memory issues.
------------------------------------------------------------
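Alongside the checks in the Intel list above, it can help to see what Windows itself logged before swapping parts. This is a minimal sketch, assuming a Windows machine with the built-in wevtutil tool on the PATH: it lists the most recent WHEA-Logger entries from the System event log. The helper names and the default of 20 events are made up for illustration.

```python
import platform
import subprocess
from typing import List

# Event provider that records WHEA hardware error events in the System log.
WHEA_PROVIDER = "Microsoft-Windows-WHEA-Logger"


def build_whea_query(max_events: int = 20) -> List[str]:
    """Build a wevtutil command that lists recent WHEA-Logger events."""
    xpath = f"*[System[Provider[@Name='{WHEA_PROVIDER}']]]"
    return [
        "wevtutil", "qe", "System",  # query events from the System log
        f"/q:{xpath}",               # keep only WHEA-Logger entries
        f"/c:{max_events}",          # limit the number of events returned
        "/rd:true",                  # newest events first
        "/f:text",                   # plain-text output
    ]


def recent_whea_events(max_events: int = 20) -> str:
    """Run the query and return the raw text output (Windows only)."""
    if platform.system() != "Windows":
        raise RuntimeError("wevtutil is only available on Windows")
    result = subprocess.run(
        build_whea_query(max_events),
        capture_output=True, text=True, check=True,
    )
    return result.stdout


if __name__ == "__main__":
    if platform.system() == "Windows":
        print(recent_whea_events())
    else:
        # Elsewhere, just show the command that would be run.
        print(" ".join(build_whea_query()))
```

The output is only a starting point: the logged events show when and how often the errors occur, which is useful detail to hand to AMD or motherboard support, but on their own they do not prove which part is at fault.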
That is from Easeus...
What Are the Causes of the WHEA Uncorrectable Error
What causes the WHEA uncorrectable error? The WHEA uncorrectable error is one type of BSOD error. This error is usually caused by one of five things:
BCD error
MBR or system file error
Driver issue of hardware
Faulty hard drive, or Processor
Low CPU voltage or overheating CPU
------------------------------------------------------------
We do not know your system components, how often you get the error message, or what triggers it. A BIOS setting that is too low is very likely to cause this error, so changing the CPU probably won't fix the problem.
If your processor works properly at a lower speed, it will run comfortably at factory clocks once the BIOS is set up suitably.
OP you may find these threads helpful. They are full of people with your error message. Many different possibilities of the cause are discussed and things they tried.
Those thread links:
This reddit thread has many users involved where replacing the CPU seems to fix the issue.
3800X random reboots with WHEA-Logger event ID 18 : AMDHelp
another
Processors are produced like cakes baked on a tray. The ones with good silicon quality are released to the market with an X or K suffix, so it is very unlikely that the silicon quality of these processors is low. They are also tested at a proper voltage before release. In particular, WHEA errors happen when the processor's voltage curve, which defines what voltage is applied at what speed, is not smooth. Faulty RAM can cause them. A corrupted disk can cause them too.
The fact that people on other forum pages think their CPUs came out of the factory faulty does not make it reasonable to jump straight to CPU failure without knowing anything about this system.
Again follow any advice you like.
Yes, it can be caused by lots of things. However, in the past quarter there has been a big uptick in this issue, mostly among people with new systems. So those people who worked with their motherboard manufacturer's support department and AMD support, and were told to change the CPU, must all be wrong and you are right.
There is no doubt it is a vague error that can be lots of things, including but not limited to malware, file system corruption, memory issues, power issues, motherboard issues, GPU issues and CPU issues.
However, lately there have been a lot of CPU issues across the tech forums on the net, and far more often than not the CPU is the culprit. It doesn't hurt to check all things.
Now he picks on the reference links that I supplied to the OP, not to him. I didn't even participate in any of those threads. Yet anything I touch he says is wrong.
OP, please understand he has an issue with me and everything I say. Although it is not just me; he tells others they are wrong all the time too. Like I said, don't take my word for it. Talk to the support departments for your hardware. They are the professionals and I am quite sure they will be happy to help you with this issue.
If you want to see which one of us in these forums offers users the most constructive help, I suggest you look at our profiles. He has zero customer awarded badges, while I am in 9th place for most badges out of thousands of users on this site, and he has 2 expert badges compared to my 59.
I am not saying any of this to brag. I am just getting really sick of him picking a fight as if only his advice were plausible. Rarely, and I mean rarely, when you go back through the threads this user is involved in, do users come back saying his advice was correct.
I'm not fighting. You are the one who fights with yourself and shouts everywhere like my little boy.
You are the one who cannot tolerate even a small test suggestion to check the state of the system, and who cannot fix such an error. I can stabilize the system on which the OP gets errors, even with a 200 MHz OC, with a few BIOS changes. The biggest problem for AMD is that the factory settings do not work properly. It is also a mistake to make firm judgments about a system you do not know, instead of showing computer users how to use their systems more professionally. Don't do literature IN MY LANGUAGE. If you had looked in the mirror before trying to teach others, you would have become a professional in this computer world.
You say this, yet I have never once called anything you said wrong. Every time we get into this, it is you that starts it by badgering what I had to say. You are the one who constantly calls others' advice wrong. The OP can follow whatever advice they like.
Dude, you have some serious issues, and I feel sorry you feel that way about your little boy. I would never talk about my kid like that in a public forum even if I did have an issue with him, and I don't. My kid, since he could talk, has been mild mannered and rational. All I can say is, if your kid yells, where do you think he gets that? Children are products of their environment.
I suggested the customer talk to the support departments of their hardware's maker. Not sure how that can be wrong.
You still think I'm fighting with you. I'm not fighting with you the way I would with my little boy! You thought I was. You think the CPU is corrupt, but it's not! I don't love you just because you look like my little boy. Unfortunately, this is the only difference between you and my little boy.
I have no idea if the CPU is corrupt. It certainly could be or it could be something else. Just based on what has been happening lately with them I would think it is a decent chance.
The only one certain of being right is you, as you just proved once again by saying it's NOT. You don't know any more than I or anyone else whether it is or isn't. Are you psychic? How would you know what I look like when we have only written to each other? I certainly look far from a little boy. Wow, you have ridiculous arguments. And how are they helping the OP exactly?
I'm not really fighting, but you're starting to bother me. Instead of finding the source of the problem and its solution, you make comments like an expert. You are the main person who suggests consumer rights for the CPU without knowing anything, and that does not help the thread owner.
People who get this error message can try the following...
+Decreasing the processor multiplier.
This change makes the processor run more stably. Reducing the speed of a microprocessor lowers temperatures and improves stability.
+Increasing the processor voltage.
Increasing the CPU voltage adds OC headroom, and that extra headroom improves stability when the speed is left unchanged.
+Increasing the level of the motherboard VRM.
Increasing the VRM level improves stability by smoothing out the voltage curve.
+Changing the processor TDP in the BIOS.
This error message can occur when the motherboard tries to reduce the CPU speed as the processor approaches its TDP limit. For a similar reason, increasing the GPU power limit worked well for you. But you do not even know the reason for this, and you recommend the same OC method for every GPU problem.
+Disabling the CPU C-State feature in the BIOS.
Some PSUs cannot keep the system stable when the CPU tries to run at very low core voltages. So this feature lets the system consume less power, but it may reduce stability.
... ... ...
The motherboard manufacturers' RMA recommendation for the CPUs also does not indicate that the CPUs are bad. Nor does it show that the hardware they produce is broken. Every processor and VRM loses lifespan and stability as it runs and gets hot. When a processor stops running stably, this does not mean that the processor is broken. Any system user may see this error message from time to time, depending on the CPU's age and silicon quality.
Yes they most certainly can try any of the things you suggest. I never said they couldn't.
"The motherboard manufacturers' RMA recommendation for the CPUs also does not indicate that the CPUs are bad. It does not show that the hardwares they produce are also broken. Each processor and VRM loses its lifespan and stabilization as it runs and gets hot. When a processor stops working stable, this does not indicate that the processor is broken. Every system user may receive this error message briefly depending on the CPU age and the CPU silicon quality."
Care to share where you got this information? Surely you have a link? Plus, the only motherboard maker's recommendation that would be relevant here is the one from the company that made the OP's motherboard, and I don't think they ever named the maker. I know of no governing body that oversees motherboard makers and gives recommendations. I did several searches and could not find one. Not saying there isn't one, but I couldn't find it and neither could Google, Bing and DuckDuckGo.
Absolutely, many issues, especially power related ones, can present in ways that make a good CPU seem bad. It can, however, be a bad CPU too. This is the point that is lost on you and that you just won't concede.
I would not recommend that anyone operate a new system at anything other than defaults until it runs right at defaults.
Operating at anything other than defaults, depending on what you change, often invalidates the processor and/or motherboard warranty.
If it isn't working right at defaults and is new, I would exchange the product with the retailer or RMA it, not ruin my right to do so. Or I would at least first talk with the support departments of the motherboard maker and the CPU maker before changing anything. You, however, seem to have an issue with that too, as if following your advice were the only course.
There is no point arguing with you and that is what you are continuing to do. You have your view and zero ability to see any other opinion no matter how much common sense it makes.
Pieces of the cake with high silicon quality are used for higher-end processors, while pieces with lower silicon quality are released as lower models. This is why AMD releases the R3s late: microprocessor manufacturers release the high models first and the low models later. But the fact that X-labeled processors crash more frequently gave me the impression that motherboard BIOSes hold the processors at higher speeds for longer bursts, and therefore increase the possibility of crashes. It is very difficult to say more without detailed examination and long-term use. These error messages suggest that the CPUs may effectively be shipped overclocked. You may say that AMD produces OC CPUs. But it doesn't deserve more than that.
Edit: It is a bit of luck to buy a processor with very good silicon quality. But microprocessor manufacturers take this chance into account. So each processor's OC potential is different. This is a bit of luck.
Edit: Some people on this forum site have shared that selecting an XMP profile increases processor voltages. This gives the impression that some motherboards are likely to apply lower core voltages depending on certain settings, and that the CPU-RAM controls are not working properly. So saying the CPUs are faulty is easy, and very funny. I can clearly write to you that I can run the CPU that you called corrupt with at least a 200 MHz overclock. I will not write anything else.
I agree there are some boards that, with overclocks, XMP and/or PBO enabled, show voltages other than what the reference spec calls for. This has been documented many times over by several tech sites. Many of those boards, though, have been fixed by BIOS updates in the past couple of years. No idea if they are all fixed or not. I am far from an expert on this. This is why I rely on the support advice of the companies that make the products. If they want to RMA the products, I am in no place to argue with them. I would hope they know their products better than I do.