jselph17
Journeyman III

WHEA Errors on Ryzen 5900X

Hello, all.

I just built my computer after waiting six months for a 5900X to arrive. I've stress tested the memory, CPU, and GPU. None have reported errors. However, when I play Doom Eternal, for example, I get a BSOD after about an hour or so. This only happens during games. I've noticed that the WHEA errors seem to be common for the Ryzen 5000 series. Has anyone successfully fixed this issue? If so, what have you done to solve it?

I can RMA the processor if I must, but I'd rather not have to.

0 Likes
14 Replies
SPQR
Journeyman III

While it's difficult to be certain, this sounds like the same problem being experienced by a significant proportion of Ryzen 5000 users. The catastrophic instability exhibited by these CPUs at stock settings doesn't seem to occur much in artificial stress tests, manifesting instead during gaming workloads, general browsing or while the computer is idle. My 5900X crashed while I was on YouTube a couple of hours ago, and was crashing repeatedly while I was playing Doom Eternal. My WHEA errors are almost all confined to the same two APIC IDs, which fits the pattern of this problem. 
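For anyone who wants to check whether their own WHEA errors cluster on a couple of APIC IDs in the same way, here is a rough Python sketch. It assumes you have exported the WHEA-Logger events from Event Viewer as text and that each event message contains a line of the form `Processor APIC ID: N`. The function names `count_whea_by_apic_id` and `dominant_apic_ids` are made up for illustration and are not part of any tool.

```python
import re
from collections import Counter

# Assumption: each exported event message has a line like "Processor APIC ID: 10".
APIC_ID_PATTERN = re.compile(r"Processor APIC ID:\s*(\d+)")


def count_whea_by_apic_id(log_text):
    """Count how many times each APIC ID appears in the exported event text."""
    return Counter(int(m) for m in APIC_ID_PATTERN.findall(log_text))


def dominant_apic_ids(counts, share=0.8):
    """Return the most frequent APIC IDs that together cover `share` of all events.

    A short list suggests the errors are tied to specific cores;
    a long list suggests they are spread across the whole CPU.
    """
    total = sum(counts.values())
    picked = []
    covered = 0
    for apic_id, n in counts.most_common():
        if covered / total >= share:
            break
        picked.append(apic_id)
        covered += n
    return picked
```

If `dominant_apic_ids` returns only one or two IDs, the errors are concentrated on particular cores. If it returns many, they look more random, and the tally on its own does not tell you which component is at fault.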

So far, the only successful "fix" is to hamper your CPU performance by disabling stock features that are designed to be part of the normal operation of these CPUs (disable CPB and global C-states in the BIOS, or add a positive offset to some or all of the curve optimiser values). In other words, this isn't a fix and the problem seems terminal. If it doesn't work at stock settings right out of the box and needs its performance to be limited below the standard advertised performance, then it's broken and needs to be discarded and replaced.

 

Some people have RMA'd their broken CPUs and received replacement CPUs which are stable. But there's no guarantee this will be the same for every case. I'm in the process of RMAing my broken 5900X and playing the hope strategy. As for you, your options are: keep your faulty CPU and put up with the instability; adjust BIOS settings to restrict CPU performance to achieve stability; RMA and wait; switch to Intel. That last one might deserve some serious consideration. Say what you will about Intel and their overpriced, inefficient, low core count CPUs, but at least they test their products instead of releasing them in a broken state. My Intel CPU has never crashed, not even once. 

 

What an absolute mess this CPU release has been from AMD. There's no way they didn't know about this common issue (it took me less than a day to encounter this issue for the first time), so I'll never know why these CPUs were authorised to be released in this state. It's frightening to think they allowed people to buy such expensive products knowing there's a significant defect rate. 

 

That's it for this rant.


0 Likes

Thank you very much for the response. Yes, it is incredibly frustrating and disappointing. Like you, I can't imagine this error didn't come up in testing. All it takes for me is to leave a game up for less than an hour and I get a blue screen. Yesterday I left Doom Eternal on the menu screen and glanced at it every once in a while when I was eating. It took about fifteen minutes before my computer crashed with a blue screen.

I feel like a BIOS update is the most one should be expected to do to have their CPU perform at stock settings. Anything more and the CPU is defective. I'm going to try and return it for a refund from the place I bought it from and get another one. Let's hope that one works. If not, I'm going with Intel.

Do you mind letting me know how your new CPU performs? I'd really like to know if getting a new one fixes your issue.

Thanks!

0 Likes

It's worth noting that many posters, myself included, who have had these WHEA errors (typically event 18) with APIC IDs have had no luck changing any settings in the BIOS: CPB, PBO, XMP... none of it changes anything. In many cases, like my own, swapping the GPU for a different one makes the errors go away. Turning off Fast Boot within Windows also seems to have solved the issue for many people.

So, it seems the errors may or may not have anything to do with your CPU. I am on a 3700X, and with my 5700 XT GPU I get no errors at all and can happily game for hours on end. When I finally got a 6800 XT, I could not game at all and would constantly get WHEA event 18 Cache Hierarchy errors with some random APIC ID. Going back to the 5700 XT, the problem was gone. I did lots of tests with BIOS settings, configs, drivers, the whole shebang. Eventually I narrowed it down to my new GPU and sent my 6800 XT in on an RMA, and Gigabyte found multiple issues with the GPU.

I have read more posts saying the issue was the GPU than posts saying it was the CPU. But I do recall a few who RMA'd the CPU and claimed it fixed their issue. The biggest fix I have read about that was not hardware related was turning off Fast Boot in Windows.

Seems like a problem ripe for review for AMD and they need to figure out how to more aptly report what is going on with AMD systems that produces these errors from seemingly wildly different sources.  

Some even claim that changing their RAM or changing their PSU solves this. It's a crapshoot. At least in my case I had enough systems to more or less reasonably assume the problem was the GPU.

0 Likes

I would add that I have a 3700X and an Nvidia 2080 and I am still getting the WHEA failures. I've even RMA'd my power supply with no success.

0 Likes

It's been a week since I encountered these errors. I ended up sending the 5900X back for a refund and purchasing another from a different company. I haven't had a single blue screen in the week that I've had my new 5900X, which tells me it was the processor. Glad to finally have this fixed.

0 Likes
3DJF
Adept I

Have had constant WHEA errors since my machine arrived, after I waited 17 WEEKS for the system to have parts. After weeks of testing and going through EVERYTHING with the system builders, I had to find out in a f(*&ing forum that there is a fundamental flaw in CPB and PBO with 5000 chips.

 

I am NEVER buying another single f(*&ing piece of AMD **bleep**e for the rest of my f*&ing life.

 

bye AMD you c^%ts

0 Likes
trek
Elite

What you describe is a typical memory issue, because you get the BSOD after a longer period of gameplay. Gameplay is the best stress test of computer components, not individual component stress testing.

The issue is that memory warms up during gameplay and, depending on the memory die type, can malfunction. Some modules can malfunction at temperatures as low as 50°C.

It could also be another component, so you will have to provide additional cooling or run the memory at a lower clock and voltage and try again.

If the problem persists, you will have to swap individual components one at a time and retest: memory, graphics card, motherboard, and so on, as well as reseating the CPU.

0 Likes

I don’t think this is a memory issue as I’ve tried different ram kits and experienced the same BSOD. Everything has been reseated. Thanks. 

0 Likes

It is memory. I do not know what GPU you have or what temperatures you are seeing.

If you have a powerful card that blows hot air inside the case, it is grilling the memory modules. Open your case, point some additional fans at the memory modules, and it should be fine. Try some games and report back.

 

0 Likes

It's been almost a week since I encountered these errors. The problem was not memory. I refunded my 5900X and bought a new 5900X, which has since fixed the issue. I did not add any cooling or change anything in the system aside from the new processor. Again, the issue was not memory. Thanks. 

0 Likes

Great stuff man, I'm happy you managed to sort the problem out quickly. Something good came out of it, because we've now got even more evidence supporting this known issue (not that we needed it), and you ended up with a working product in the end. I really hope your retailer disposes of the defective CPU and doesn't sell it on to another customer, as the issue is quite technical and a store manager or retail staff may not understand or appreciate its significance. 

We also learned very important lessons about how CPUs can cause catastrophic instability through their voltage/frequency behaviour, and how AMD either doesn't test their CPUs or intentionally sells an unacceptably high quantity of defective units. I'm curious to know whether you'll be purchasing an AMD processor again at any point in the future! It might take Intel a few years to be competitive again in the CPU market, so I hope they'll have something nice for us in a few years' time.

0 Likes

I just hope the place I bought it from refunds me. They should. I also told them to test it with games because I could stress test it and the CPU worked fine. It was only when gaming that it would blue screen my PC. I hope this is the last of the issues with this build. I played Stellaris on the defective processor for around a week and a half and got no errors. I hope that's because the game just isn't that stressful and not indicative of the processor failing over time because I don't want to have to worry about this processor failing in a week or so.

Hmm, and that's a good question. I was really supportive of AMD (who doesn't love the underdog?) because they've been creating nice processors, but this specific instance really turned me off from them. I would say that if Intel can come out with a competitive product (many cores, frequency, pricing, thermals, etc.) I would probably go with Intel due to this instance and also I don't hear about many issues from their processors. I didn't get Zen 2, but I've heard this was a problem with that line of processors as well.

In the end, I'm just disappointed these processors are getting out to people. If AMD would come out and issue some type of release/commentary acknowledging the voltage issue and being willing to replace the defective processors of people who are suffering from this issue, I would feel a lot better about it. I know errors and defects occur because no product is perfect, but come on... An hour of playing Doom Eternal blue screened me? Seems like that could have been caught in a test. Oh, well. I just hope my current 5900X doesn't manifest the same problem in the future.

I'm interested to see what Intel can come out with in the coming years.

0 Likes

I hope your 5900X stays solid.

If you do start seeing the issue again, certainly post back.

In my case, I am still waiting on Gigabyte to send my new 6800 XT GPU. If I still have the problem, even though Gigabyte confirmed my card was in fact bad, I will be going for a CPU RMA next on my 3700X.

There is a difference in our issues, though. Your issues occurred after the system had been in use for a while (I think you said around an hour). Mine were consistently random and would happen within 1 to 10 minutes every time, even in the BIOS. As well, with my 5700 XT GPU, this does not happen at all and I can game for hours on end.

I think mine likely is the GPU. I guess I will know in the next few weeks.

Thanks for updating your thread.

0 Likes