PC Processors

technovixen
Journeyman III

WHEA: Cache Hierarchy Error Processor Core

I haven't had any issues until today, when two of these errors happened: one on APIC ID 9, the other on APIC ID 1. I was playing the new Marvel's Avengers game, which is why I think the errors are tied to it. Anyway, the game randomly crashed and my entire PC went down. Attached are the files from Event Viewer. The only thing I changed recently was upgrading from 16 GB of 2400 MHz RAM to 32 GB of 3200 MHz RAM. I stress tested the machine and it was fine. I also played all day today and it was fine until I played this game. I am using A-XMP profile 2 to get the RAM to run at 3200 MHz.

As of writing this I am testing it again with Prime95. It has passed every test so far and has been going for 45 minutes now. HWiNFO during the stress testing does not seem to indicate that anything is wrong.

Edit:
Sorry, I've been busy with class and have not gotten around to replying. I am unable to reproduce the issue at all. Like I said, I ran Prime95 overnight and nothing happened. After that I went right into a 3-hour session of Marvel's Avengers, and again nothing. It was a one-time thing. So I am at a loss on what to do and a tad worried.
I did update the BIOS to the latest version for my motherboard, which is an MSI B450 Gaming Plus Max. Maybe the BIOS update fixed it?
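
For anyone who wants to keep track of which logical processors the errors land on, here is a minimal Python sketch. It assumes you have pasted the WHEA-Logger event text from Event Viewer into a string, and that the text follows the usual "Error Type" / "Processor APIC ID" layout. The sample text and the `tally_whea` helper are illustrative only, so adjust the pattern if your entries look different.

```python
import re
from collections import Counter

# Assumed layout of the WHEA-Logger message text; the two entries mirror the
# APIC IDs (9 and 1) described in the post, not a real exported log.
SAMPLE = """\
A fatal hardware error has occurred.

Reported by component: Processor Core
Error Source: Machine Check Exception
Error Type: Cache Hierarchy Error
Processor APIC ID: 9

A fatal hardware error has occurred.

Reported by component: Processor Core
Error Source: Machine Check Exception
Error Type: Cache Hierarchy Error
Processor APIC ID: 1
"""

# Capture the error type and the APIC ID on the line that follows it.
PATTERN = re.compile(
    r"Error Type:\s*(?P<etype>.+?)\s*\n\s*Processor APIC ID:\s*(?P<apic>\d+)"
)


def tally_whea(text):
    """Count (error type, APIC ID) pairs found in pasted event text."""
    counts = Counter()
    for match in PATTERN.finditer(text):
        counts[(match.group("etype"), int(match.group("apic")))] += 1
    return counts


if __name__ == "__main__":
    for (etype, apic), n in sorted(tally_whea(SAMPLE).items()):
        print(f"{etype} on APIC ID {apic}: {n}")
```

If the counts pile up on one or two APIC IDs over time, that at least hints at specific cores rather than something spread evenly across the chip.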
0 Likes
69 Replies
mstfbsrn980
Grandmaster

Capture.PNG

The game crashes because your motherboard is not applying the required core voltage to your processor.

Solution:
+ Increase the core voltage manually in the BIOS.
+ Manually lower the CPU multiplier in the BIOS.
+ Have the motherboard replaced at a computer store.

If you are overclocking your processor, you must set a stable CPU core voltage that suits your processor and the VRM quality of your motherboard. If you are not overclocking, you should still set it manually, because your system is not stable at factory settings.

Good luck...

I'm not overclocking the CPU at all, just running A-XMP for the RAM. And I just ran Prime95 for 7 hours and it passed every test. So I don't see how it is a voltage issue while in game? Especially since it doesn't happen when I play other demanding games.

If your BIOS hits the TDP limit as soon as Prime95 starts, the Prime95 test will appear to pass. That does not prove your processor is stable. The clock frequency while gaming is probably higher than the frequency during the Prime95 test, and so the game crashes.

0 Likes

I have basically the same issue, and I was thinking it was a defective CPU at first.  Then I started thinking it could be an issue with power delivery/voltage droop from the motherboard during heavier AVX/AVX2 workloads.  But if that were the case, then why would this cache hierarchy error be triggered by a game like Marvel's Avengers but somehow not by a torture test like Prime95?  That doesn't seem to make sense, so I'm getting worried that us Event 18 people have defective CPUs.

If it's the motherboard rather than the CPU, I'd like to test that somehow.  I have a 3950X running in a B550 Taichi (ASRock) with the latest BIOS.  I wanted to keep it at stock, or I would be happier to try your suggestion of upping the voltage/applying an overclock.

0 Likes

Your processor is unlikely to be faulty. I explained above why the motherboard problem does not show up in Prime95.

For now, you don't need the extra performance you would get from overclocking a Ryzen 3000 series processor.
Reset the BIOS to factory settings. Doing so will reduce your chances of getting errors.
If you still get the CPU core error with the BIOS at factory settings, you need to raise the core voltage.
If that does not solve the problem, you should exercise your consumer rights for the motherboard.

0 Likes

How easy is it to duplicate this issue?

If the only WHEA/MCE events you see are fatal errors (crashes or BSODs) and they only happen during Marvel's Avengers gameplay... There's something really weird at play.

The problem with stock/Precision Boost Ryzen 3rd gen is that the voltage control is very complex. The CPU can even compensate for whatever offset you set by changing the actual voltage applied to the CCX (I think it's shown to the right of vCore in Ryzen Master). So what mstfbsrn980 says is true: Prime95 results in an even all-core clock that should be pretty reasonable (4 GHz or less) at around 1.3 V vCore, possibly a little less for a 3950X.

The crashes might only occur when a specific core gets turboed too high for its voltage at that specific time. All core loads will never turbo very high.

Or, it's just that The Avengers is doing something really weird that causes some computers to hard crash in a situation where the game should just stop.

You could try running a gimped "no-turbo" setup with a fixed manual voltage of 1.3 V (1.2 V should be fine for a 3950X) and the multiplier at 36. My guess is that setup won't crash. If it does, I see only the game as a possible culprit.
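
To put numbers on the multiplier settings, here is a small Python sketch of the arithmetic. It assumes the reference clock (BCLK) is at its nominal 100 MHz; the `core_clock_ghz` helper is hypothetical, and the ratios are just ones that come up in this thread.

```python
# Assumption: the reference clock (BCLK) sits at its nominal 100 MHz.
BCLK_MHZ = 100.0


def core_clock_ghz(multiplier, bclk_mhz=BCLK_MHZ):
    """Return the fixed core clock in GHz for a given CPU core ratio."""
    return multiplier * bclk_mhz / 1000.0


if __name__ == "__main__":
    # Ratios mentioned in this thread; compare them with the CPU's rated boost.
    for ratio in (36, 38, 43):
        print(f"ratio {ratio} -> {core_clock_ghz(ratio):.2f} GHz")
```

With a fixed ratio every core is held at that one clock, so no single core ever gets pushed up to the higher single-core boost clocks.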

0 Likes

The problem you are experiencing has nothing to do with the RAM. You also know the voltage at which the game crashes. Find a manual vcore value higher than the vcore you see in the game and set it in the BIOS. This is what you can do. Good luck...

0 Likes

In my specific case, just playing with vCore and no other settings doesn't help. One of the cores is slightly unstable when it turbos, that's it.

The only way to fix it would be to set a static, possibly per-CCX overclock and make sure the "faulty" core never goes above a certain frequency. I haven't kept it that way since people say it voids the warranty (I did try it with a 38 multiplier and no load, and it does fix my problem).

Other poster might have a different issue.

0 Likes

If a manual vcore does not solve the problem, changing the VRM settings may. You should raise the VRM levels in the BIOS (together with the increased manual vcore).

0 Likes

The solution of trying to manually tune or stabilize Ryzen 3000 in the BIOS doesn't exactly help those of us that just want to run the CPU @ stock settings, especially when many users don't have to do so for their CPU to work properly.  If it's something which can't be fixed with a BIOS update and there is nothing physically wrong with our motherboards, then it sure seems like the CPU might be worthy of an RMA.

I'd prefer that not to be the case.  I'm trying to understand the issue so I don't have to RMA either the CPU or motherboard unnecessarily, but I'm definitely still leaning towards the CPU at the moment.

0 Likes

Yes, you are right. But I think a different CPU with the same part number will not solve your problem.

0 Likes

I suppose I have to hope you are wrong. Why shouldn't a new CPU solve the problem (even with the same part number) if the issue is a faulty core or a problem with the cache?  I have seen numerous forum posts in the last few days where a processor RMA completely resolved the issue, dating back several years and affecting more than just AMD CPUs.  It seems like the mobo expects the CPU to perform a certain way and when it can't, it results in this error, so I presume that means the CPU isn't able to perform up to specification.

0 Likes

+ The problem might be the motherboard VRM.
+ The problem might be the motherboard BIOS.
+ The problem might be the PSU. The 12 V CPU supply may lose stability under 12 V GPU load.

+ Replacing the processor may fix the problem, but after a while the same problem may show up on the replacement too. In my case, a new processor ran fine at very low voltages at first, and after a month I had to increase the voltage.

I am sure the processor you consider broken would work well in a different, good system. But does that apply to yours? I'm not sure. Goodbye...

0 Likes

mstfbsrn980 wrote:

+ The problem might be the motherboard VRM.
+ The problem might be the motherboard BIOS.
+ The problem might be the PSU. The 12 V CPU supply may lose stability under 12 V GPU load.

+ Replacing the processor may fix the problem, but after a while the same problem may show up on the replacement too. In my case, a new processor ran fine at very low voltages at first, and after a month I had to increase the voltage.

I am sure the processor you consider broken would work well in a different, good system. But does that apply to yours? I'm not sure. Goodbye...

Hard to say, nobody posts rig specs

I have a high end PSU, Corsair HX1000i

0 Likes

I would rule out the VRM, especially if you tried all the VRM options such as having all the phases active, upping the load line calibration level, and maybe increasing the switching frequency to 500 kHz.

If you say you have one of the higher end B550, it's probably not the VRM.

Now for the PSU: even if it's crappy and has bad 12V stability, the VRM just has to work harder to compensate. But it can and will do that. You can also monitor the voltage drop under load but playing a game on a 3950X is not exactly going overboard with current.

IMO the only motherboard-related possible culprit is the BIOS. You could just try flashing back very old ones and see how that goes.
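
If you do want to look at the 12 V readings under load, a minimal Python sketch along these lines could summarize a logged CSV. The column name `rail_12v`, the sample values, and the `rail_summary` helper are all made up for illustration (real monitoring tools use their own headers), and the ±5 % window is the commonly cited ATX tolerance for the 12 V rail.

```python
import csv
import io

NOMINAL_V = 12.0
TOLERANCE = 0.05  # commonly cited ATX tolerance for the 12 V rail (+/-5 %)

# Hypothetical log excerpt with invented readings; real tools name columns differently.
SAMPLE_LOG = """\
time,rail_12v
0,12.10
1,12.05
2,11.90
3,11.95
"""


def rail_summary(csv_text, column="rail_12v"):
    """Return (lowest, highest, within_tolerance) for the chosen voltage column."""
    rows = csv.DictReader(io.StringIO(csv_text))
    readings = [float(row[column]) for row in rows]
    low, high = min(readings), max(readings)
    lower_bound = NOMINAL_V * (1 - TOLERANCE)
    upper_bound = NOMINAL_V * (1 + TOLERANCE)
    return low, high, lower_bound <= low and high <= upper_bound


if __name__ == "__main__":
    low, high, ok = rail_summary(SAMPLE_LOG)
    print(f"12 V rail: min {low:.2f} V, max {high:.2f} V, within tolerance: {ok}")
```

Software sensor readings are only approximate, so treat a result like this as a rough sanity check rather than a measurement.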

0 Likes

Please post in your own thread. I was writing to you thinking you were the one with the problem. People wouldn't be having problems if everything worked the way you reason. There is no point in discussing this with someone who thinks a new microprocessor is faulty and who calls a CPU core error a system RAM failure.

Write in your own thread, not to me...

0 Likes

I did not assert it was problem with RAM, nor the GPU or PSU.  I just don't think it is the motherboard either, although I am tempted to wait one more BIOS update before starting the RMA (flashing back to older BIOS is an okay suggestion).  It is unlikely to be the VRMs since I have a rather high end board, and I also have a 1200w platinum EVGA PSU which is working fine.  I've also tested my RAM, but once again I didn't think it was an issue with the RAM.

The OP and I seem to have the same problem.  I am still not convinced this is a problem with the motherboard settings or BIOS, will attempt to test with a different chip.  Goodbye.

0 Likes

If you have to use your system with similar errors after RMA, lower the CPU core ratio a little. It will work very stably. This is a ridiculous way for a new and powerful system, but if you can't find a solution, you will have no other choice. Excuse me, I thought you wrote the other person's answers. Good-bye to you... 

0 Likes

Yeah I'd definitely prefer not to overclock this CPU, but I'd like to underclock it even less lol!  Even if I had a cheaper chip I'd feel the same way, it should be able to maintain stock speeds at stock voltages.  I still hope to try my chip in a different mobo and/or a different chip in my mobo (and maybe wait for another BIOS update) before starting any RMA.

I'd just like to thank you mstfbsrn980 for trying to recommend potential solutions, or at least workarounds.  I've read a lot of forum posts about similar issues in these last few days, and your recommendations are pretty consistent with most of the good advice I've seen on the subject.

mstfbsrn980 wrote:

If manual vcore does not solve the problem, changing the VRM settings may solve the problem. You should increase the VRM levels (with increased manual vcore) with the BIOS.

The BIOS defaults are suitable for ALL Ryzen processors

You are right in general, but there are exceptions. A limited number of users run into problems, as happened to me before. You cannot pair the lowest-quality Ryzen motherboard with the highest-end Ryzen processor without configuring the BIOS. So... you answered with a single short sentence, but I don't think one sentence covers it. I also wrote that answer for the person I thought had the problem, so I sent it to the wrong person.

0 Likes

That is why I have an X570.

0 Likes

I missed your reply here but you asked a really good question so I wanted to answer.

"How easy is it to duplicate this issue?"

For me, it isn't easy to duplicate this issue as it just seems random, and it isn't only in Marvel's Avengers (but it is easiest to provoke with that title for me).  It's a pretty new game that apparently has some Intel-specific optimizations, so encountering issues with that application most frequently led me to do some more rigorous stability testing with what I hoped were the relevant instruction sets in Prime95.  The issues don't persist in Prime95 though, and if it were because the game supports AVX-512 or something then you'd think all Ryzen chips would have similar issues but they don't.  Before installing this game, problems happened very infrequently and seemingly at random, still doesn't happen very often at all.  I had probably seen the issue in Blender most often before, which I believe uses AVX2 instructions.  Of course I can understand if they are still working out some kinks when it comes to supporting some formerly Intel-specific instruction sets, I just want to make sure my CPU is alright.

Being that I am gaming at all, I certainly don't want to run a gimped setup if I don't have to lol!  I get what you're saying about doing it to diagnose, but I've already diagnosed more than any end-user should have to.  I'm sure you wouldn't call gimping the setup a fix either, no one should have to run at below stock clocks on a pristine CPU as a workaround.  My plan is to see if the next BIOS update will fix it, and I'm hopeful since it's still a pretty recently released chipset. I'd also like to get my hands on a different motherboard (just in case) to see if the issue persists that way, but if everything else fails I will probably start an RMA for the CPU.  Since the OP had a basically identical issue from what I gathered, I'd suggest that they try to RMA their CPU too if this BIOS update doesn't end up fixing the issue.  Their most recent edit says they've had no issues since updating their BIOS though, so even though it was a really intermittent issue that is some reason to be hopeful

0 Likes
robertbruce
Journeyman III

Are you getting any non-fatal cache hierarchy errors? Those are mentioned as "corrected" and classified as Warnings in the Windows "System" event log.

Also, can you give us the exact model of RAM you're using?

0 Likes

It says "A fatal hardware error has occurred" right there in the .evtx file she uploaded.

I have basically the same issue and my RAM is G.Skill Flare X 3200 MHz @ CL14.

0 Likes

So you don't have the "corrected" WHEA 19 warning events; you only get the errors, which cause a system crash.

I guess that's worse than what I have, although I don't know what my warnings could evolve into.

From what I understand, these errors happen when the unstable core is boosted to a frequency it just can't handle at that voltage. That is why they don't show up in stability tests, which peg all cores at a turbo clock much lower than what a 1-4 core load will ask for.

I initially thought it only happened at idle because of that. But no, if the load applies a high turbo to the faulty core, errors will pop up in the logs.

That's my current theory anyway, though it seems erratic. I tried every single memory configuration, including a single stick of slow DDR4 that is on my motherboard's QVL. So I'm pretty sure these cache errors are not linked to memory.

I'm slowly thinking this just warrants RMAing the CPU.

0 Likes

I have the scarier sounding fatal errors (event 18) like the OP, but both the issues seem indicative of at least a possible hardware failure.

Mstfbsrn980's response gives me hope that it could just be an issue with the motherboard.  I thought like they said that it might be a power delivery issue, but it could just as well be defective cores.  I don't want to RMA my CPU either, but I've found a lot of forum posts where that was the ultimate resolution to the issue.

0 Likes

Part of the problem is using a B450 instead of an X570 board for a Ryzen 3000 series CPU. I use a lower-cost R5 3600 on an X570 and I do tons of brutal work on the machine. The nature of the work varies from day to day.

I use my RTX 2080 for decoding and encoding video, as the card beats my CPU at that.

I use Avid quite a bit which is also demanding when dealing with larger pieces.

0 Likes

I'm using a B550 board, not a B450 or X470 board, yet the issue is the same.  It's a very high-end B550 board too, or I'd want to just trust mstfbsrn980 about it being the board rather than the CPU.

0 Likes

sideshowbob wrote:

I'm using a B550 board, not a B450 or X470 board, yet the issue is the same.  It's a very high-end B550 board too, or I'd want to just trust mstfbsrn980 about it being the board rather than the CPU.

B550 needs a 3000 series CPU 

I need full hardware specs

0 Likes

I wasn't the OP but I have what seems to be an identical issue.  My system specs are as follows:

CPU: R9 3950x

GPU: Sapphire Nitro+ 5700 XT

Mobo: B550 Taichi (AsRock)

RAM: G.Skill Flare X 3200/c14

PSU: EVGA 1200w P2

0 Likes

For the R9 3950X I would be using an X570 motherboard which is far more suitable for the top CPU.

0 Likes

You might assume so, but if you compare the X570 Taichi vs B550 Taichi, you'll see that the B550 actually has superior power delivery/VRMs.  This proves true across brands (see the Asus Strix F, Gigabyte Aorus Master & MSI Gaming Pro Carbon for example), the main advantage of X570 over B550 is all Gen4 PCIe through the [usually] actively cooled chipset.

0 Likes

I have a MSI X570-A PRO and it works with lots of processors I have tried on it.

MSI boards seem to be less flaky than some others I have used in the past.

0 Likes

I do like MSI even considering all the grief they've been getting from tech-youtubers recently, mainly because they handled an RMA for a GPU with a clicky fan very quickly for me.

I don't doubt the quality of X570 boards at all.  But if your only goal was to "futureproof" for the 4950X/5950X, which will presumably be a bit more power hungry, then you'd actually be better off with a B550 board over an X570 board.  If the model name is identical, then the B550 variant always has better VRMs.  They were just engineered later and therefore better (much better in this case), even though it's supposed to be the "cheap" chipset.

B550 VRM Tier List: B550 VRM DB sheet (Ver 1.5) - Google Sheets 

X570 VRM Tier List: AM4 Vcore VRM Ratings v1.4 (2019-11-07) - Google Sheets (goes back to X370/B350)

0 Likes

BIOS problems were a big headache but the latest one seems to be less problematic

0 Likes

I guess the chipset adjusts the CPU voltage. I will repeat what I wrote at first: there is no need to look for a problem with the CPU.

0 Likes
mstfbsrn980
Grandmaster

For the specs...

CPU: R9 3950x

GPU: Sapphire Nitro+ 5700 XT

Mobo: B550 Taichi (AsRock)

RAM: G.Skill Flare X 3200/c14

PSU: EVGA 1200w P2

Capture.PNG

Capture2.PNG

Set the CPU core ratio to a multiplier of 43 in the BIOS.
Set the CPU core voltage manually to 1.45 V in the BIOS.

Your BIOS may be trying to turbo your processor higher than normal. This could be due to changes you made in the BIOS. It needs to be examined in great detail.

 


0 Likes

Edit: You cannot go above 4.7 GHz with this processor. If you try to run all the cores at 4.7 GHz at the same time, it can consume 200 watts of power. If it does not run stably at factory settings, you need to configure it manually and set a hard TDP limit that suits your cooler. This processor is aimed at professional users, and I think it is best to learn how to tune it.

0 Likes