Hi everyone. I am getting WHEA-Logger Event ID 18 (Reported by component: Processor Core, Error Source: Machine Check Exception, Error Type: Cache Hierarchy Error).
Is this a hardware failure of some sort, or just a Windows software/driver issue? How do I identify which part is causing the problem? I don't have a spare PC to test the parts in.
Apparently this is happening to a lot of people. Searching the web, you will find lots of different fixes. However, in most of the cases I have seen involving a Ryzen CPU with an Nvidia GPU, users report back that a CPU replacement ended up fixing it. That doesn't mean it is your issue.
I found a couple of good threads where many people discuss the issue and what they have tried. Maybe they will help you.
Your motherboard vendor's support department can also be a good resource for isolating what the issue might be.
Those thread links:
This Reddit thread has many users involved, and replacing the CPU seems to have fixed the issue for them.
3800X random reboots with WHEA-Logger event ID 18 : AMDHelp
another
Yes, I had seen many similar reports on Reddit as well. So should I RMA my CPU? Is this also causing the Prime95 issues? I contacted AMD tech support and they too told me to RMA the CPU. Should I proceed, or try to rule out other parts first? I have run Memtest and the OCCT stability test. Memtest completed 4 passes with XMP on, and over an hour of the OCCT stability test gave no errors.
I think you have done all you can. In general these errors can be caused by a host of different, non-CPU-related issues, but lately there has been what I would call an abnormal number of cases caused by the CPUs. So if AMD is willing to RMA, I would let them. They likely would not offer if they didn't also think there is a good possibility the CPU is the issue. So I would do it and go from there.
Try running the RAM at 3200. Restarts, freezes and WHEA errors have different causes. Your RAM may be causing the freezing problem (and the WHEA error). You obviously have a restart problem; if it also shows up on the desktop, return the motherboard immediately.
I agree with mstfbsrn980. On Ryzen, overclocking RAM beyond 3200 seems to cause this kind of issue (restarts without a BSOD). I was having the same problem with 4400 sticks running at only 3600. The problem is not the sticks, it's the overclock applied to the system when going beyond 3200.
Which again is why it could point to a CPU issue. The memory controller is in the CPU.
Again it is not really possible to know for sure what is causing the issue as it is such a vague error that many things can trigger it.
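Since the error is so vague, it can at least help to see how often and when it gets logged. Below is a minimal sketch, assuming Windows' built-in `wevtutil` tool and the `Microsoft-Windows-WHEA-Logger` provider name; the function names `build_whea_query` and `read_whea_events` are made up for illustration, and the query only filters on the event ID, nothing more.

```python
import subprocess

# Provider name as it appears in the Windows System event log.
WHEA_PROVIDER = "Microsoft-Windows-WHEA-Logger"


def build_whea_query(event_id=18, max_events=20):
    """Build a wevtutil command line (hypothetical helper) that reads
    the newest WHEA-Logger events with the given ID from the System log."""
    xpath = (
        f"*[System[Provider[@Name='{WHEA_PROVIDER}'] "
        f"and (EventID={event_id})]]"
    )
    return [
        "wevtutil", "qe", "System",
        f"/q:{xpath}",        # XPath filter on provider and event ID
        f"/c:{max_events}",   # maximum number of events to return
        "/rd:true",           # newest events first
        "/f:text",            # human-readable output instead of XML
    ]


def read_whea_events(event_id=18, max_events=20):
    """Run the query and return the raw text output (Windows only)."""
    result = subprocess.run(
        build_whea_query(event_id, max_events),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

On the affected Windows machine, `print(read_whea_events())` should dump the most recent Event ID 18 records with their timestamps, which can be lined up against what was running at the time (a game, Prime95, idle desktop). It only shows what Windows logged; it does not say which part is at fault.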
Yeah. For the first time, I fully agree with you. goodday...
It is really about helping the OP, not about me being right for once in your mind. Read back through any number of the arguments you start in numerous threads claiming others are wrong, and you will see the OPs consider my advice "right" far more than once. Nobody is wrong; that is why many people are allowed to reply in user-to-user forums. You should read back through those threads where you disagree and see how often it is NOT your advice or mine that helped the OP. None of this is about one of us being right. I could be completely wrong this time. It is just a logical place to start in getting the OP back to using their computer, not diagnosing it.
You don't need to be too aggressive. I have never had a thought of disrespecting you. Think of a system. It freezes and has a restart problem. And there is a WHEA error. Not worth the worry.
Note: My aim on this site is not disrespect.
I don't know if maybe you don't speak English as a first language and that is your excuse, and sorry if that is the case, but you need to do better. What you said is insulting and can be taken no other way, at least not in English. When you tell someone that for once they are right, it is exactly the same as saying they are always wrong. It is just doing it with a smile, which you also tacked on.
You are one of those people who starts a conversation with "I don't mean to offend" but goes on to offend the heck out of people.
I don't ever speak to you like that. I don't even respond to your statements unless you start it. I answer the OP, which I fully respect that you and I both have the right to do, and they can follow whatever advice they like.
What I find most ironic, however, is that you are agreeing with me here, yet in a thread a month ago on the same topic, with the same advice from me, you went on and on about how wrong I was, completely missing that the OP had already followed the advice and fixed the problem. It was as pointless a conversation as this one has now become.
My apologies to the OP. You shouldn't have to read this junk spilling into your thread.
The language I use is not English.
I have spoken a language that is the equivalent of my own language, called "affection". However, as always, you got me wrong. I apologize to OP, not you!
Amazing. As I said, you say your intention is not to insult, but you continue to escalate and insult worse. I explained why what you said is insulting and you made it worse. There is nothing affectionate about how you write anything. So instead of understanding that what you think you are saying in English is not what you are actually saying, you just insult more.
This is not an insult in my language. I wrote you what it was. I apologize if you misunderstood. In my language this is said about a funny situation; it is a cynical but non-insulting way of looking at a situation that cannot be resolved.
So... the system both freezes and restarts, and it gives WHEA errors. I laughed because it sounded funny. I had no intention of making fun of you. I found what you wrote correct. That's it...
Ryzen recommended RAM speed: 3200 MHz. Nothing more to say. The rest is about overclocking and whether or not you understand overclocking.
3200, while supported, is overclocking too. Any time you have to enable XMP or DOCP to get the RAM speed, it is considered overclocking the memory controller in the CPU. Yes, the further you push it, the more it may add to the instability. Regardless, the OP wants to try something to move forward and not settle for less without trying some things. Since the CPU has been proven to also cause this error, and AMD has talked with the OP and wants to RMA it, I don't see that it hurts as long as the OP doesn't mind. If it doesn't help, on to plan B.
I made my suggestion only because I saw a WHEA error. Nothing more.
Also, could the Prime95 fatal hardware errors that I am getting on workers #13 and #14 indicate CPU degradation?
Apart from that, it seems like all my other parts are working as usual, except for my PSU. The PSU makes some sort of arcing sound that comes and goes. It doesn't sound like coil whine but like electrical arcing. I suspect my PSU may be on its way out sooner or later, but so far it still seems to be delivering stable power (monitored via OCCT). I might go ahead and RMA that as well, along with the CPU. The system seems stable for the most part, except that while playing really heavy games like RDR2, this freeze-and-reboot issue with WHEA 18 occurs. It was a rare occurrence before, but it's becoming fairly frequent now.
If you think your PSU is bad, then you should replace that before doing anything else. The CPU is the brain but the PSU is the heart. Bad power delivery could be making the CPU misbehave, or could even have caused it to fail. It is fine if you still want to send the CPU in for RMA, but I would replace the PSU too, and definitely before putting a new CPU in that system. Good power is the most important part of the system.
OK, so I just tried the OCCT power draw test. It tries to generate maximum CPU and GPU load to test the motherboard and power supply. It detected tons of errors and the PC instantly went to a BSOD. I have a minidump. Attaching a screenshot below.
What could this mean?
+It is very possible that the RAM is causing the freezing problem you are experiencing.
+It is very possible that the motherboard is causing the Prime95 problem you are experiencing.
+It is very possible that the PSU is causing the restart problem you are experiencing.
+Don't try to fix the RAM problem. Reset the BIOS to factory settings, and check the motherboard manual to make sure the RAM is installed in the right slots.
+To solve the Prime95 problem, you need to increase the CPU core voltage. This problem will most likely be solved if you set this voltage manually, using a value you get by asking or by searching with Google.
+The PSU or motherboard is causing the restart problem. If you can run the OCCT test, it makes more sense to look for the problem on the motherboard. Is the motherboard's VRM getting too hot?
I am writing again. Your system has multiple errors. A new CPU, RAM and motherboard are required to resolve this kind of combined, multiple errors. And I found this situation a little odd... Goodbye...
If the OCCT power test causes a problem, there may be a problem with the PSU. You can buy a good new PSU (750 W or better) with Gold certification. If the motherboard's CPU power inputs are 8+4 pin, the PSU should support that. But here's the problem... A new PSU might not solve all your problems. I have given you the solutions. If you try the ways I mentioned, you can find the source of the problem and fix it.
I already have a Corsair RM750 (2019). Is that not good? I was thinking of RMAing that.
Yeah. Mention in the RMA that you got OCCT power test errors.
Edit: Most of those who have PSU problems and those on this site use Corsair. It's very poor quality.
OK. I will proceed to RMA that as well. But OCCT also mentions that it tests the motherboard. Could it be the motherboard? Also, I got errors referencing Pshed.dll, bootvid.dll, afunix.sys, win32k.sys, win32kfull.sys and ntoskrnl.sys in the minidump. Could this mean motherboard issues?
It can be. If I were you, I would have the parts of the system tested. You can do this for a low cost by going to a good computer repair center. You can check the CPU and RAM on a different motherboard.
You absolutely can have a bad motherboard, but it is the least likely cause. Fix the things you know are wrong first. Replace the power supply. You can likely get one retail even where you are. Then RMA the CPU and go from there. Those are the obvious things. A bad PSU will make everything else seem bad, because every part counts on proper power.
Edit: Don't buy Corsair. Cooler Master, Thermaltake, FSP, EVGA... whatever is. Don't buy Corsair.
Same error with these specs:
AMD Ryzen 7 3700x
Nvidia 2070 Super
2x 16 GB Corsair Vengeance RGB RAM
ASRock B550m Steel Legend Motherboard
Corsair 650x psu
I fixed the issue for a little while by going into the BIOS and setting my CPU to 3600 MHz at 1.2 volts, but after I attempted to adjust it again, any settings eventually cause a reboot now.
Hey sorry to resurrect this thread but wanted to say what has (so far) worked for me. I was getting WHEAs from RDR2 only after about 30 mins to 1 hour, which is really odd, not from Cyberpunk or anything else. After these changes I played for about 4 hours straight no issues.
My Build:
ASUS TUF x570 Plus Wifi
Ryzen 9 5900x
Corsair Vengeance 2x8GB DDR4 3200MHz
ASUS ROG RTX 3080
The fix for me:
Disabled DOCP, overclocked my RAM manually to 3200 MHz with an IF of 1600, left the RAM timings on auto, set my Power Idle to Typical Idle instead of auto, turned off C-states (the low-power states the CPU uses when idle), set a positive offset voltage of 0.1 on the CPU, kept PBO on and turned BAR off. I'm now idling at 48C and maxing out around 74C in RDR2 with everything maxed, including ray tracing, at 1440p.
Hope this helps anyone! Check the TUF manual or Google around for any terms I used.
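For anyone wondering why 3200 goes with an IF of 1600: DDR memory transfers data twice per clock, so the real memory clock is half the advertised rate, and on Ryzen the Infinity Fabric clock is usually matched 1:1 to that memory clock. A rough sketch of the arithmetic (the function names are just for illustration, not from any tool):

```python
def memory_clock_mhz(data_rate_mts):
    """DDR moves data on both clock edges, so the actual memory
    clock is half of the advertised data rate (e.g. "3200")."""
    return data_rate_mts / 2


def is_one_to_one(data_rate_mts, fclk_mhz):
    """True when the Infinity Fabric clock matches the memory clock."""
    return memory_clock_mhz(data_rate_mts) == fclk_mhz


# Illustrative kit speeds only; 3200 and 3600 are the ones mentioned in this thread.
for rate in (3200, 3600):
    print(f"DDR4-{rate}: memory clock {memory_clock_mhz(rate):.0f} MHz, "
          f"1:1 with IF 1600? {is_one_to_one(rate, 1600)}")
```

So a 3200 kit with the IF set to 1600 is running 1:1, while a 3600 kit would need an IF of 1800 to stay 1:1.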