Hi All,
I'd like to start by saying I have been trying fixes from a wide variety of forums over the last 3 weeks to fix this issue.
Issue: My PC reboots at random, seemingly regardless of what application or game is open (even when nothing is open). It happens quickly and I have not seen a BSOD. I have started saving the event logs for each crash.
Build: This is a brand new build. The only pre-existing part is the PSU, which I will replace just in case, although it had no issues in the previous build.
GPU: RTX 4060ti
CPU: Ryzen 9 5900X
MOBO: Asus B550 - F - Wifi
Ram: 32gb (2x16gb) Corsair Vengeance RGB RS
PSU: EVGA 650 P 6
Storage: Various PCIe SSDs and 1x WD Black HDD
The constant error message over the last few weeks is this:
A fatal hardware error has occurred.
Component: Memory
Error Source: Machine Check Exception
However this evening's crash had 2 errors showing:
A fatal hardware error has occurred.
Component: Memory
Error Source: Machine Check Exception
AND
A fatal hardware error has occurred.
Reported by component: Processor Core
Error Source: Machine Check Exception
Error Type: Bus/Interconnect Error
Processor APIC ID: 0
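For anyone saving these messages the same way, here is a minimal sketch of how the saved text could be tallied by component, to see whether "Memory" or "Processor Core" shows up more often across crashes. It assumes the messages have been pasted into plain text in the same "Key: value" layout as above. The function names `parse_error_blocks` and `component_of` are made up for this illustration and are not part of any Windows tool.

```python
from collections import Counter


def parse_error_blocks(text):
    """Return one dict per 'A fatal hardware error has occurred.' block."""
    blocks = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("A fatal hardware error has occurred"):
            # A new error message starts here.
            current = {}
            blocks.append(current)
        elif current is not None and ":" in line:
            # Lines such as "Component: Memory" become key/value pairs.
            key, _, value = line.partition(":")
            current[key.strip()] = value.strip()
    return blocks


def component_of(block):
    # The two messages label the component differently.
    return block.get("Component") or block.get("Reported by component") or "Unknown"


# The two messages from this evening's crash, copied from the post.
SAMPLE = """\
A fatal hardware error has occurred.
Component: Memory
Error Source: Machine Check Exception
A fatal hardware error has occurred.
Reported by component: Processor Core
Error Source: Machine Check Exception
Error Type: Bus/Interconnect Error
Processor APIC ID: 0
"""

blocks = parse_error_blocks(SAMPLE)
tally = Counter(component_of(b) for b in blocks)
for component, count in tally.items():
    print(component, count)
```

Lines without a colon (such as the "AND" separator) are simply ignored, so the saved text does not need to be cleaned up first.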
The main fixes I have implemented:
- New RAM
- RAM tested at both XMP and AUTO settings
- RAM voltage increased by 0.05 V
- Full OS wipe with storage formatting
- I haven't tried disabling CPB or PBO, as from my understanding that isn't a fix. If reducing the CPU's performance to that of a much cheaper CPU is the fix... then I would just buy the cheaper CPU
- I did uninstall Riot's anti-cheat and all Riot programs, as Windows did have this listed as a potential cause for reboots
I do see people saying they have had success getting a new CPU. I would prefer to exhaust all other possibilities before resorting to an AMD RMA. Thanks in advance!
(One positive out of this experience is that I have learnt a lot about the various parts/software xD)
EDIT 1. Without changing anything I have had 4 reboots in 15 minutes. That is the most I have experienced in a single day. I will be disabling CPB/PBO to see if that stabilizes it so I can continue to use this pc while looking for a more permanent solution.
Just out of curiosity what's your Ram speed set at?
I currently have XMP enabled so it's on 3600MHz. I can't remember what the default was before using XMP
Is the B550-F WiFi motherboard the ROG Strix Gaming version?
What BIOS version is it on? Is there a newer BIOS version?
You listed memory as 36gb (2x18gb). Are you certain it is not 32GB (2x16)?
What is the memory part number? Have you checked if the memory is on the motherboard QVL list?
Hi Funkz,
Thanks for the reply. I've amended the RAM typo. It was meant to be 2x16gb.
Yes the MOBO is the ROG Strix gaming version.
The BIOS version is 3607
RAM part number is:
CMG32GX4M2D3600C18
I'm not 100% sure if I'm searching correctly, but it doesn't seem to be on the list. Is it really possible that by not being on the recommended list it would cause the constant reboots? I'm surprised the local tech shop that did the build didn't check this if it is a potential issue.
Thanks for the help so far!
Thanks for confirming the motherboard is the ROG STRIX B550-F GAMING (WI-FI) and the BIOS 3607 is the latest version available.
I do not see that memory part number on the list either, although there are other Corsair 2x16GB kits of various other speeds listed. However it means that Asus has not validated that memory in this board. Corsair lists it as compatible for AMD 500 series.
For troubleshooting, try removing one of the memory DIMMs and see if the issue persists with only one stick installed. If the problem continues, also try the single stick in another slot to see whether it could be motherboard related.
You have already tested the memory at both XMP and SPD, tried increasing memory voltage, and tried replacing the memory once (assuming with an identical kit?). The last thing to try would be to replace the memory with a different kit that is on the QVL. That would for sure rule out the memory as the problem.
Also curious if you are using the iCue software for the memory RGB?
Sorry this took a little longer to test as the crashes can sometimes be hours in-between.
I still had a crash after using 1x stick of ram and trying different slots.
I have trialled disabling CPB and PBO and it's been stable with no crashes. I'll try another day of use to see if this is working long term. If this does work, what does that likely mean the main issue is? It must be the CPU, right?
That is interesting, since the errors seem to implicate the memory. The 5900X is a chiplet design, with the cores separate from the IMC/I-O die. The core boost does not alter the IMC or Fabric operating frequency, which remains constant.
I would be curious if you reenable CPB/PBO and add a small positive CO (Curve Optimizer) value, +5 or +10, if that bump to core voltage allows the CPU to boost without crashing.
Elbastardo, please let us know the results of the one memory stick test. If you are running iCue, please uninstall it and do a Clear CMOS. We need to see the Event Viewer of a few of the Critical errors, Details tab. John.
I've checked and I don't have iCue installed or running, I have also cleared CMOS multiple times already throughout the trouble shooting process
Is the XML format for details fine?
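For reference, a minimal sketch of how the fields could be pulled out of an event's XML (as copied from the Details tab) so they are easier to compare between crashes. The sample below is a hand-written stand-in, not one of the real logs from this system; the event ID and the `ApicId` field name are assumptions about how a WHEA-Logger event is laid out, and `event_fields` is a made-up helper name.

```python
import xml.etree.ElementTree as ET

# Namespace used by Windows event XML.
NS = {"e": "http://schemas.microsoft.com/win/2004/08/events/event"}


def event_fields(xml_text):
    """Return (provider name, event ID, dict of EventData name/value pairs)."""
    root = ET.fromstring(xml_text)
    system = root.find("e:System", NS)
    provider = system.find("e:Provider", NS).get("Name")
    event_id = system.find("e:EventID", NS).text
    data = {
        d.get("Name"): (d.text or "")
        for d in root.iterfind("e:EventData/e:Data", NS)
    }
    return provider, event_id, data


# Hypothetical, heavily trimmed sample; a real event has many more fields.
SAMPLE = """<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <System>
    <Provider Name="Microsoft-Windows-WHEA-Logger" />
    <EventID>18</EventID>
  </System>
  <EventData>
    <Data Name="ApicId">0</Data>
  </EventData>
</Event>"""

provider, event_id, data = event_fields(SAMPLE)
print(provider, event_id, data)
```

Running this over each saved event would show at a glance whether the same core (APIC ID) is reported every time.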
Try de-clocking the RAM and see if your problem goes away.
The current status is this: disabling PBO and CPB has stabilised the system. It's 3 days in now without a single crash. To 100% confirm this has fixed the issue, I'll be watching for any reboots for another 2-3 days.
If there are no reboots, I'm not sure what my next step is. If the CPU needs to be functioning at half capacity to be stable, could it still be a PSU problem?
I'll trial Funkz's suggestion of re-enabling PBO and CPB and adding the CO in a few days' time and let you all know how I go. If re-enabling PBO and CPB causes the system to become unstable, is it a faulty CPU?
Thanks for the help so far!
It would seem so if the CPU is not stable at default BIOS settings, unless for some reason the motherboard is not providing enough voltage. That is typically not the case; if anything, most board manufacturers seem to like to overvolt the processor slightly. But that's something else to look at as well: what vCore range it's running.
I had an issue just like this years ago. I bought RAM that was not compatible with the motherboard I had at that time, and after I replaced it with a compatible set all my problems were gone. This really seems like your issue. However, I would suggest you keep testing to make sure!
Hi Addie,
Thanks for the input! I really thought it was my RAM too, but the only fix that has worked so far is disabling CPB and PBO, which points towards the CPU. I have since re-enabled both CPB and PBO and manually raised my "VDDCR CPU voltage" from 1.088 to 1.2. I've seen a few people with the exact same issue who increased the voltage and had all of their problems disappear. If these CPU-related fixes don't hold up in the long term, I'll look into the compatibility issue with my RAM.
Awesome, do keep us posted on how your system reacts to your new settings over time, because Johnny 5 needs input lol.
OK, so the permanent fix was changing my CPU voltage (the "VDDCR CPU voltage" setting) from 1.088 V to 1.2 V.
That's it lol. After all this time, that's it. Hours of troubleshooting, multiple parts replaced xD I hope this helps someone else!