Hello everyone,
I have a problem with my newly built PC.
It crashes after benchmarks and in games. If I play games like Farming Simulator 22 or run benchmarks like OCCT, Windows shows a BSOD and the PC restarts. Sometimes I get the message "CPU overheating alert! Please check CPU cooler is firmly attached for working properly.". The benchmarks themselves run without any problems, but right after I stop the benchmark the PC restarts.
I'm using the AMD boxed cooler and didn't overclock my CPU.
If I set the PPT to 45W the PC runs without restarts, but if I go to 60W or more the PC crashes.
In FS22 the PC crashes at 65-70°C. In benchmarks it runs at any temperature up to 95°C and only crashes after I stop the benchmark.
What I've tested:
reinstalling Windows
reinstalling CPU cooler with new thermal paste
updated BIOS
tested XMP on/off
newest AMD chipset drivers
changed PSU from Thermaltake Smart BX3 650W to Be Quiet Pure Power 12M 850W
My system:
AMD Ryzen 5 5600X with boxed cooler
MSI B550 A-Pro
AMD Radeon RX 7700XT 12G (Gigabyte OC)
Kingston Fury Beast DDR4 16GB dual kit 3600MHz
Kingston NV2 NVMe PCIe 4.0 SSD 1TB
Be Quiet Pure Power 12M 850 W
AeroCool Designer V1
Windows 11 Pro
I'm sorry for my imperfect English, it isn't my mother tongue.
Thank you for any help.
Is the AMD stock cooler really so bad that it can't handle the heat of my CPU?
Or can I change any settings to solve my problem?
The 5600X comes with the Wraith Stealth, the lowest-end version of the AMD boxed cooler. However, it should still be adequate to keep the CPU under 95°C, and the default PBO power and thermal limits take care of the rest.
The message you report, "CPU overheating alert! Please check CPU cooler is firmly attached for working properly." is not an AMD generated alert. This is likely being generated by some other software, possibly a utility that came from your motherboard manufacturer?
If you are running software from your motherboard manufacturer or any other third-party utility that monitors system temperatures, check to see if there is a high temperature warning setting and either disable it or set it higher than your max temp of 95°C.
Or purchase a better cooler.
I am with you, FunkZ! I would go further and advise the OP to uninstall any utilities especially those from the MB vendor. Install the latest Ryzen Master (RM) from here. Post a screenshot of RM running Cinebench R24 multicore. Do a Clear CMOS before running the test but remove all poor quality utilities. John!
Hi, I've completely reinstalled Windows and left out all unnecessary utilities.
I've posted some screenshots of my test runs.
It looks like if I enable PBO manually in the BIOS, the system gets so hot that the cooler can't handle the temperatures, and after I stopped the benchmark my PC crashed.
But if I have PBO on Auto it runs fine in this benchmark.
@LP-700 your initial post stated you "didn't overclock my CPU"
Your first Ryzen Master screenshot shows that when your PPT is limited to the stock 76W, the processor is stable. The limit caps power, which keeps temperature under control.
The second Ryzen Master screenshot shows you have raised the PPT to 1000W which effectively removes the power limit, allowing the CPU to boost up to 95°C which is not stable. This is overclocking your CPU!
Either leave the BIOS settings stock and enjoy a crash-free stable system, or purchase a better cooler if you want to try overclocking. To reduce temperature you can also try setting Curve Optimizer to a negative value to reduce voltage.
This isn't 100% correct. Firstly, I didn't know that switching PBO from Auto to Enabled with one button click counts as overclocking. Secondly, I've posted another reply with a picture of another benchmark called OCCT, and there you can see that the temperatures rise to around 80°C and my PC crashed after I stopped the benchmark run, even though it was limited to 76W. So it isn't crash-free.
Disable the boost in the BIOS, otherwise this CPU will melt itself and the mainboard to death. Those high temps can cause instability under certain circumstances. AMD decided to make the CPU boost up to, and sometimes above, the max temp, and the performance gain vs. no boost is about 15% at best. It's not worth it. Also, 90+ °C will significantly shorten the life of the CPU itself and of the mainboard, because all the capacitors around the CPU will be under higher temperatures, so they dry out faster and then fail. Under good temperatures a mainboard can last 10+ years, but when you run a CPU like this with boost the mainboard will probably last 3-4 years at best, and if it has a small quality issue, such as older capacitors used during manufacturing, you are looking at under 2 years. Asus mainboards especially are known to fail early. But ASRock mainboards (ASRock was spun off from Asus) never failed me and ran strong for many, many years! (Probably due to the higher sales count, similar to how buying vegetables in a cheaper store often gets you fresher ones.)
gulabon2, do you speak for AMD? How do you know what processor temperature will shorten the life? I suspect you do not know and suggest you refrain from making absolute statements. Enjoy, John.
I've also run another benchmark which stresses nearly all components of a PC. The temperatures slowly rise to 80°C and seem to stabilise there. After I stopped the benchmark the PC crashed. Is 80°C really that dangerous for the CPU?
LP-700, I doubt it is the temperature but cannot guess with no evidence. Please look in the Event Viewer and post screenshots of a few of the Critical Error Details tab. Please post no more images of Cinebench. We've seen a zillion of them, simply post the score. Your processor is throttling due to PPT and EDC but also temperature in one case. Have you modified the limits of these values? Do a Clear CMOS and stay out of BIOS or any OCing. Some of this may be a remainder of the crummy utilities you have been running. John.
There are many of the first error; of the others there are only one or two.
A Clear CMOS means removing the battery from the mainboard, right?
Do I have to update the BIOS after this and do I need to reinstall Windows?
Thanks, LP-700. I need to translate to English to make sense of it. The first term (Fehler) translates to Mistake. I think this is Error, not Critical.
Is that the same as Critical? Please use Critical in the search. And post a screenshot of the Details tab at the bottom. Clear CMOS should just delete the changes to the BIOS and you should not need to install a fresh copy of BIOS or Windows. The Clear CMOS procedure should be in the MB manual. My board has a button on the back to do a Clear CMOS, yours may not. I never know about these system builders and what they do. John.
Oh sorry that was my fault, I have attached the pictures below with Critical.
I have a flash button on the back of my MB, could this be the same?
I've got the Critical Error in the first picture about 20-30 times, the other one only once.
Thanks, LP-700. OK, we are making progress. You are getting "4502 WinREAgent", which is probably taking your system down. I will do some more looking at these errors but please ask your builder about this.
Are there any restrictions on what you can do like installing Windows or clearing BIOS? Please recreate the error and filter the output for the Critical errors and post a few, including the Details tab. Let's see what we can learn. Thanks, John.
My builder is me, and he has no clue about this anymore.
I've already cleared the BIOS and reinstalled Windows, but it didn't help, if I understand you correctly.
I have already posted the Critical errors. They all say the same thing, apart from the one other Critical error I've posted.
Hey, I've done a Clear CMOS but it hasn't solved the problem.
In the first test run my PC didn't crash (this run was with the case open, temps around 70°C, max 74°C).
In the second one my PC crashed again (this time with the case closed, temps around 80°C, max 82°C).
How can it shut down at 80°C? Windows showed "WHEA_UNCORRECTABLE_ERROR".
LP-700, I am here but real tied up. I need to see some of the Event Viewer BASIC tab, and a shot of RM when at about 80C and up, I will be back to thinking soon. John.
I'm going to send the pictures on Monday, because I don't have time for testing over the weekend.
What do you mean by Basic tab?
Thanks for helping
Hey, I've done some more testing. Firstly, I had to do another Clear CMOS because my PC was constantly crashing under load after 1-2 minutes. After the Clear CMOS my PC runs a bit better, but if I run a "Combined" test in OCCT the PC crashes again after a few minutes.
I have run a graphics-card-only test without any crashes. I have run a CPU-and-memory-only test without any crashes. The PC only crashes when I combine the use of CPU, RAM and GPU.
I don't know what you mean by the Event Viewer Basic tab, could you please explain?
Thanks, LP-700. These look fine. You are throttling due to PPT and EDC. Here is a way to post a few Critical Errors. Right click This PC-Click Manage-Click Event Viewer-Expand Event Viewer-Double click System-In the right pane click Filter current log...-Select Critical then OK-
Find a few recent errors and select Details in the lower pane-When it opens right click in the data and Select All-Right click and select Copy. Paste a copy of these in your response.
So far we do not know what is causing the crashes and should concentrate here, so please quit trying "fixes" provided to you. Do a Clear CMOS, do not run any utilities except RM and stay out of BIOS. Thanks, John.
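For anyone who prefers the command line, the same filter can be sketched in Python around Windows' built-in `wevtutil` tool. This is only a rough sketch: the function name `critical_events_command` is made up for illustration, and it assumes the Critical events live in the System log (Level 1 is Critical in the Windows event schema). The query only runs on Windows.

```python
import subprocess
import sys


def critical_events_command(log="System", count=5):
    """Build a wevtutil query for the newest Critical (Level 1) events."""
    return [
        "wevtutil", "qe", log,
        "/q:*[System[(Level=1)]]",  # XPath filter: Level 1 = Critical
        f"/c:{count}",              # return at most this many events
        "/rd:true",                 # reverse direction: newest first
        "/f:text",                  # plain text instead of XML
    ]


# Only attempt the query on Windows; elsewhere the block just defines the helper.
if __name__ == "__main__" and sys.platform == "win32":
    result = subprocess.run(critical_events_command(),
                            capture_output=True, text=True)
    print(result.stdout)
```

The text output can be pasted into a reply in the same way as the copy from the Details tab.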
Hey, thanks for the tutorial.
First error: ID 41 (I've got this 43 times)
EventData
BugcheckCode | 0
BugcheckParameter1 | 0x0
BugcheckParameter2 | 0x0
BugcheckParameter3 | 0x0
BugcheckParameter4 | 0x0
SleepInProgress | 6
PowerButtonTimestamp | 0
BootAppStatus | 3221225684
Checkpoint | 16
ConnectedStandbyInProgress | false
SystemSleepTransitionsToOn | 1
CsEntryScenarioInstanceId | 0
BugcheckInfoFromEFI | false
CheckpointStatus | 0
CsEntryScenarioInstanceIdV2 | 0
LongPowerButtonPressDetected | false
LidReliability | false
InputSuppressionState | 0
PowerButtonSuppressionState | 0
LidState | 3
Second error: ID 100 (3 times)
EventData
hc_stateid | 0
Third error: ID 4502 (1 time)
EventData
ErrorPhase | 2
Errorcode | -2147467259
The first error is the latest one I've got, the other errors are older.
I hope it helps a bit with finding the problem.
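Event Viewer shows some of these values as plain decimals, which makes them hard to look up. A small Python sketch (the helper name `to_hex32` is just for illustration) converts them to the 32-bit hex form that status and error codes are usually listed under:

```python
def to_hex32(value):
    """Return a 32-bit value (signed or unsigned decimal) as 0x-prefixed hex."""
    # Masking with 0xFFFFFFFF maps negative numbers to their
    # two's-complement 32-bit representation.
    return f"0x{value & 0xFFFFFFFF:08X}"


print(to_hex32(3221225684))   # BootAppStatus from ID 41   -> 0xC00000D4
print(to_hex32(-2147467259))  # Errorcode from ID 4502     -> 0x80004005
```

0x80004005 is the generic "unspecified error" code (E_FAIL), so on its own it says little about the cause. A BugcheckCode of 0 in ID 41 likewise only says that no stop code was recorded for that restart.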
Thanks, LP-700. Please look for Critical Errors that are not ID 41 and post a few. Select Details tab and select Friendly View. Also look in C:\Windows\Minidump and if it is not empty, compress it and upload it to where I can DL it. Thanks, John.
Hey, the only other Errors I have are ID 4502 and 100. I don't know what "Friendly View" is, could you please send me a picture of where it should be, or give me another tutorial ;).
I've got 5 Minidumps (or at least I think I have) and uploaded them to
https://www.swisstransfer.com/d/1f2c0cf1-9834-4d73-91a1-fe4deed71a23 .
Please let me know if those could help me out.
Thanks, LP-700. All five of the dumps are Bug Check 0x124: WHEA_UNCORRECTABLE_ERROR with parameter 1 of 0x0 meaning 'A machine check exception occurred.' I suggest you contact AMD for a replacement processor. Request an RMA here.
Friendly View:
John.
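As a rough lookup for parameter 1 of Bug Check 0x124, here is a minimal Python sketch. The table lists only the first few error sources as I recall them from Microsoft's bug check reference, so treat it as an assumption to verify there; `describe_whea` is a made-up helper name.

```python
# Parameter 1 of Bug Check 0x124 names the WHEA error source.
# Partial table, recalled from Microsoft's documentation (assumption).
WHEA_ERROR_SOURCES = {
    0x0: "Machine check exception",
    0x1: "Corrected machine check exception",
    0x2: "Corrected platform error",
    0x3: "Nonmaskable interrupt (NMI)",
    0x4: "Uncorrectable PCI Express error",
}


def describe_whea(parameter1):
    """Translate parameter 1 into a short description of the error source."""
    return WHEA_ERROR_SOURCES.get(
        parameter1, f"Other or unlisted source (0x{parameter1:X})"
    )


print(describe_whea(0x0))  # the value reported in all five dumps
```

A machine check exception means the hardware reported a fatal error. It does not by itself say which component is responsible.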
Okay, so I've posted the Friendly View of the errors in my post yesterday.
So you think my CPU is broken and I got a defective unit?
Currently, if I benchmark my complete PC, it crashes after 1-2 minutes. It crashes in one of these ways:
1. the screen freezes,
2. or the screen goes black and the fans of the graphics card stop spinning, the LEDs of mouse and keyboard turn off (everything else continues working),
3. or the PC restarts.
Those three events happen randomly, I can't tell which will happen.
Can these failures be caused by the CPU?
Hey, I've sent the CPU in for RMA and will report back when I have news.
Thanks, LP-700, John.
Hey, I've got a new CPU from AMD.
Is there anything I should do before or when I install the new CPU, like doing a Clear CMOS or resetting the drivers?
Great, LP-700. You could do a Clear CMOS. Then install the processor and see if all is well. John.
Hey, I did a Clear CMOS and installed my new CPU.
But the crashes continue.
So I think I'm going to buy a new mainboard, to test if this is my problem.
Sorry that did not correct it, LP-700. Do you not want to RMA the MB? John.
I think I'm going to buy a new one for testing, because it's faster to get a new one than to RMA the one I have. If the new MB solves the problem I'm going to RMA my "old" motherboard because I have warranty on it.
Thanks, LP-700. John.
Hey, I've asked MSI support and their answer was:
"Increasing the memory voltage by 0.05V can help here.
Therefore, set the DRAM voltage in the BIOS under OC with activated XMP, e.g. from 1.20V to 1.25V, or if the memory specifies 1.25V, set 1.30V.
If this is also unsuccessful, you can also try the memory with the MSI BIOS option “Memory try it”, select the option in the BIOS under OC and select the memory clock.
Please deactivate XMP beforehand.".
What do you think, should I try this or is it too "dangerous", so that I might destroy my RAM?
LP-700, worth a try. I do not think it is dangerous. Sounds like they are aware of a MB problem. Have you ever tried an XMP or EXPO profile? Here is a utility worth trying to see what values these schemes will set. DL it here. Use only the free version and click the 'Read' button. The free version will only read the SPD memory on the memory sticks. It is called Thaiphoon Burner but the free version only reads. Post a screenshot. Thanks, John.
Yes, I've tried XMP (because I don't have an EXPO kit) but it didn't help me solve the problem. I've taken some screenshots, hopefully they will help you.
Thanks, LP-700. The XMP profile set the voltage to 1.35 Volts so 1.25 will be fine. I suspect it will not correct your problems. Have you received your new MB? John.
Hey, I have a new lead...
It seems that one of my RAM sticks is defective. If I install only one stick instead of two, my PC boots fine with one of the sticks in every slot, but with the other stick it doesn't boot in any slot. If I install both, it also fails to boot. I also ran a stress test on the stick that works and my PC survived that without any problems. So I will RMA my RAM.
PS: I will receive my motherboard tomorrow, but I think I will not need it any more.
I will keep you updated when I get the new RAM.
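The one-stick-at-a-time test above follows a simple rule that can be written down as a small Python sketch. The stick names, the four-slot layout and the function name `classify_sticks` are assumptions for illustration, not measured data:

```python
def classify_sticks(boot_results):
    """Classify each RAM stick from single-stick boot tests.

    boot_results maps a stick name to a list of per-slot outcomes
    (True = the PC booted with only that stick in that slot).
    """
    verdicts = {}
    for stick, outcomes in boot_results.items():
        if all(outcomes):
            verdicts[stick] = "ok"             # boots in every slot
        elif not any(outcomes):
            verdicts[stick] = "suspect stick"  # fails in every slot
        else:
            # Mixed results point at a slot or contact issue instead.
            verdicts[stick] = "inconclusive (may be a slot issue)"
    return verdicts


# Hypothetical results matching the pattern described in the post.
results = {
    "stick_a": [True, True, True, True],
    "stick_b": [False, False, False, False],
}
print(classify_sticks(results))
# {'stick_a': 'ok', 'stick_b': 'suspect stick'}
```

Testing every stick in every slot is what separates a bad stick from a bad slot: a stick that fails everywhere while the other one works everywhere points at the stick itself.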