Mainboard: MSI B450 Gaming Plus Max, latest BIOS
CPU: Ryzen 5 3600
RAM: HyperX Fury DIMM 3200, 2x8 GB (on the compatibility list, passed memtest)
Drive: cheap Crucial M.2 (OS) + Seagate hard drive
Graphics: MSI GTX 1660 Super
PSU: be quiet! Pure Power 11 500W Gold
OS: Windows 10 Pro 64-bit, up to date
Chipset driver: 2.13.27.501 (latest?)
So, this is my sad story of crashes and reboots:
My Ryzen is unstable at stock settings. I often get reboots or crash loops until the voltage gets stuck at 1.4 V on the last core for some reason; I have no clue why, and no one has been able to explain it.
All my reboots and crashes happen at low load.
I thought I had found a solution here: setting the multiplier to 36. That disabled all boosting, the voltage settled at a relaxed 1.1-1.2 V, and I had 5 good months of perfect stability.
That lasted until last week, when the reboots came back. Here is a dump from WhoCrashed:
On Tue 30/03/2021 18:16:19 your computer crashed or a problem was reported
crash dump file: C:\Windows\LiveKernelReports\WHEA\WHEA-20210330-1816.dmp
This was probably caused by the following module: ntoskrnl.exe (nt+0x97E766)
Bugcheck code: 0x124 (0x0, 0xFFFFD60504F4E840, 0xBEA00000, 0x108)
Error: WHEA_UNCORRECTABLE_ERROR
file path: C:\Windows\system32\ntoskrnl.exe
product: Microsoft® Windows® Operating System
company: Microsoft Corporation
description: NT Kernel & System
Bug check description: This bug check indicates that a fatal hardware error has occurred. This bug check uses the error data that is provided by the Windows Hardware Error Architecture (WHEA).
This is likely to be caused by a hardware problem.
The crash took place in the Windows kernel. Possibly this problem is caused by another driver that cannot be identified at this time.
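For anyone reading the dump above: with bug check 0x124, parameter 1 = 0x0 means a machine check exception, and parameters 3 and 4 hold the high and low 32 bits of the CPU's MCi_STATUS register. A minimal Python sketch of putting them back together and listing the top flag bits follows. The function names are made up for illustration, and only the architectural bits 63-57 are decoded; anything vendor-specific should be checked against AMD's documentation.

```python
# Sketch only: rebuild MCi_STATUS from the bug check 0x124 parameters
# quoted in the WhoCrashed report. Helper names are hypothetical.
BUGCHECK_PARAMS = (0x0, 0xFFFFD60504F4E840, 0xBEA00000, 0x108)

# Architectural flag bits at the top of MCi_STATUS.
STATUS_FLAGS = {
    63: "VAL (register contents valid)",
    62: "OVER (an earlier error was overwritten)",
    61: "UC (uncorrected error)",
    60: "EN (error reporting enabled)",
    59: "MISCV (MCi_MISC valid)",
    58: "ADDRV (MCi_ADDR valid)",
    57: "PCC (processor context corrupt)",
}


def mci_status_from_bugcheck(params):
    """Combine parameter 3 (high 32 bits) and parameter 4 (low 32 bits)."""
    _, _, high, low = params
    return (high << 32) | low


def set_flags(status):
    """Return the descriptions of the flag bits that are set."""
    return [name for bit, name in STATUS_FLAGS.items() if (status >> bit) & 1]


status = mci_status_from_bugcheck(BUGCHECK_PARAMS)
print(hex(status))
for flag in set_flags(status):
    print(flag)
```

UC and PCC being set is consistent with an uncorrectable error that forces an immediate reset, which would fit the "clean reboot, no blue screen" behavior described in this thread.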
The current ones are always the same: clean reboot, no blue screen, Windows integrity OK.
I updated everything again (Windows, chipset, BIOS), tried BIOS stock settings again, and it rebooted right away.
I have now disabled Precision Boost Overdrive (it was always on Auto), so no more boosting over 3600.
The PC seems stable (for today). I ran some OCCT tests for power and CPU, and the voltage is even lower than with the 36 multiplier: max 1.08 V on single-core VID, 1.1 V on TFN, and 1.125 V effective VID in HWiNFO64.
I don't know. I got the CPU on October 5, so I still have more or less a year and a half of warranty.
Now, what should I do? And is the CPU truly the problem here?
My dumps: https://1drv.ms/u/s!AnQ4g2Vrik9YggYGnIhtvdc2oagC?e=4FUf87
It rebooted again after a week, at stock settings, with Core Performance Boost and Precision Boost Overdrive disabled.
So I cleared the CMOS (shorted the jumper on the mobo). It is at factory stock now, and the PC is working with boosting on; the multiplier rises to 42. I have never had this stability, I don't get it.
Even before, when I had some days of it working with boosting (last year), I still got some CPU errors when restarting or turning off the PC; now it is clean.
I don't get it; I don't even know what my problem is anymore.
So you are saying that after doing a CMOS clear and resetting the BIOS to its factory settings, your Ryzen is now working fine?
Most likely a BIOS setting you had changed was causing part of your problem, or updating the AMD driver may have fixed your issue.
I would install Ryzen Master and see what it indicates, in particular whether anything shows in yellow or red.
Updating the motherboard's BIOS and chipset driver might also help with stability. Release notes for a new BIOS often say it improves system compatibility or stability.
Ryzen is working fine "today", but as I said, I had weeks of perfect stability before.
I was unable to get boosting working with the current newer BIOS or with the update before it. I also did not change any settings except the RAM frequency, the core ratio, and disabling boost.
I dunno.
I'm not going to put Ryzen Master back on; look at my old topic: https://community.amd.com/t5/processors/ryzen-3600-mostly-reboots-and-some-crash/td-p/422417/page/5
Maybe you can explain to me why Ryzen Master did that thing to core number 6, plus the many settings changes (to what?) after the reboots?
Anyway, I noticed that on the mobo PSU cable (20+4 pin) the +4 part was a bit loose, and I fixed it.
Could this be the cause of my WHEA_UNCORRECTABLE_ERROR?
It most definitely could cause your issue if the main 24-pin motherboard connector was loose or wasn't completely in.
So a loose +4 on the 20+4 pin cable can cause clean random reboots every few weeks?
Man, I feel so stupid; I had so much trouble and tried a lot of things.
Well, do I keep the BIOS at factory stock (except the RAM setting) and see how it goes?
Anyway, I got a second PSU (an old AeroCool VP-650) to test with.
The motherboard 24-pin plug is the main connector. If it isn't making good contact on all 24 pins, it can cause all sorts of problems, including not booting up.
In my BIOS I at first had everything factory set except, like you, the RAM, which I overclock to reach the RAM's recommended speed.
Later I slowly started changing a couple of settings, one of which was disabling Fast Start, which I hadn't realized was enabled by default.
If you mean the fast startup option in the Windows power settings, I have it disabled.
Any other improvements?
No, the BIOS also has a setting for Fast Start. That is the one you need to disable if it is enabled.
For some reason that BIOS setting causes many issues on users' computers.
I can't find the Fast Boot option in the BIOS.
Nothing under Boot; on Google I found this:
But I only have the Windows 10 CSM option?
Anyway, the PC needs something like 20 seconds to get to the desktop (I don't have many extra programs, and I also disabled some Windows ones).
It boots from the M.2 and the keyboard works during boot (I press F11 to get into the BIOS).
I suppose I don't have it, or it is disabled?
Found this instruction from 2021 on how to locate Fast Boot in BIOS:
"Fast Boot is a feature in BIOS that reduces your computer boot time.
Nope, I don't have it. These are my boot options, from the MSI site:
It's not under Boot; it is supposed to be in the OS settings, but it is not there either.
Well, I reset the BIOS; maybe it comes with an update?
Here is a good example of what happens when a critical or important motherboard connector is loose: https://community.amd.com/t5/processors/q-code-00/m-p/463419#M40164
My system NEVER failed to boot; at most I had a couple of crashes while Windows was loading.
I'm a little surprised, since it seems most Ryzen motherboards have that BIOS setting.
The point I was trying to make about the motherboard connections is that a connector that is not properly seated, is loose, or is not making good contact can cause all kinds of issues, including a failure to boot.
Is your voltage set at 1.4 V by default? It seems high. I have my 3700X set to a 1.3 V max cap and overclocked to 4.2 GHz. Steady as a rock.
OK, please check this.
From HWiNFO64:
CPU core voltage (SVI2 TFN): 1.050 to 1.475 V
SoC voltage (SVI2 TFN): 1.087 to 1.1 V
Effective CPU core VID: 1.450 to 1.475 V
Single-core VIDs go from 0.9 V (idle) or 0.2 V (rest?) to 1.475 V (boosting at 4200 MHz)
Power reporting deviation (accuracy): 200+% on average
This is from 30 min of random use, so it can vary a bit.
Also, from CPU-Z:
Core voltage goes from 0.200 V at idle to 1.100/1.300 V (browsing/video/YouTube) and over 1.400 V under medium-high load.
This is all with stock factory BIOS settings, an updated chipset driver, and the Ryzen Balanced power plan in Windows.
Before, disabling boosting gave me 1.1 or 1.2 V max and temps around 5-10° cooler.
Is this all normal?
And it rebooted again after 14 days of perfect stability, with factory BIOS settings and stock CPU boosting working well.
Same as ever, a clean restart:
On Fri 23/04/2021 00:41:01 your computer crashed or a problem was reported
crash dump file: C:\Windows\LiveKernelReports\WHEA\WHEA-20210423-0041.dmp
This was probably caused by the following module: ntoskrnl.exe (nt+0x97E686)
Bugcheck code: 0x124 (0x0, 0xFFFFC58FA4EC1BD0, 0xBEA00000, 0x108)
Error: WHEA_UNCORRECTABLE_ERROR
file path: C:\Windows\system32\ntoskrnl.exe
product: Microsoft® Windows® Operating System
company: Microsoft Corporation
description: NT Kernel & System
Bug check description: This bug check indicates that a fatal hardware error has occurred. This bug check uses the error data that is provided by the Windows Hardware Error Architecture (WHEA).
This is likely to be caused by a hardware problem.
The crash took place in the Windows kernel. Possibly this problem is caused by another driver that cannot be identified at this time.
Log Name: System
Source: Microsoft-Windows-WHEA-Logger
Date: 23/04/2021 00:41:05
Event ID: 18
Task Category: None
Level: Error
Keywords:
User: LOCAL SERVICE
Computer: DESKTOP-MF97J6R
Description:
A fatal hardware error has occurred.
Reported by component: Processor Core
Error Source: Machine Check Exception
Error Type: Cache Hierarchy Error
Processor APIC ID: 13
The details view of this entry contains further information.
Event Xml:
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
<System>
<Provider Name="Microsoft-Windows-WHEA-Logger" Guid="{c26c4f3c-3f66-4e99-8f8a-39405cfed220}" />
<EventID>18</EventID>
<Version>0</Version>
<Level>2</Level>
<Task>0</Task>
<Opcode>0</Opcode>
<Keywords>0x8000000000000000</Keywords>
<TimeCreated SystemTime="2021-04-22T22:41:05.1590684Z" />
<EventRecordID>45354</EventRecordID>
<Correlation ActivityID="{45c6dc4f-8923-4cfc-b666-856129e0fee4}" />
<Execution ProcessID="3300" ThreadID="3780" />
<Channel>System</Channel>
<Computer>DESKTOP-MF97J6R</Computer>
<Security UserID="S-1-5-19" />
</System>
<EventData>
<Data Name="ErrorSource">3</Data>
<Data Name="ApicId">13</Data>
<Data Name="MCABank">5</Data>
<Data Name="MciStat">0xbea0000000000108</Data>
<Data Name="MciAddr">0x1f80757dfd212</Data>
<Data Name="MciMisc">0xd01a0ffe00000000</Data>
<Data Name="ErrorType">9</Data>
<Data Name="TransactionType">2</Data>
<Data Name="Participation">256</Data>
<Data Name="RequestType">0</Data>
<Data Name="MemorIO">256</Data>
<Data Name="MemHierarchyLvl">0</Data>
<Data Name="Timeout">256</Data>
<Data Name="OperationType">256</Data>
<Data Name="Channel">256</Data>
<Data Name="Length">936</Data>
<Data Name="RawData">435045521002FFFFFFFF03000100000002000000A80300003B281600160415140000000000000000000000000000000000000000000000000000000000000000BDC407CF89B7184EB3C41F732CB57131FE6FF5E89C91C54CBA8865ABE14913BB05A9DD8FC837D70102000000000000000000000000000000000000000000000058010000C00000000003000001000000ADCC7698B447DB4BB65E16F193C4F3DB0000000000000000000000000000000001000000000000000000000000000000000000000000000018020000800000000003000000000000B0A03EDC44A19747B95B53FA242B6E1D0000000000000000000000000000000001000000000000000000000000000000000000000000000098020000100100000003000000000000011D1E8AF94257459C33565E5CC3F7E8000000000000000000000000000000000100000000000000000000000000000000000000000000007F010000000000000002010000000000100F87000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000D00000000000000000000000000000000000000000000000000000000000000000000000000000007000000000000000D00000000000000100F870000080C0D0B32D87EFFFB8B170000000000000000000000000000000000000000000000000000000000000000F50157A5EFE3DE43AC72249B573FAD2C03000000000000009F0002060000000012D2DF5707F8010000000000000000000000000000000000000000000000000002000000020000002B3FF290C837D7010B0000000000000000000000000000000000000005000000080100000000A0BE12D2DF5707F8010000000000FE0F1AD0000000000D00000000000000B00005000000004D0000000079000000230000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001B00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000</Data>
</EventData>
</Event>
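The interesting fields in the event above can be pulled out and cross-checked with a few lines of Python. This is only a sketch: the XML is trimmed to the fields used, the helper names are invented for the example, and the low 16 bits of MciStat are decoded with the generic x86 "memory hierarchy error" layout (000F 0001 RRRR TTLL) as I understand it, so check it against AMD's processor documentation before relying on it.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the event above; only the fields used here are kept.
EVENT_XML = """<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
<EventData>
<Data Name="ApicId">13</Data>
<Data Name="MCABank">5</Data>
<Data Name="MciStat">0xbea0000000000108</Data>
<Data Name="TransactionType">2</Data>
<Data Name="RequestType">0</Data>
<Data Name="MemHierarchyLvl">0</Data>
</EventData>
</Event>"""

NS = {"e": "http://schemas.microsoft.com/win/2004/08/events/event"}

# Assumed meanings of the TT and LL sub-fields (generic MCA encoding).
TRANSACTION = {0: "instruction", 1: "data", 2: "generic"}
LEVEL = {0: "L0", 1: "L1", 2: "L2", 3: "generic"}


def event_data(xml_text):
    """Return the <Data Name="..."> entries of the event as a dict of strings."""
    root = ET.fromstring(xml_text)
    return {d.get("Name"): d.text for d in root.findall("e:EventData/e:Data", NS)}


def decode_memory_error_code(mci_status):
    """Split the low 16 bits of MCi_STATUS, or return None if the pattern differs."""
    code = mci_status & 0xFFFF
    if (code & 0xEF00) != 0x0100:  # must match 000F 0001 xxxx xxxx
        return None
    return {
        "request": (code >> 4) & 0xF,    # RRRR
        "transaction": (code >> 2) & 0x3,  # TT
        "level": code & 0x3,             # LL
    }


fields = event_data(EVENT_XML)
decoded = decode_memory_error_code(int(fields["MciStat"], 16))
print("bank", fields["MCABank"], "APIC ID", fields["ApicId"])
print("transaction:", TRANSACTION[decoded["transaction"]])
print("level:", LEVEL[decoded["level"]])
```

The decoded sub-fields agree with the TransactionType, RequestType, and MemHierarchyLvl values Windows already logged, and MciStat is the same value as parameters 3 and 4 of the bug check. So the dump and the event describe the same machine check, reported by a processor core as a cache hierarchy error.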
So sad, I was almost sure it was the loose 20+4 pin cable.
I will try changing the PSU now.
One crash in 14 days isn't really major. You're also using CPU boost; have you tried it without? It looks more like software than hardware.
I mess with my PC's overclock all the time and it crashes, so I have to lower the settings.
What were the temps when it crashed?
Wait, is it normal for a Ryzen to reboot at stock? It can't be.
I always get those reboots during idle or low use (streaming, low-spec games like League of Legends, surfing the web).
Temps are... fine? HWiNFO64 shows 39-44° right now (with boosting disabled I get 40-something and 35 at idle); it goes to 50-60 for streaming and light games and 70+ during heavy games (but I don't play those much, so maybe it is just chance that I never rebooted during them).
Still, I never had any problems in the BIOS or during stress tests like OCCT, no errors ever (I also ran memtest86 and more).
Stock cooler, but a big nice case with good airflow; the GPU also runs pretty cool.
I'm not really sure about the actual temp during the reboots, but if that's the reason, it must be some sudden huge spike.
Now, someone here suggested a 36 multiplier to me. That worked very well for something like 3-4+ months, but then the reboots started again.
I was also unable to turn boosting back on: it rebooted after Windows loaded.
I did some updates again, had some days of it working OK, and then a reboot.
Then I cleared the CMOS and went to factory stock settings (only the RAM setting on); boosting worked fine with no problems for the last 14 days (the normal stock boosting, no OC).
Now I have updated the BIOS to the newest one out and I will try to manage the cables a bit better. After the next reboot I have an old working PSU to try (but since boosting is now working, I have no way to reproduce or force those reboots).
At least, bit by bit, the stability is improving.
Well, it rebooted again after 3 days, and rebooted a second time while Windows was loading, 18 seconds later.
Then the PC worked fine: no Windows corruption, no blue screen, same WHEA-Logger error.
So, someone suggested that I set "Typical Current Idle" under the CPU settings in the BIOS, so the cores stop dropping into a very-low-voltage sleep state and failing to wake up. (This "seems" to be my problem...?)
Let's try that.
I also disabled core boosting to keep the temps down.
And I finally got the second PSU to try.
Suggestions?
Are you sure you don't have a virus?
Have you tried the power settings, performance rather than sleep, etc.?
No virus or anything, no sleep or hibernation, secondary hard disk always on, and I'm pretty sure I also had reboots on the Ryzen performance power plan.
I have kinda the same problems. What happened eventually?
Here is a good example of what happens when a critical or important motherboard connector is loose: https://community.amd.com/t5/processors/q-code-00/m-p/463419#M40164