Hello all,
I have a new Ryzen 9 3900X desktop build that has been extremely unstable, exhibiting numerous crashes, BSoDs, and subsequent refusals to boot Windows. Info and minidumps are below.
I'd be very interested if other people using similar hardware have had similar problems, or if any super-sleuths can look at the attachments and point me in the direction of a fix. The oddest thing about my problem seems to be its correlation with lightly-loaded operation; the system runs seemingly quite well under moderate to heavy CPU / GPU loads or CPU alone, but is really unhappy with unloaded operation.
System information (more is available in attached screenshot from HWiNFO):
Win10 minidumps:
061520-6250-01.dmp - Google Drive
061620-5875-01.dmp - Google Drive
Thanks for any assistance!
-Carl
Update and resolution, 7/6/2020
I ordered a brand-new Ryzen 9 3900X processor and swapped it in for the existing one in the unstable build described above, which instantly resulted in stable operation. So the problem was the original Ryzen 9 3900X processor. I applied for warranty coverage from AMD. Now comes the fun part: seeing how long it takes them to respond and what kind of hoops they want me to jump through to replace that chip.
Obviously, as a troubleshooting strategy, CPU replacement is a costly approach that typically only gets tried when everything else has failed (and that's what happened for me). I'm not used to defective CPUs, and my hardware errors were never specific enough to unambiguously determine that the CPU was at issue. Consequently, I have enough new hardware (graphics cards, NVMe storage, RAM, etc.) sitting around now to build a second PC!
Thanks to those who offered advice. My hope is that others with similar difficulties see this and don't rule out a defective CPU in their troubleshooting workflow.
-Carl
Hello, if you check my request, I have exactly the same problem, but with a 2600X processor. It works perfectly fine under high load. However, after a cold start, it tends to restart after a couple of minutes of just browsing the internet. After that it works perfectly, especially under high load.
I also get a BSOD involving the same LKD_0x141_Tdr:6_IMAGE_amdkmdag.sys on an RX Vega 64 video card. Windows 10 x64, version 2004, Adrenalin 20.5.1.
Not an expert with crash dumps.
The second crash dump has the following error.
WHEA_UNCORRECTABLE_ERROR (124)
Process that caused it was chrome.exe.
And the hardware module that shut down was your CPU.
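For anyone matching up the codes quoted in this thread, here is a minimal Python sketch that turns a stop code written as "(124)", "0x124" or "141" into its documented bug check name. The table only covers the two codes mentioned in this thread, and the function name `describe_stop_code` is made up for illustration; it is not part of any Windows tool.

```python
# Bug check codes mentioned in this thread, with their documented Windows names.
# This table is deliberately tiny; it is not a complete list.
STOP_CODES = {
    0x124: "WHEA_UNCORRECTABLE_ERROR",
    0x141: "VIDEO_ENGINE_TIMEOUT_DETECTED",
}


def describe_stop_code(text):
    """Normalise a stop code such as '(124)', '0x124' or '141' and name it."""
    cleaned = text.strip().strip("()").lower()
    if cleaned.startswith("0x"):
        cleaned = cleaned[2:]
    code = int(cleaned, 16)  # bug check codes are conventionally written in hex
    name = STOP_CODES.get(code, "not in this table")
    return f"0x{code:X}: {name}"


print(describe_stop_code("(124)"))
print(describe_stop_code("0x141"))
```

Note that the number in parentheses after WHEA_UNCORRECTABLE_ERROR is already hexadecimal, which is why it is parsed with base 16 here.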
And here is some advice I shamelessly copied off the internet about troubleshooting the issue.
Stop 0x124 is a hardware error
If you are overclocking try resetting your processor to standard settings and see if that helps.
If you continue to get BSODs here are some more things you may want to consider.
This is usually caused by heat, defective hardware, memory, or even the processor, though it is "possible" (but rare) that it is driver related.
Stop 0x124 - what it means and what to try
Synopsis:
A "stop 0x124" is fundamentally different to many other types of bluescreens because it stems from a hardware complaint.
Stop 0x124 minidumps contain very little practical information, and it is therefore necessary to approach the problem as a case of hardware in an unknown state of distress.
Generic "Stop 0x124" Troubleshooting Strategy:
1) Ensure that none of the hardware components are overclocked. Hardware that is driven beyond its design specifications - by overclocking - can malfunction in unpredictable ways.
2) Ensure that the machine is adequately cooled.
If there is any doubt, open up the side of the PC case (be mindful of any relevant warranty conditions!) and point a mains fan squarely at the motherboard. That will rule out most (lack of) cooling issues.
3) Update all hardware-related drivers: video, sound, RAID (if any), NIC... anything that interacts with a piece of hardware.
It is good practice to run the latest drivers anyway.
4) Update the motherboard BIOS according to the manufacturer's instructions.
Their website should provide detailed instructions as to the brand and model-specific procedure.
5) Rarely, bugs in the OS may cause "false positive" 0x124 events where the hardware wasn't complaining but Windows thought otherwise (because of the bug).
At the time of writing, Windows 10 is not known to suffer from any such defects, but it is nevertheless important to always keep Windows itself updated.
6) Attempt to (stress) test those hardware components which can be put through their paces artificially.
The most obvious examples are the RAM and HDD(s).
For the RAM, use the built-in memory diagnostics (run MDSCHED) or the third-party MemTest86 utility to run many hours' worth of testing.
For hard drives, check whether CHKDSK /R finds any problems on the drive(s), notably "bad sectors".
Unreliable RAM, in particular, is deadly as far as software is concerned, and anything other than a 100% clear memory test result is cause for concern. Unfortunately, even a 100% clear result from the diagnostics utilities does not guarantee that the RAM is free from defects - only that none were encountered during the test passes.
7) As the last of the non-invasive troubleshooting steps, perform a "vanilla" reinstallation of Windows: just the OS itself without any additional applications, games, utilities, updates, or new drivers - NOTHING AT ALL that is not sourced from the Windows installation media.
Should that fail to mitigate the 0x124 problem, jump to the next steps.
If you run the "vanilla" installation long enough to convince yourself that not a single 0x124 crash has occurred, start installing updates and applications slowly, always pausing between successive additions long enough to get a feel for whether the machine is still free from 0x124 crashes.
Should the crashing resume, obviously the very last software addition(s) may be somehow linked to the root cause.
If stop 0x124 errors persist despite the steps above, and the hardware is under warranty, consider returning it and requesting a replacement which does not suffer periodic MCE events.
Be aware that attempting the subsequent hardware troubleshooting steps may, in some cases, void your warranty:
8) Clean and carefully remove any dust from the inside of the machine.
Reseat all connectors and memory modules.
Use a can of compressed air to clean out the RAM DIMM sockets as much as possible.
9) If all else fails, start removing items of hardware one-by-one in the hope that the culprit is something non-essential which can be removed.
Obviously, this type of testing is a lot easier if you've got access to equivalent components in order to perform swaps.
Should you find yourself in the situation of having performed all of the steps above without resolving the symptom, unfortunately the most likely reason is that the error message is literally correct - something is fundamentally wrong with the machine's hardware.
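The "add things back slowly" idea from step 7 can be written down as a small loop. This is only a sketch of the bookkeeping, not a tool: `is_stable_after` is a hypothetical callback that in real life means "install this item, use the machine for a while, and report whether it stayed free of 0x124 crashes", and the item names and results below are made-up example data.

```python
def first_destabilising_addition(additions, is_stable_after):
    """Apply additions one at a time and return the first one after which
    crashes resume, or None if the machine stayed stable throughout.

    additions       -- ordered list of labels (updates, drivers, applications)
    is_stable_after -- hypothetical callback: takes a label, returns True if
                       the machine stayed crash-free after adding that item
    """
    for item in additions:
        if not is_stable_after(item):
            return item  # the most recent addition is the prime suspect
    return None


# Made-up example data: pretend crashes come back after the GPU driver.
plan = ["Windows updates", "chipset driver", "GPU driver", "browser"]
crashes_after = {"GPU driver"}

suspect = first_destabilising_addition(plan, lambda item: item not in crashes_after)
print(suspect)
```

The same one-at-a-time pattern applies to step 9, with "remove a piece of hardware" in place of "install a piece of software". As this thread shows, a 0x124 can still turn out to be a hardware fault even when it seems to follow a software change, so treat the result as a lead rather than proof.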
I have the same issue.
I have a brand-new system: Asus B550 TUF Gaming + Ryzen 3900XT + water cooling + 32 GB Corsair 3600 RAM + Radeon 5700 8 GB + 850 W PSU + Corsair MP600 1 TB + Samsung 970 EVO Plus 1 TB.
I use no OC; BIOS settings are default.
I updated the BIOS to the latest version.
My first installation went well, but not for long.
My temps were between 34 and 42 °C. I noticed that when the system is under heavy load, it hangs after a few seconds.
The system is very unstable.
I tried to reinstall Windows 10 2004, and now I get a BSOD every time:
WHEA_UNCORRECTABLE_ERROR.
I tried an older Windows 10 1909: same problems.
It's really strange. Why did it work the first time?
Maybe my cheap PSU is not doing its job well (a new one is on the way, an 850 W Corsair 80+ Gold).
I don't know where to start looking.
Tomorrow I will try other memory and see if I can get it to run stable.
Does anyone have any ideas where to start?
Found the problem.
My system had 4 x 8 GB of Corsair DDR4-3600 RAM.
I just removed 2 sticks of RAM, and now it works and is stable.
I have run stress tests, Cinebench, etc., and it passes them all without any freezing or BSODs.
I have also fine-tuned the RAM speed in the BIOS and stress-tested again, and it works perfectly with 2 RAM slots occupied.
Now I have to check whether it is a mainboard issue or a RAM issue.
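One way to separate "bad stick" from "bad slot" is to test each stick on its own in each slot and see whether the failures follow a stick or a slot. Below is a rough Python sketch of that bookkeeping. The slot labels (A1, A2, B1, B2) and the results are made up for illustration, and it assumes the board will boot with a single stick in any slot, which is not guaranteed (check the motherboard manual).

```python
from itertools import product


def build_test_plan(sticks, slots):
    """Every single-stick / single-slot combination, one stress-test run each."""
    return list(product(sticks, slots))


def interpret(results):
    """results maps (stick, slot) -> True if that run was stable.

    Returns (bad_sticks, bad_slots): sticks that failed in every slot and
    slots that failed with every stick.
    """
    sticks = sorted({stick for stick, _ in results})
    slots = sorted({slot for _, slot in results})
    bad_sticks = [
        s for s in sticks
        if not any(ok for (stick, _), ok in results.items() if stick == s)
    ]
    bad_slots = [
        s for s in slots
        if not any(ok for (_, slot), ok in results.items() if slot == s)
    ]
    return bad_sticks, bad_slots


sticks = ["stick 1", "stick 2", "stick 3", "stick 4"]
slots = ["A1", "A2", "B1", "B2"]  # hypothetical slot labels
plan = build_test_plan(sticks, slots)
print(len(plan), "runs")

# Made-up results: pretend "stick 3" fails wherever it is installed.
results = {(stick, slot): stick != "stick 3" for stick, slot in plan}
print(interpret(results))
```

Sixteen runs is the exhaustive version; in practice you can stop as soon as a pattern is clear. If every stick passes alone in every slot, the problem may only show up with all four slots populated, which would point away from any single stick or slot.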