I set up a new PC in February this year. I use it primarily for coding and casual internet browsing. I bought a Ryzen 9 7900 so I can run my Python script with multiprocessing.
Problem: After 4 months with no issues, my PC suddenly started crashing frequently: a sudden black screen followed by the CPU fan running at 100%.
It crashes after the boot logo and gets into a boot loop. I can only get into Windows after Windows Recovery has loaded.
It crashes during light browsing once I am in Windows.
It crashes immediately at the start of any CPU test (CPU-Z, OCCT, etc.).
Ironically, when I run my Python script at ~70% CPU load, the PC will NOT crash for the whole day. But it crashes soon after I stop the script.
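For anyone who wants to try reproducing the "stable under load, crashes once the load stops" pattern without my actual script, here is a minimal sketch of a synthetic load. It is a hypothetical stand-in, not my real script: the function names are made up for illustration, and holding each core at a 70% duty cycle is only an assumption about how a ~70% overall load might be approximated.

```python
import multiprocessing as mp
import time


def split_period(duty: float, period_s: float) -> tuple[float, float]:
    """Split one period into (busy_seconds, idle_seconds) for a target duty cycle."""
    if not 0.0 <= duty <= 1.0:
        raise ValueError("duty must be between 0 and 1")
    busy_s = duty * period_s
    return busy_s, period_s - busy_s


def burn(duty: float, duration_s: float, period_s: float = 0.1) -> int:
    """Keep one core busy for roughly `duty` of the time; return completed cycles."""
    busy_s, idle_s = split_period(duty, period_s)
    deadline = time.perf_counter() + duration_s
    cycles = 0
    while time.perf_counter() < deadline:
        spin_until = time.perf_counter() + busy_s
        while time.perf_counter() < spin_until:
            pass  # busy-wait: the "load" part of the cycle
        time.sleep(idle_s)  # the idle part of the cycle
        cycles += 1
    return cycles


def run_load(duty: float, duration_s: float, workers: int) -> None:
    """Start one burn() process per worker and wait for all of them to finish."""
    procs = [mp.Process(target=burn, args=(duty, duration_s)) for _ in range(workers)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
```

On Windows, the call has to sit under an `if __name__ == "__main__":` guard, for example `run_load(0.7, 600, mp.cpu_count())` for a ten-minute run. The moment of interest is right after it returns and the CPU drops back to idle.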
I'm using default BIOS settings with the following hardware setup -
AMD Ryzen 9 7900
Asus ROG STRIX B650E-I motherboard
G.Skill Flare X5 Expo 2x16GB DDR5 6000 CL36 (F5-6000J3636F16GX2-FX5)
Palit GTX 1070 JetStream
PSU: Silverstone SFX 500W and Corsair SF750
Fan: Noctua NH-L12S and stock
XTIA Xproto-N open casing
Temporary Solution
After testing various overclocking and underclocking configurations, I managed to stabilize my PC and stop the frequent crashes with the following BIOS configuration -
Reducing EDC from 150 (Ryzen 9 7900 stock) to 120 stopped the frequent crashes. I tested EDC limits of 200, 150 (stock), 140, 130, and finally 120. 120 is the sweet spot that stops the PC from crashing.
Before reducing EDC, starting any CPU test would crash the PC. After setting EDC to 120 in the BIOS, my PC can now complete most CPU benchmark tests such as Cinebench, OCCT and CPU-Z. However, the OCCT CPU + RAM test still crashes the PC.
In search of a better solution
IMO this is a compromise, not an ideal solution, as reducing EDC reduces CPU performance by ~20% at high load; my Python app runs ~20% slower.
I'm new to this. If reducing EDC fixes the frequent crashing, does that mean the underlying cause of the crashes was excessively high peak ("spike") current triggered by the motherboard or CPU?
To AMD experts and users who have fixed similar problems: am I missing anything? Is there a better solution for frequent black screen crashes?
------------------------
More details on my troubleshooting journey
Hardware troubleshooting
After trying various fixes and swapping my hardware, the problem still persists. Here's what I did -
Reformatted and did a clean install of Windows 11
Updated my BIOS, after which the motherboard broke with a persistent red light (not sure if this problem is related)
RMA'd the motherboard and got a new one with the latest BIOS
RMA'd the CPU and got a new one
Bought a new PSU, upgrading from the Silverstone 500W to the Corsair SF750
Reseated my RAM; tested with a single stick and with both sticks in all combinations
Tested with both the onboard graphics and my discrete GTX 1070 GPU
Tested both the stock Wraith Prism fan and the Noctua NH-L12S
I suspect RAM incompatibility, but the G.Skill Flare X5 is listed as compatible RAM on the Asus B650E-I website. I prefer not to buy another pair of RAM sticks to test unless I'm certain this is the problem.
Software troubleshooting
BIOS - Tried all AMD EXPO profiles. Didn't work.
BIOS - Enabled/disabled the Memory Context Restore and Power Down Enable settings. Didn't work.
BIOS - Disabled Power Supply Idle Control. Reduced crash frequency, but the PC still crashed within 1-2 hours of use.
BIOS - Disabled Global C-state Control. Reduced crash frequency, but the PC still crashed within 1-2 hours of use.
BIOS - Disabled Precision Boost Overdrive
Windows - Installed the latest hardware drivers
Windows - Disabled all sleep options
Windows - Disabled the onboard GPU in Device Manager
Windows - Did not install Asus Armoury Crate (I read that it may cause crashes).
All of the above software and hardware troubleshooting failed. My PC still crashes frequently.
Benchmark and tests
All benchmark tests crash at EDC = 150. After reducing EDC to 120, here are the test results -
Cinebench CPU multi-core - 1379 (vs 1632 by cpu-monkey)
Cinebench CPU single-core - 109 (vs 116 by cpu-monkey)
OCCT CPU stability test
OCCT CPU benchmark test - single sse: 196, multi sse: 1155, single avx: 207, multi avx: 2056.
OCCT memory benchmark test
CPU-Z bench CPU - single 759, multi 11127 (vs single 780, multi 12106 by the CPU-Z 7900 benchmark)
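To put the scores above in perspective, here is a small sketch that works out how far each EDC 120 result sits below its reference score. The numbers are copied from the list above; the helper name `deficit_pct` is just for illustration, and the reference scores come from other systems, so the gaps are only a rough guide.

```python
# (measured at EDC 120, reference score) pairs copied from the results above
RESULTS = {
    "Cinebench multi-core": (1379, 1632),
    "Cinebench single-core": (109, 116),
    "CPU-Z single": (759, 780),
    "CPU-Z multi": (11127, 12106),
}


def deficit_pct(measured: float, reference: float) -> float:
    """Return how far `measured` falls below `reference`, in percent."""
    return (reference - measured) / reference * 100.0


for name, (measured, reference) in RESULTS.items():
    print(f"{name}: {deficit_pct(measured, reference):.1f}% below reference")
```

This works out to 15.5% (Cinebench multi-core), 6.0% (Cinebench single-core), 2.7% (CPU-Z single) and 8.1% (CPU-Z multi) below the reference scores.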
The following tests still crash the PC