Terms
Shutting off = black screen, PC still on (people on Discord can still hear me, keyboard still works), card's light turns from blue to red.
Black screen = same as "Shutting off".
Reseat = get the component out of the motherboard and place it back where it was
Context before this started happening:
In a single day (11 September 2024):
- I updated the Adrenalin version from 23.11.1 to the latest one 24.1.1;
- Windows updated from 19045.4842 to 19045.4894;
- I installed a newly bought M.2 SSD next to the graphics card. To install the M.2, I had to reseat the card (I needed space for my fingers to reach the plastic clip that holds the M.2).
Components
Motherboard: B550M DS3H
BIOS version: F1 now (latest before Windows reinstall, I think)
CPU: AMD Ryzen 7 5800X
GPU: XFX AMD Radeon™ RX 580 GTS XXX Edition 8GB
RAM: 32GB = 1x Kingston FURY Beast 16GB, DDR4, 3600MHz, CL18 + 1x Corsair Vengeance RGB Pro 16GB (1x16GB), DDR4, 3600MHz, CL18, 1.35V
Power Supply: Seasonic S12II-620 Bronze 620W
Monitors:
- Dell 27" 4K UHD USB-C Monitor (S2722QC), connected with a normal HDMI cable
- AOC C24G2U/BK (LED VA, 23.6", FHD, 165Hz, 1ms, FreeSync2, FrameLess, HDMI, DisplayPort, Pivot), connected with a normal DisplayPort cable
Symptoms:
On the night of 11 September I had the first black screen on both monitors.
It was just before going to bed, so I didn't pay much attention to it. The next day I started the PC and it worked. Over the next few days this repeated: one black screen per day, and a restart would fix it (I didn't check whether the GPU light was red or blue).
As the days went on, the GPU started shutting off more frequently, to the point where the PC would stay on for only about 30-40 minutes before a black screen hit. A restart no longer fixed it because the GPU light stayed red, so I had to reseat the card and swap the PCIe power cables to get the blue light back (i.e. be able to use the PC).
The AOC would not show any error, while the Dell would always show "no HDMI found, going into sleep mode".
I tried to:
- Upgrade to Windows 11
- Downgrade Windows to 19045.4842 and Adrenalin to 23.11.1
- Reinstall Windows (only the C partition)
- Swap the 6-pin and 2-pin PCIe power cables - by using the pair I had never used before, I managed to run the PC for about 3.5 hours without a black screen (this is the most recent attempt)
- Reseat the GPU
- Reseat the PCIe cables
- Place the M.2 into each of the two slots
- Reseat the motherboard battery and the RAM
- Open and clean the GPU: change the thermal paste and thermal pads.
My own investigation results
In the Event Log I always find this error after a black screen and restart: "SCEP Certificate enrollment initialization for local system". I read about this error on this forum and on the Microsoft forum as well, which is why the Windows update was my main suspect. After trying all the Windows options (downgrade, upgrade, reinstall) and finding that none of them solved anything, I don't know what to think about this anymore.
I also suspect there's something weird going on with the display driver "amdwddmg", because it sometimes appears in the Event Log as "it stopped responding and has successfully recovered". I didn't investigate this path any further.
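To see how often the driver recovery message lines up with the black screens, the events could be filtered by their message text. This is only a minimal sketch: it assumes the events were already copied out of the Event Log as (timestamp, message) pairs, and the function name, the ISO timestamp format and the input shape are my own placeholders, not something Event Viewer produces as-is.

```python
from datetime import datetime

# Text fragment taken from the message quoted above.
RECOVERY_TEXT = "stopped responding and has successfully recovered"


def driver_recoveries(events, driver="amdwddmg"):
    """Return sorted timestamps of events that mention a driver recovery.

    `events` is an iterable of (timestamp, message) pairs, where the
    timestamp is an ISO 8601 string (an assumed format, e.g.
    "2024-09-12T21:40:00"). Only messages that contain both the driver
    name and the recovery text are kept.
    """
    hits = []
    for timestamp, message in events:
        if driver in message and RECOVERY_TEXT in message:
            hits.append(datetime.fromisoformat(timestamp))
    return sorted(hits)
```

Comparing the returned timestamps with the times I noted for each black screen would show whether the driver recovery happens every time or only occasionally.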
I suspect that the main issue is with the PCIe power cables, but how would I test this hypothesis?
Potential solution (for now)
The GPU uses 6+2 pins (I have 2x2 pins and 2x6 pins) and I noticed that swapping these cables can make the black screens less frequent. I'm currently using the pair I had never used before, and the system hasn't had a black screen for 3.5 hours (the longest time in the last few days).
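To compare the cable pairs more systematically than by memory, I could note which pair was plugged in and how long the PC ran before each black screen, then summarize per pair. The following is a rough sketch under that assumption. The function name, the "old"/"new" labels and the record layout are hypothetical, and the numbers in the usage example are only illustrative values based on the durations mentioned above.

```python
from collections import defaultdict
from statistics import median


def summarize_uptime(sessions):
    """Group uptime records by cable pair and summarize them.

    `sessions` is an iterable of (cable_pair_label, minutes_until_black_screen).
    Returns a dict mapping each label to its run count, median uptime
    and shortest uptime, all in minutes.
    """
    by_pair = defaultdict(list)
    for pair, minutes in sessions:
        by_pair[pair].append(minutes)
    return {
        pair: {
            "runs": len(values),
            "median_min": median(values),
            "shortest_min": min(values),
        }
        for pair, values in by_pair.items()
    }


if __name__ == "__main__":
    # Illustrative entries only: roughly 30-40 minutes on the old pair,
    # about 3.5 hours (210 minutes) on the previously unused pair.
    log = [("old", 30), ("old", 40), ("new", 210)]
    for pair, stats in summarize_uptime(log).items():
        print(pair, stats)
```

A single long run on the new pair doesn't prove much by itself. Several runs per pair, ideally swapping back to the old pair at least once, would make the comparison more convincing.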