My system has a Ryzen 7 1700X and an RX480, and it's about six and a half years old.
I've been having issues with general system instability. Things have been slowly escalating for over a year, but I've been putting off dealing with it because life is busy and, until the last few weeks, the disruption was something I could live with.
At the peak of these issues there were a lot of different ways the system could go down, and various states it could be left in.
Over the past couple of weeks I've addressed a few of the issues with the system. I've re-pasted the graphics card and replaced its thermal pads; it was running at 70°C while idle and getting up to 90°C while gaming, and the temperatures are a lot more reasonable now. I've also updated my BIOS and made sure my drivers are up to date in the Adrenalin software.
I have one issue remaining after doing all of this.
The first symptom is always a momentary monitor outage. This isn't a flicker, and there is never a burst of them. I have 3 monitors and all of them go dark, exactly once, for a brief moment.
This then repeats very infrequently, maybe 3 or 4 times per hour, and it can carry on for anywhere between 40 minutes and 3 hours. Then suddenly the whole rig ceases to output anything: all the monitors disconnect and my USB headset disconnects. The PC remains powered on, however, and after a hard reset things are usually normal.
It doesn't matter if I'm gaming or not; it can happen when the PC is almost entirely idle. This week everything was fine until yesterday evening, when it happened twice. This morning the little micro-outages are ongoing as I type this.
Looking at the Windows Event Viewer, one thing looks really odd: logs about the surprise removal of a disk. This would be expected when one of those failures happens; if my USB ports turn off (the headset disconnects, remember) then that should be my USB drive. But the log can specify either Disk 1 or Disk 2, and it's not 1:1 with failures. I've had 2 just this morning while the micro-outages are ongoing and only one yesterday evening, which does match up in time with one of the failures, but there was nothing for the other. I wonder if the mix of Disk 1 vs Disk 2 is just my USB drive changing numbers, or if this hints at issues between my processor and M.2 drive? I've unplugged my USB drive for now and I'm going to see if those errors persist or stop entirely.
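To check properly whether the disk-removal events line up with the failures, a rough Python sketch like the one below would do. The timestamps are made-up placeholders, not my real logs, and the function name and the 5-minute window are just my own assumptions. The real times would have to be copied out of Event Viewer by hand.

```python
from datetime import datetime, timedelta


def match_failures_to_events(failures, events, window=timedelta(minutes=5)):
    """For each failure time, list the disk events logged within `window` of it."""
    return {
        failure: [e for e in events if abs(e - failure) <= window]
        for failure in failures
    }


# Placeholder timestamps for illustration only (not real log data).
failures = [
    datetime(2024, 1, 1, 19, 30),
    datetime(2024, 1, 1, 21, 45),
]
disk_events = [
    datetime(2024, 1, 1, 19, 31),
    datetime(2024, 1, 2, 8, 10),
    datetime(2024, 1, 2, 9, 5),
]

matches = match_failures_to_events(failures, disk_events)

# Failures with no disk event nearby suggest the two are not tied together.
unmatched = [f for f, hits in matches.items() if not hits]
print("Failures with no nearby disk event:", unmatched)
```

If most failures end up in the unmatched list, the disk log entries are probably a side effect rather than the cause.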
Most of the graphs in the Adrenalin software look normal? Not that I really have a frame of reference for that.
The CPU and GPU temperatures this morning are both a bit higher than they have been lately. I would wonder if that's because I put the side of the case back on yesterday, but they were both lower than that yesterday evening.
The GPU memory clock speed graph is confusing? What is this showing me? Why are there 2 lines? Is my memory dipping slower? Shouldn't the speed just be the speed and thus constant? Why graph this?
I picked "processor" for this question because the USB headset disconnecting pushes me toward that being the cause; I don't see how a GPU failure could cause that. These forums have the same issue as AMD's bug report form, which I was seeing a lot of before I repaired the GPU and flashed the BIOS: everyone is expected to be an expert on their own problems and "I don't know" isn't an answer. So sorry if this is in the wrong place.
I'm out of diagnostic tools that I know about, and I'd at least like to narrow this down to a single hardware component so I know what to replace.
Should I open the PC up again and re-paste the CPU? The temps looked OK to me, but peaks of 73°C while idle seem suspect.
The micro-outages seem to have stopped this morning while I was typing all of this, which is new. They've always ended in a failure up until now.