Hello all, I hope this is the right place.
After building my own machines since the eighties, I had Micro Center build me an AMD-based Windows 11 system to replace my previous 10-year-old workstation that I use for software development. I got it in June and have been struggling with it for months.
TL;DR: I'm fairly confident that something in the graphics system is causing my machine to wig out every other day or so, and I strongly believe it's related to the AMD drivers.
Full system specs are below, but the key detail is that I have 2x AMD Radeon PRO W6600 workstation graphics cards.
It feels like it's the window manager: when The Thing Happens(tm), windows become increasingly unresponsive. Some apps I can close right away; others start closing when I click the red [X] but then freeze, and after that I can't interact with that app any more, even to close it.
In Windows Explorer, I can bring up the Start Menu, and clicking the power icon shows it toggling, but nothing actually happens: it's like window messages are getting lost.
Task Manager won't come up (Ctrl+Shift+Esc).
Eventually the keyboard/mouse become entirely unresponsive, and I have to do a hard reset.
About 2 months ago, a Micro Center tech recommended running Display Driver Uninstaller (DDU), after which I installed the AMD driver-only package rather than the full Adrenalin thing that has all the flashy bits.
It still wigs out every other day or so, but the symptoms changed enough after the driver swap that it very much points to an AMD driver issue and not a generic hardware / Windows thing.
Now I'm able to actually request a shutdown of Windows, but it hangs indefinitely showing "Restarting...". Once I let this sit overnight and still had to hard reset it in the morning. It has never completed a restart on its own when The Thing Happens.
I have found no correlation between The Thing Happening and any behavior on my part, and a few times it's happened while I wasn't touching the keyboard or mouse. I cannot reproduce this on demand; it just happens when it happens.
Bizarrely, once while on a Zoom call, the video froze (along with all the other wiggy symptoms), but I was still able to talk with my colleagues on my USB-based headset until I hard-reset the machine.
As a side issue, about once a day the audio will do the Brrzzzt stuttering thing for half a second, unrelated to anything I can identify. The machine is never taxed on any metric (CPU or memory), and a 16-core machine shouldn't be doing this. It may be unrelated, and I can live with it.
System specs:
- AMD Ryzen 9 5950X CPU (16 core, 3.4GHz, AM4 socket)
- ASUS X570 Prime motherboard
- 128G RAM
- 2x 2TB Samsung 890 PRO NVMe SSDs
- 2x AMD Radeon PRO W6600 workstation cards (4 monitors each)
- Corsair all-in-one liquid cooler
- Fractal Design case
- Windows 11 Pro
- EVGA 850 GA power supply
- 4x ViewSonic monitors (will soon go to six)
  - 3x VX2778, 2560 x 1440 @ 60 Hz, 8-bit color
  - 1x VX3211-2K, 2560 x 1440 @ 60 Hz, 10-bit color
I use this as a pro workstation; mostly software development, some 3D modeling with Fusion 360, the usual set of business apps (QuickBooks, MS Office, etc.).
This machine is much bigger than I need, but I want something to last me a long time.
My use cases:
- no gaming
- no crypto-mining
- no overclocking or BIOS performance tweaking
- no video editing/rendering
- no hacked/cracked software
- basically the same software in my old desktop
- running as a non-admin user (I type my admin pass to elevate)
The machine is on a good UPS, and we have generally very clean power here at home.
Micro Center - whose flagship Tustin store is walking distance from my house - did an excellent job helping me spec the components and putting it all together: it's really a work of art (though no LED bling for me).
TROUBLESHOOTING STEPS TAKEN:
- I've swapped around the graphics cards, including running one at a time. Problem is the same in any combination of one or two cards. I don't believe the W6600 cards are defective.
- I temporarily moved the AMD Radeon Pro WX4100 graphics card (also 4 heads) from my old Windows 10 system. No change.
- Note: when I swapped hardware, I always wore an anti-static strap; I also do electronics at my desk and have a decent grounding system.
- Replaced the USB hub in case that was dorking keyboard/mouse: different brand. No change.
- One of my monitors was a little wiggy on my old system, so I replaced it to rule that out. No change.
- All software (Windows, AMD, ASUS) has been kept consistently up to date.
- An exhaustive look through the event logs shows nothing actionable.
- I bought a pair of older two-head PNY NVS 310 cards (NVidia) to see if taking the AMD drivers out of the loop would help, but it turns out they don't boot Windows 11. Duh.
- I talked w/ Micro Center, who was helpful, but they didn't sell the Radeon cards and can't really do much to support them without me leaving the machine with them for a while.
- I've just started leaving the Ryzen Master software open all the time so that if it wigs out, maybe I'll see something useful. I've never seen the system really taxed no matter what I'm doing.
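Along the same lines as leaving a monitor open, a heartbeat log written straight to disk might at least pin down when things start going sideways, since whatever is on screen is lost at the hard reset. This is only a minimal sketch, not something I've validated against The Thing: the function names, the file name and the 5-second interval are all made up for illustration, and it assumes Python is installed. It appends a timestamp every few seconds, forces each line to disk, and flags any gap much longer than the interval.

```python
import os
import time
from datetime import datetime


def heartbeat_line(now, gap_s, expected_s, tolerance_s=1.0):
    """Format one log line; mark gaps much longer than the expected interval."""
    stalled = gap_s > expected_s + tolerance_s
    flag = " STALL" if stalled else ""
    return f"{now.isoformat(timespec='seconds')} gap={gap_s:.2f}s{flag}"


def run_heartbeat(path, interval_s=5.0, beats=None):
    """Append a timestamped line every interval_s seconds.

    beats=None runs until the process is killed; a number stops after
    that many lines (handy for trying it out).
    """
    last = time.monotonic()
    count = 0
    with open(path, "a", encoding="utf-8") as log:
        while beats is None or count < beats:
            time.sleep(interval_s)
            now = time.monotonic()
            log.write(heartbeat_line(datetime.now(), now - last, interval_s) + "\n")
            log.flush()
            # Ask the OS to push the line to disk so it survives a hard reset.
            os.fsync(log.fileno())
            last = now
            count += 1
```

Calling `run_heartbeat("heartbeat.log")` in a console window and leaving it running would be the idea. After a hard reset, the last timestamp in the file gives a rough time at which the process stopped being scheduled or could no longer write, and any STALL lines before it would suggest the system was already hiccuping. That time can then be lined up against the event logs.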
WHAT I'M TRYING TO AVOID:
- Leaving the machine with Micro Center for troubleshooting; I have a large amount of customer confidential information on this machine, and I'd be without my main work machine for days (I am a self-employed software consultant)
- Reinstalling Windows. This will be a multi-day ordeal to get everything re-configured again, though of course if this will solve the problem I'll go through it.
- Buying some alternate video card(s). I have had bad luck w/ NVidia in the past, and in any case I'm way beyond the return period from Newegg, so I'd be stuck with $1500 in video cards I can't use.
- Putting up with this indefinitely.
- Throwing myself under a bus to end it all
I intentionally and enthusiastically went with an all-AMD system, and when it's working it's *really* nice (fast, smooth) but this every-other-day wigging out is really painful and tiring. It's done this since ~day two.
I also have decades of experience with Windows system-level software development (services, Windows print drivers, communications controllers, etc.), so I have decent tech chops, but I don't know where to look for this.
Any suggestions for how to troubleshoot this further? This whole thing makes me want to cry.
Thank you ~~~ Steve