0 Replies Latest reply on May 26, 2018 2:51 PM by skyfi

    Unknown system crash

    skyfi

      This one has a backstory, so please allow me to lay out the scenario.

       

      I started off with an FX6100 on an ASRock 970Extreme3 Rev1.0 motherboard and an RX470 GPU, on win7-64. I was using the chipset drivers listed on the website for the motherboard. I got the RX470 in November of 2016 and used whatever the current drivers were at that time. Everything worked without any issues whatsoever.

       

      Last year (2017), when Ashes of the Singularity: Escalation came out with the Vulkan API support and required a new release of the GPU drivers, I got those and they worked just fine.

       

      A month or two went by and I wondered if the Vulkan drivers had improved any. I saw there was an update for the entire GPU driver package, so I installed it with the 'clean install' option, as I always do when upgrading. I don't recall the build numbers off the top of my head and have long since purged the old unpacked installers from C:\amd.

       

      This is where the problems began. I would go to bed, wake up in the morning and nudge the mouse as I always do, but nothing would happen. Pressing keys on the keyboard did nothing. The Num Lock light was still lit, but pressing the Num Lock key didn't change it. Unplugging the DisplayPort cable for the monitor and plugging it back in changed nothing.

       

      Here's where it gets interesting: if I move over to my laptop and pull up cmd, I can ping the desktop computer just fine. The mapped network drive is still connected *and* I can browse it and read and write to it. I can even browse other network shares that are not mapped drives. Some part of the kernel/core of the system is still fine, but there's no way to interact with it.

       

      I tried using Remote Desktop: it connects, asks for username and password, authenticates, and begins starting the RDP session, but never gets past "loading Desktop". I've even tried using cmd to issue a 'shutdown' command across the network. The laptop says the command completed successfully, but the problem machine never acts on it, and the event logs I checked afterward show no record of it.
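
For anyone who wants to repeat this kind of check from a second machine, here is a minimal Python sketch, not something from my actual setup. It only tests whether the TCP listeners for file sharing and Remote Desktop accept a connection. The port numbers are the usual defaults (445 for SMB, 3389 for RDP), and the function and variable names are made up for illustration. It does not send an ICMP ping, and an open RDP port only shows that the listener answers, not that a session will actually load.

```python
import socket

# Assumed defaults: 445 is the usual SMB port, 3389 the usual RDP port.
# Adjust if the target machine uses different ones.
DEFAULT_PORTS = {"smb": 445, "rdp": 3389}


def tcp_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


def probe(host, ports=DEFAULT_PORTS, timeout=2.0):
    """Check each named port on host and return {name: reachable}."""
    return {name: tcp_port_open(host, port, timeout) for name, port in ports.items()}
```

Calling `probe("desktop-pc")` from the laptop (the hostname is a placeholder) returns a dictionary such as `{"smb": True, "rdp": True}` when both listeners still accept connections, which would match the state described above where the network side of the machine stays alive while the desktop is unresponsive.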

       

      Pressing the power button (which is set to shut down, not sleep) does nothing, either, so it leaves me with pressing reset.

       

      Once the system boots back up, I go into Event Viewer and there is absolutely nothing in the event logs that would indicate something had crashed or stopped working. And this weird behavior doesn't happen every time--it seems to be random. Sometimes the machine can sit idle for 3 days and bumping the mouse turns the monitor back on and gives me a desktop. Other times it happens 5 seconds after the display goes to sleep.

       

      The best way I can describe this is in Linux terms. It's like Xorg crashes once the display goes to sleep, and the human-input layer *also* crashes with it. But the kernel and the core services continue operating just fine.

       

      It gets stranger: I dealt with this for about 3 months, and eventually just set the display to never go to sleep and turned the monitor off manually, which worked. I got tired of doing this after a while and wondered if a fix had come out, so I checked for a new driver and 17.12.1 was available. I did a clean install of that and cautiously set the display to go to sleep after 5 minutes again, as I normally would, and it was fine for six months without issues.

       

       

       

      Now we get to the interesting part of this problem.

      I just got a Ryzen 2700X on a Gigabyte Aorus Gaming 7 board, with a clean install of win7. I used the chipset drivers for win7 from the Gigabyte website and the latest GPU drivers for win7 as well. It all looked fine and worked great... for about 12 hours, a full day of doing installations and updates and getting everything set back up how I like it. Then I went to bed, woke up in the morning, nudged the mouse... no response. It all happened again.

       

      So I dug out the old installer for 17.12.1 from the Macrium image of the previous install and did a clean install of that. It worked great... for about 5 days, and then the system hung three times in the course of an hour. I thought 17.12.1 would be a good one since it worked on the Bulldozer build for ~6 months without issues, but it still misbehaves.

       

      These issues did not happen *before* Vulkan, and I'm not at all blaming Vulkan--I'm merely using it as a historical reference point. Something about the drivers *after* Vulkan became hit-and-miss and doesn't seem to like win7 for me, across two clean installs, on two different hardware platforms. So I know it isn't a worn-out 5-year-old motherboard, and I know it isn't a 4-year-old win7 install that has seen a lot of things along the way.

       

      So the common variables are win7 and the drivers for the RX470. One of those two things, or a combination of both, has been acting weird since Vulkan came out, and it has been hit-and-miss. I have no idea where to even begin pinpointing it or tracking down what specifically the problem is, because as I said earlier, one driver release would do it, the next would be fine, the next would do it, and the one after that would be fine. And now one that I thought was fine is doing it, so I'm at a loss. There's nothing in the event logs that shows anything crashed, failed, or stopped working. It's all A-OK in there, right up until the messages about how the system recovered from an unexpected shutdown, which is when I pressed the reset button.

       

      I would really like to keep using 7, but it is starting to seem like the drivers are becoming less and less stable on it. That's really unfortunate. 7 just works so well, and you can never really truly get 10 to stop spying on you. Yes, I know 7 is old, but A) Gigabyte has drivers for 7 for this board, and B) AMD's chipset drivers specifically list 7 as supported. So even though MS would rather I use 10 with this new hardware, 7 should still be fine.

       

       

      So if anyone has any ideas other than "7 is so old, lol, stop using that antique OS", I can provide more info for specific questions and troubleshooting steps.

       

      edit: Also wanted to add that I'm not talking about the system going to sleep and not waking up--I don't let the system sleep. It is just that the display is set to turn off and that's it.