So after months of troubleshooting and RMAs, I am officially at a loss as to what to do. Ever since I built this machine last August I have had random BSODs and hard resets a few times a week, with no pattern to when they happen: in the middle of gaming as well as at idle or just browsing the web. The BSODs have random bugcheck codes and never identify an actual driver, always ntoskrnl.exe, and the resets interspersed between them always fail to create a minidump. I think I can recreate the crashes instead of waiting for days at a time, as the system always hard resets around 8 minutes into running with Driver Verifier turned on, with random bugcheck codes in Event Viewer and again no dumps, as it fails to create the dump files. Here's my system config:
• CPU: Ryzen 5 3600
• Motherboard: MSI MPG x570 Gaming Edge WiFi
• GPU: Nvidia GTX 970
• RAM: Corsair Vengeance LPX 16 GB (2x8) 3600 MHz C18
• CPU Cooler: Cooler Master Hyper 212 Evo Black
• PSU: EVGA SuperNOVA 750 G+ 80+ Gold
• No peripherals except for keyboard and mouse, which I have tried multiple of
And here is a list of troubleshooting and RMAs I already have tried:
• General driver debugging
• RAM memtest for 12 hours, 0 errors
• Tried another set of Corsair Vengeance LPX memory at 2133 MHz
• Both XMP and non-XMP
• No overclocks
• Tried different GPU (GTX 660)
• Replaced SSD with a new clean NVMe SSD with a brand new Windows install
• Replaced all power cables
• RMA'd CPU
• RMA'd Motherboard
• RMA'd PSU
• Tried earlier BIOS version
• Tried manual RAM timings
• Tried different power plans and various Windows settings
None of these changes stopped the system from hard resetting around 8 minutes into running with driver verifier on, which kinda baffles me since I have pretty much replaced the entire system at this point, except for trying out a new brand of RAM sticks or switching out the CPU cooler. I have no idea where to go from here, so any help would be greatly appreciated. Thanks guys.
Can you actually post a picture of it assembled? Also, I know you probably did, but check your motherboard manual and make sure you have the RAM in the right slots. Have you checked your Infinity Fabric ratio? Also, I don't like how the Hyper 212 Evo's heat pipes have no plate under them and come in direct contact with the CPU. Ryzen 3000 has its cores offset from the center, and you may only have two heat pipes in direct contact over the module. I dunno.
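For anyone unsure what the ratio is supposed to look like: as far as I know, the usual target on these CPUs is for the Infinity Fabric clock (FCLK) to run 1:1 with the memory clock, and the memory clock is half the rated DDR4 speed. A minimal Python sketch of that arithmetic follows. The function names are made up for illustration, and the FCLK readings are hypothetical values standing in for whatever your BIOS or monitoring tool reports.

```python
def memory_clock_mhz(ddr_rate: int) -> float:
    """DDR memory transfers twice per clock, so the real clock is half the rated speed."""
    return ddr_rate / 2


def fabric_ratio(fclk_mhz: float, ddr_rate: int) -> float:
    """FCLK divided by memory clock; 1.0 means the two are running 1:1."""
    return fclk_mhz / memory_clock_mhz(ddr_rate)


# The 3600 MHz kit from the first post.
print(memory_clock_mhz(3600))    # 1800.0

# Hypothetical FCLK readings, not values taken from this thread.
print(fabric_ratio(1800, 3600))  # 1.0 -> running 1:1
print(fabric_ratio(900, 3600))   # 0.5 -> fabric at half the memory clock
```

So with the 3600 MHz kit, a 1:1 setup would show FCLK at 1800 MHz.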
First, can you post all the BSOD errors (0x???) you are getting?
Second, for troubleshooting purposes only, try to run your Windows in a "CLEAN WINDOWS DESKTOP". If you get no more BSODs, this indicates a 3rd party startup item or driver that is in conflict or incompatible with Windows.
Here is how to boot into a Clean Windows desktop: https://support.microsoft.com/en-us/help/929135/how-to-perform-a-clean-boot-in-windows
It is very easy to do and undo.
Yeah sure, posted a couple pictures of the build below. One thing to note is that I went out and got an Asus TUF X570 board to fully make sure it wasn't the motherboard's fault. The system still crashes around 8 minutes into running with driver verifier. And no, I have not checked the Infinity Fabric ratio, not sure what exactly it is supposed to be.
And yeah the BSODs I have gotten have been errors such as IRQL_NOT_LESS_OR_EQUAL, KMODE_EXCEPTION_NOT_HANDLED, TIMER_OR_DPC_INVALID, DPC_WATCHDOG_VIOLATION, and SYSTEM_THREAD_EXCEPTION_NOT_HANDLED. Lately the SYSTEM_THREAD_EXCEPTION_NOT_HANDLED (0x7e) error seems to be the most frequent in driver verifier. None of them reference a specific driver. They always just say it's ntoskrnl.exe.
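Since the 0x codes were asked for, here is a small Python lookup that maps the bug check names listed above to their stop codes. It is only a sketch: the helper names are made up for illustration, and the decimal lookup assumes the Kernel-Power entries in Event Viewer report the bugcheck code in decimal, which is worth double-checking against your own event details.

```python
# Stop codes for the bug check names listed in this post.
BUGCHECK_CODES = {
    "IRQL_NOT_LESS_OR_EQUAL": 0x0A,
    "KMODE_EXCEPTION_NOT_HANDLED": 0x1E,
    "SYSTEM_THREAD_EXCEPTION_NOT_HANDLED": 0x7E,
    "TIMER_OR_DPC_INVALID": 0xC7,
    "DPC_WATCHDOG_VIOLATION": 0x133,
}


def format_stop_code(name: str) -> str:
    """Return the stop code for a bug check name as an 8-digit hex string."""
    return f"0x{BUGCHECK_CODES[name]:08X}"


def name_from_decimal(code: int) -> str:
    """Map a decimal bugcheck code back to its name, or 'UNKNOWN' if not in the table."""
    for name, value in BUGCHECK_CODES.items():
        if value == code:
            return name
    return "UNKNOWN"


for bugcheck in BUGCHECK_CODES:
    print(bugcheck, format_stop_code(bugcheck))

# 126 decimal is 0x7E hexadecimal.
print(name_from_decimal(126))
```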
And I just put in a clean NVME SSD and loaded it with a brand new clean install of Windows. Installed all Windows updates, AMD chipset drivers, Nvidia drivers, and Intel Wifi drivers, but the system still crashes around 8 minutes into driver verifier.
Thank you guys for the suggestions. I appreciate all the help I can get.
From what I have read about that last BSOD error, it can be caused by an outdated driver.
This tech website gives a few troubleshooting tips on how to resolve your BSOD error: System Thread Exception Not Handled in Windows 10 [FIXED]
Go to Windows Event Viewer under Errors and see if you can identify a driver that is causing the problems.
Also check Reliability Monitor. This is how to access that feature:
Go into Control Panel/Security and Maintenance. Under Maintenance click View reliability history. This will bring up the Reliability Monitor.
Also run DXDIAG.exe, save the output to a TXT file and upload it. That will show what files are having problems in your computer, including Windows files.
Running a Clean Windows desktop might help in finding which program startup or driver is causing the BSODs.
So something I found was that when I run driver verifier but do not target the nvlddmkm.sys driver, the system does not crash. But when it's even the only driver I verify, the system crashes around 8 minutes in. So I'm not certain, but it seems like it could have something to do with the Nvidia drivers. However, I just tried cleanly uninstalling the current driver with DDU and installing an older driver from like September 2019, but that still results in the same crash. I then tried my brother's GTX 1060 to see if it would work with a newer GPU, but it still has the same crash around 8 minutes in.
Also, Reliability Monitor didn't really help, it just listed all of the kernel power failures that I already saw in Event Viewer. And the Event Viewer doesn't really show anything unusual before the crashes.
So does anyone know if there are current incompatibilities between Nvidia Drivers and the current AMD chipset or anything? That's the only thing I can think of right now.
I have installed an Nvidia GTX 1070 GPU card with the latest Nvidia driver on an AMD 990FX (AM3+) motherboard without that type of issue. Other users with newer Ryzen motherboards also have Nvidia GPUs installed without that problem.
I can't see how the AMD chipset drivers would conflict with a graphics driver.
It is possible another driver on your computer is incompatible with Nvidia graphics driver.
What does DXDIAG.txt show? Can you upload that file to your reply? That deals with anything to do with Graphics and other files. It is sometimes quite useful. It may indicate something you missed in Event Viewer.
By the way, RAM incompatibilities won't show up in MEMTEST86. All a clean pass proves is that the RAM is not physically defective, not that it is compatible. Is your RAM on your motherboard's QVL by any chance?: Support For MPG X570 GAMING EDGE WIFI | Motherboard - The world leader in motherboard design | MSI G...
Sometimes the RAM's timings and speed may be causing problems. Is your BIOS set to "Factory Default"? If not, then do a CLEAR CMOS (follow the manual) to put the BIOS back to factory default.
EDIT: I keep mentioning booting into a Clean Windows Desktop. Once you disable all 3rd party Startups except Nvidia Driver if it shows up and you have no BSODs, then you can, by process of elimination, start "enabling" some of the 3rd party Startups until it starts to cause BSOD in your computer.
This won't harm your computer and is a very good, easy way to find out if a program or driver is incompatible.
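The process of elimination described above goes faster if you re-enable the startup items half at a time rather than one by one. A minimal Python sketch of that halving idea follows, assuming exactly one item is at fault and the crash reproduces reliably. The item names and the `crashes_with` check are hypothetical; in practice that check is you enabling those items, rebooting and waiting to see whether a BSOD shows up.

```python
from typing import Callable, List, Sequence


def find_culprit(items: Sequence[str],
                 crashes_with: Callable[[List[str]], bool]) -> str:
    """Narrow a list of startup items down to the single one that causes the crash.

    Each round enables only the first half of the remaining candidates.
    If the crash appears, the culprit is in that half; otherwise it is in the rest.
    """
    if not items:
        raise ValueError("need at least one item to test")
    candidates = list(items)
    while len(candidates) > 1:
        half = candidates[: len(candidates) // 2]
        if crashes_with(half):
            candidates = half
        else:
            candidates = candidates[len(half):]
    return candidates[0]


# Hypothetical startup items; "Updater D" plays the bad one in this example.
startup_items = ["Tool A", "Service B", "Helper C", "Updater D", "Agent E"]
print(find_culprit(startup_items, lambda enabled: "Updater D" in enabled))
```

With five items this takes three rounds of rebooting instead of up to five, and the saving grows with longer lists.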
By any chance, is the PCIe x16 slot your GPU is installed in set to PCIe 4.0 or PCIe 3.0 in the BIOS?
If it is set on PCIe 4.0 that might be the reason for your Nvidia GPU card driver having issues: Specification for MPG X570 GAMING EDGE WIFI | Motherboard - The world leader in motherboard design |...
Thanks guys for all the help. After wanting to take a break from troubleshooting this, I connected up my old HP prebuilt with some proprietary HP intel motherboard and a GTX 660 in it to game a little. Out of curiosity, I ran driver verifier on that machine as well and again only selected the nvlddmkm.sys driver to see if it would have the same behavior as my custom built machine. And wouldn't you know it, the prebuilt also crashes around 8 minutes in. Completely different machine has the same driver verifier crashes with the nvidia driver. So at this point I was stunned and asked my brother, who has his own Intel machine with a GTX 1060, to run the same verifier test, and once again his system also crashes around 8 minutes in. So it seems like this verifier test that I have been using to see if my random resets and BSODs are gone wouldn't have worked no matter what I did, it must just be some weird thing with Windows and this driver that don't like each other? For anyone else who wants to try the same test, I ran driver verifier with all settings selected except the DDI compliance checking and only selected the nvlddmkm.sys driver. I guess I will just have to actually use my machine normally for a few weeks to see if the random BSODs are gone now. I will let you know if anything pops up guys, thanks so much again for the help.
And yeah I had tried running the PCIe slot in Gen 3 mode, disabled all the non-Microsoft services and all the startups through the Clean Windows Desktop, and set the whole BIOS to default. None of those helped, although as per above, I don't think anything would have fixed this verifier crash.
Sounds like a bug either in Windows or Nvidia Driver causing it to crash in Driver verifier.
Unfortunately, it misled me completely in giving you troubleshooting suggestions.
Maybe if you have time you might want to notify both Microsoft Support and Nvidia Support about what you have found out.
I don't know if what you found can be replicated by Microsoft or Nvidia, but if it is a common bug then any users running Driver Verifier against Nvidia drivers will be fooled into thinking the same thing you did.
Have a nice day and take care.
I know I'm replying to an old post, and I'm not attacking anyone in particular here, but I'm just going to leave this reply for reference in case someone thinks that running Driver Verifier will help them.
Driver Verifier is really meant for DEVELOPER USE, not end user use. Driver Verifier is designed to CAUSE BLUE SCREENS when the monitored driver performs an 'illegal operation', when that illegal operation may or may not have caused a blue screen or any other problem in normal use. (And all this info is literally right there on MS's Web page for Driver Verifier....lol)
So for us 99.9999% end users who are not driver developers and have neither the knowledge nor tools to diagnose and debug drivers, Driver Verifier does not help, and in fact makes things worse because it can lead you away from the cause of the real problem.
All those Nvidia systems from a year ago that blue screened running DV weren't necessarily unstable; all it means is that the drivers performed an instruction or function call that MS determined to be illegal. This isn't that uncommon, a lot of software uses undocumented/unsupported/'illegal' operations, often as a workaround for issues with Microsoft APIs or the underlying OS itself.
TLDR - don't use Driver Verifier.
Most, if not all of these problems are occurring on MSI motherboards and I don't think anyone has mentioned updating the BIOS yet, or has it just been assumed? I updated mine and even tried a beta version released a month or so ago. To test whether this had done the trick, I put one of the Nvidia cards back in and it seemed to have worked - for a few weeks at least.
Then I developed a new problem when randomly at boot up, the fans would be abnormally loud and the boot up process would freeze. Switching off and restarting would cure it, but it got annoying and I reverted back to the old Radeon card, which worked but leaves me in limbo again.
I've given up trying now and will probably go for the Ryzen 5 5600G
I updated the BIOS to the latest stable version, which didn't help much. Right now I am installing drivers one by one on the clean Windows system and letting it sit for around 24 hours between each. So far no BSODs, and only the Radeon GPU drivers are left to install. It will be fun if there are no BSODs after I install them.
OK, just wanted to add that I reinstalled a fresh copy of Windows on another NVMe drive and the BSODs are gone. So it was either something messed up in the drivers or a faulty disk drive. I will probably test the old one in the near future and reply here.
I have an update. It seems the BSODs are Windows Update related. Removing KB5005565 fixed the issue for me and the PC was fine for 2 weeks with updates paused, but right after it was auto-updated the issues reappeared. Then I rolled the update back and now everything is fine again.
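The post doesn't say how the update was removed. One common route is the Windows Update Standalone Installer, `wusa.exe`, with its `/uninstall` and `/kb:` switches. The Python sketch below only builds and prints that command line rather than running it. The helper name is made up for illustration, and whether a particular update can be removed this way is an assumption you would need to verify on your own system.

```python
from typing import List


def wusa_uninstall_command(kb: str) -> List[str]:
    """Build (but do not run) a wusa.exe command line to uninstall one update.

    Accepts either 'KB5005565' or '5005565'.
    """
    number = kb.strip().upper()
    if number.startswith("KB"):
        number = number[2:]
    if not number.isdigit():
        raise ValueError(f"not a KB number: {kb!r}")
    # /norestart leaves the reboot to you instead of restarting automatically.
    return ["wusa.exe", "/uninstall", f"/kb:{number}", "/norestart"]


print(" ".join(wusa_uninstall_command("KB5005565")))
```

The printed line would be run from an elevated command prompt, followed by a reboot.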
This is sounding depressingly similar to what I was suffering with the MSI x470 gaming plus and Ryzen 5 3600 combo PC I recently built. It seems to be definitely graphics card related and I was on the verge of an RMA or two when I tried a last desperate move of putting an old Radeon HD 5770 (1 GB) in it.
No more random crashes/shutdowns, just a working PC at last after it refused to behave with a GTX 660 Ti, a GTX 760 and another lesser Nvidia card I had in my spares. I'm not a gamer and had no reason to build such a powerful PC, apart from wanting to and having a bit of spare cash for once. So while I'm waiting for the current crazy prices and availability of graphics cards to settle down, this is an acceptable limbo situation for me at the moment.
Obviously using an old card won't suit gamers and I envisage a whole lot of new pain when I do get round to buying a more modern one. My best option is to swap the CPU for a Ryzen 5 3400G, which is what I was going to do till Curry's (and other dealers) slapped a £40 increase on the G CPUs.
I should also mention I'm an IT tech and have built dozens, possibly hundreds of PCs over the last 30 years, so I've checked the obvious and not so obvious things many times. I'm also a long term Linux user but decided to put Windows 10 on first just out of curiosity more than anything, but quickly discarded it when the problems started happening. Currently running MX Linux with no problems. I can't even sell it with a clear conscience, so I'm looking for a long term fix too, but I'm hoping the 3400G will do that and I can sell it as a light gamer.
Struggling with the same issues myself. Random BSODs occur every few hours. This is my wife's PC, but she wasn't using it much as she still has all her stuff on an old machine and never had time to move to the new one completely; it was connected to the same monitors via a secondary input and used mostly to run some VMs. The most annoying thing is that those BSODs can't be replicated, I just have to turn it on and wait for them to happen. The system was running fine for a few months, at least she hadn't noticed any issues, and it probably all started after another Windows update (but it could be that those BSODs were happening before that, just rarely, so no one noticed as the PC was barely used). So far tried: Memtest, all good; BIOS update to the latest version; GPU swap from a 7970 to a 100% working 7950; PSU swap; different GPU/chipset/other drivers; disabled VMware virtual network adapters. Tried booting from an Ubuntu live CD: had some kernel crashes with the 7970 for some reason, but after swapping to the 7950 it was working just fine, at least for a few hours. But when I booted back into Windows I got a BSOD again overnight, so probably not related.
Will try clean windows install on spare SSD and will also check the PCI-e bios setting.
Ok, just got BSOD 5 mins after clean windows install on a new SSD. Haven't even managed to download AMD chipset and GPU drivers. The hunt continues...
Current specs btw:
Ryzen 5 3600
MSI X570-A PRO (Bios: 7C37cHE)
2x 16GB Corsair CMK16GX4M1A2666C16 RAM sticks
Windows 10 Pro (Build 19042)
ADATA SX8200 PRO 1TB SSD
Toshiba HDWE140 4TB HDD
Ok, got just one BSOD right after windows reinstall, afterwards - all good for 24 hours (clean system, no vendor drivers, just the ones auto installed from windows update). Will now install X570 chipset drivers and let it sit for another day...
Quick update - I swapped the CPUs between my PC (which was working without any issues) and my wife's, and her BSODs are gone, but now I have them.