@crayraven thanks for the tips. It's possible we're on to something with the memory. It's also strange how many of us have G.Skill.
@koguma I could have sworn my RAM was on the QVL for my board, but when I noticed it wasn't on the QVL on G.Skill's website for my board, I double-checked. It was only on the QVL for the 3000 series, not the 5000 series.
It's now on the QVL on G.Skill's website for my mobo again. I find it very strange that the 5000 series is having more trouble with RAM than the 3000.
Maybe I made a mistake; I wish I'd documented it. Or maybe ASUS removed it after issues were reported.
So yeah, I might need to mess with RAM timings. What's throwing me off, though, is that it's acting like a CPU or PSU fault: I'm not getting any blue screens, etc.
I guess Ryzen is different with Infinity Fabric and picky with RAM, many say it is, but does that cause what we are seeing?
I also found some F4-3200C14D-16GFX for $200 locally that has just come back into stock. It's the only 3200MHz CL14 memory I could find. I could go for 3600MHz on the QVL, but that would require a lot of searching and the CPU still might not like it. Anyway, I'm thinking of getting this just as a test to rule out RAM issues. However, so far it's not crashing with my current tweaks, so I need to work out whether it still crashes before I make a change.
Awesome analysis ZiN
After playing around yesterday, setting the power options (balanced plan, processor power management) to 10 and XMP to 3000MHz has made my 5600X stable. I get a BSOD when XMP is set to 3200MHz, which I don't understand. Maybe it's CPU/memory related, something to do with the BIOS or the Ryzen 5000 architecture.
I think Zen 3 is very picky when it comes to memory. When I first upgraded my system from a Ryzen 1800X and X370 platform back in June, I used a 16x4gb Corsair Dominator kit. While it had been very stable for the past 4 years with my old X370 platform, it was not with my 5950X and X570.
I got WHEA 18s almost immediately, even with cmd 2T, which I thought was weird since it worked with my old platform just fine. Because of the WHEA 18s I decided to get a G.Skill 64GB kit, thinking it was just a RAM incompatibility with the new platform. When I installed the kit everything seemed fine, until I got the WHEA 18 BSOD again and again. So I really started to think it was a voltage issue with the CPU when it's idle or doing light-load tasks.
I now believe it's a RAM issue. I don't believe the CPUs that were RMAed were faulty. Maybe a few of them, but not as many as we think. I think the main issue is that the XMP profile doesn't set 2T with gear down mode disabled when it should. This causes people to believe something is wrong with their system, and eventually to think the CPU is broken. It's surprising that, out of all the threads I read, no one suggested the RAM could be set to a cmd rate it couldn't run stably at. I mean, I didn't even think of that until I started putting all the facts together to form a picture.
So far I've had my pc on for 24 hours straight. Not a single whea 18. It feels great. I hope this solution actually works for everyone. I know I did NOT want to RMA my cpu or board. I'm sure most did not.
What have you done exactly to the ram setting to have it stable for over 24hrs?
In your dram settings, disable gear down mode and change cmd to 2T instead of keeping it on auto.
If both are on auto then gear down mode will be enabled by default and cmd will be set to 1T by default. My view on what is happening is that sticks like mine put a lot of strain on the Ryzen memory controller, so it can't handle 1T or 1.5T (that is, 1T with gear down mode enabled).
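To make the combinations above easier to follow, here is a minimal Python sketch of how the two settings interact as described in this post. The names `DramSettings`, `AUTO_DEFAULTS`, `SUGGESTED` and `effective_command_rate` are made up for illustration, and the "1.5T" label is just the shorthand used above for 1T with gear down mode enabled. It is not how any BIOS actually reports things.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DramSettings:
    """Hypothetical model of the two BIOS options discussed in the post."""
    gear_down_mode: bool
    command_rate: str  # "1T" or "2T"


# What the post says "auto" resolves to: gear down mode on, cmd rate 1T.
AUTO_DEFAULTS = DramSettings(gear_down_mode=True, command_rate="1T")

# What the post suggests setting manually instead.
SUGGESTED = DramSettings(gear_down_mode=False, command_rate="2T")


def effective_command_rate(settings: DramSettings) -> str:
    """Return the command rate using the post's shorthand.

    Assumption: only the combinations mentioned in the post are modelled.
    1T with gear down mode enabled is labelled "1.5T"; anything else is
    returned as configured.
    """
    if settings.command_rate not in ("1T", "2T"):
        raise ValueError(f"unexpected command rate: {settings.command_rate}")
    if settings.gear_down_mode and settings.command_rate == "1T":
        return "1.5T"
    return settings.command_rate


if __name__ == "__main__":
    print("auto:     ", effective_command_rate(AUTO_DEFAULTS))
    print("suggested:", effective_command_rate(SUGGESTED))
```

So leaving both on auto lands you on the "1.5T" case, and the manual change moves you to a plain 2T.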
I will be RMAing my CPU and going back to Intel/10850K. I can't keep on changing settings every day like I have for the past 3 months.
@crayraven careful not to assume we've sorted it until we've proven it but we might be on to something.
My second CPU clearly died, but it was working perfectly before that. It was also easy to determine once it had fully gone.
The instability is a bit more unknown.
The 5000 series was supposed to handle faster memory.
Please let us know if you're still good in a week or two, and then again in a few months. These things tend to come back.
This may not solve everyone's issues, but it'd be interesting to see how many people this helps. It'd be kind of similar to other methods people use, but safer than boosting voltages.
this is just my experience. FMMV !!
There's something about AMD chips that has not changed over time. I have used AMD on and off for a long time. Just like with earlier ones, for my Ryzen 5000 (I have a 5950X with an X570 Elite GB MB), having the memory settings "correct" was key to having a stable system. I am not a heavy overclocker, nor a gamer or anything, but I do need a reliable system for my work. For this setup, I started with 2 DIMMs (2x 32GB). I got what I needed done and got to a stable place. Then I added 2 more (2x 16GB), making a total of 96GB. (Side note: I do have another 2x 32GB, but they have chips with different timings and I just could not get them to play nice with the other 2.) That's when some issues started to show up.
At first, the problem was random reboots (mostly during idle), and there was one benchmark that I knew would make the system reboot. All other testing etc. passed fine. To resolve this last issue, all I had to do was push vSOC to 1.2V and vSOC LLC to "medium". RAM timings etc. were all left at the manufacturer's stated values for 1800MHz (so DDR 3600 settings). For good measure, I have VDDG CCD 1015, VDDG IOD 1045. With these, I have no more issues and it's been that way for months. Since I have the system stable, I bumped the fabric/mem clock to 1833 to be a bit faster, with no other changes in voltage etc. Even with that bump, the system is having no issues at all. I also have all-core CO at -15 (too lazy to get that dialed in for all 16 cores). On the memory front, since I have just about everything at default, gear down mode is enabled with 1T. BGS is disabled, BTW.
Yes, this is not the fastest memory, but for my needs I'd rather have more memory than a smaller amount of faster memory. And yes, I have no WHEA or any other issues.
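For anyone confused by the 1800 vs 3600 numbers above, here is a quick Python sketch of the arithmetic. DDR memory transfers data twice per clock, so the rated speed is double the memory clock. The function name `ddr_rating` is made up for this example, and it assumes the fabric and memory clocks are kept equal, as in the post.

```python
def ddr_rating(mem_clock_mhz: int) -> int:
    """Double data rate: two transfers per memory clock cycle."""
    return 2 * mem_clock_mhz


# Memory clocks mentioned in the post (fabric clock assumed to match).
for clock in (1800, 1833):
    print(f"{clock} MHz memory clock -> DDR4-{ddr_rating(clock)}")

# Total capacity of the four DIMMs described in the post.
total_gb = 2 * 32 + 2 * 16
print(f"total installed: {total_gb} GB")
```

So the bump from 1800 to 1833 is roughly going from DDR4-3600 to DDR4-3666.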
After a couple of days I finally got a BSOD. Here is the interesting thing: it's a Kernel 41 + system service exception tcpip.sys BSOD, not a WHEA 18. I have no idea what that means.