This tech site gives some very good tips on what causes WHEA LOGGER Error 18: Event ID 18: Microsoft-Windows-WHEA-Logger - TechNet Articles - United States (English) - TechNet Wi...
Thanks for the update and quick reply. I'll be sure to keep an eye on this thread. I have the same issue and bumped into your thread while looking for answers. Thanks for creating it. Looking forward to a solution.
@Lawrence06 nice find. It doesn't mention an error detected on the Infinity Fabric by the memory controller on the CPU. This could be affected by voltage or clock speeds if using DOCP or speeds over 3200 MHz.
I hate to think it's a RAM speed issue, because nobody seems to know about it. Even when I asked at the computer store or contacted AMD support, they never said RAM was the issue. They market higher speeds, yet mention nothing about them not being compatible or voiding your warranty (which I think is BS).
Still something to rule out, but it made no difference for me.
My first CPU was unstable and slowly got more stable with updates. The latest chipset driver I tried was from March, and the latest BIOS from April/June, before I gave up waiting. The last WHEA error I got was on April 1st.
My second CPU was perfect for 2 months; I tried to reproduce the issue and couldn't. DOCP wasn't enabled the whole time. Then it suddenly became unstable and died a week later, no longer able to boot an OS properly. The CPU behaved the same in other PCs. I never got a WHEA error with it.
I tried a third 5800X with the latest BIOS. It crashed within 24 hours, and once more the next day, I believe. After updating the chipset driver and setting PBO from Auto to Disabled (the PBO change should have made no difference), it has now been on for over 7 days straight with no issue. It might not be solved yet, since it can be so random, but maybe the chipset driver made the difference.
Either way, I've had 3 different CPUs and they all behaved differently.
I have a temporary fix: going into the BIOS and setting the CPU voltage override to 1.25 V. I also disabled PBO.
But is there a permanent solution from the manufacturer?
I live in Chile, and it is difficult to do an RMA here.
Ok, same problem here (on a random APIC ID and always on smss.exe). My system:
CPU: AMD Ryzen 9 5950X.
MB: ASUS ROG Strix B550-F Gaming (Wi-Fi), with the latest BIOS (August 10, 2021).
PSU: Corsair RM850x (Gold 80+, 850W, 2021).
GPU: NVIDIA GT 710 (2 GB) (waiting for my RTX, this is just a cheap toy to use Windows).
RAM: 32 GB (4 x 8 GB) Corsair 3000 MHz CL15 (XMP 2.0, enabled in the BIOS. This RAM is from my previous PC, where it NEVER gave me any problems).
Cooler: Arctic Liquid Freezer 360.
OS: Windows 10 64-bit.
The GPU is low-end, and the system reboot happens when I run a process that uses only the CPU (all 16 cores) while the GPU is idle. So this isn't a GPU-related issue (at least for me).
The system has all the latest drivers and BIOS updates installed.
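Since the error is reported on a random APIC ID, it may help to tally the IDs across events to see whether they really are spread out. Below is a minimal Python sketch, assuming the WHEA-Logger event 18 descriptions have been copied from Event Viewer into a text string and that each one contains a line of the form "Processor APIC ID: N". The function name `count_apic_ids` and the sample text are made up for illustration.

```python
import re
from collections import Counter

# Assumed line format inside each event description: "Processor APIC ID: <number>"
APIC_ID_PATTERN = re.compile(r"Processor APIC ID:\s*(\d+)")


def count_apic_ids(event_text: str) -> Counter:
    """Count how often each APIC ID appears in pasted event descriptions."""
    return Counter(int(apic_id) for apic_id in APIC_ID_PATTERN.findall(event_text))


if __name__ == "__main__":
    # Made-up sample text, not real log output.
    sample = """
    Error Type: Cache Hierarchy Error
    Processor APIC ID: 4
    Error Type: Bus/Interconnect Error
    Processor APIC ID: 21
    Error Type: Cache Hierarchy Error
    Processor APIC ID: 4
    """
    for apic_id, hits in count_apic_ids(sample).most_common():
        print(f"APIC ID {apic_id}: {hits} event(s)")
```

If one ID clearly dominates, that may point to a single weak core; an even spread fits the "random APIC ID" picture described above.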
In addition to XMP, I have enabled a BIOS setting that keeps all cores clocked a bit higher when they are all in use: the frequency stays around 4.4 - 4.5 GHz on all 16 cores, whereas without this option it drops to 3.7 GHz. With this option enabled, at 4.5 GHz on all cores, the maximum temperature reached is around 82°C.
I have tried CoreCycler 0.8.2 and when testing with Prime95, I always get the error: "FATAL ERROR: rounding was 0.5, expected less than 0.4", which indicates CPU instability.
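For long test runs it can be easier to scan the saved log for that message than to read it by eye. Here is a rough Python sketch, assuming the log is available as plain text and that the error lines look like the one quoted above. The function name `find_rounding_errors` and the sample log lines are hypothetical; a real log layout may differ.

```python
import re
from typing import List, Tuple

# Matches the message quoted above; case-insensitive in case capitalisation differs.
ROUNDING_PATTERN = re.compile(
    r"FATAL ERROR: rounding was (\d+(?:\.\d+)?), expected less than (\d+(?:\.\d+)?)",
    re.IGNORECASE,
)


def find_rounding_errors(log_text: str) -> List[Tuple[float, float]]:
    """Return (observed, limit) pairs for every rounding error found in log_text."""
    return [
        (float(observed), float(limit))
        for observed, limit in ROUNDING_PATTERN.findall(log_text)
    ]


if __name__ == "__main__":
    # Made-up sample lines for illustration only.
    sample_log = (
        "test started\n"
        "FATAL ERROR: Rounding was 0.5, expected less than 0.4\n"
        "test stopped\n"
    )
    for observed, limit in find_rounding_errors(sample_log):
        print(f"rounding error: observed {observed}, limit {limit}")
```

An empty result only means the message was not found in that text; it is not proof that the CPU is stable.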
I disabled the option mentioned earlier in the BIOS, and so far Prime95 runs without any rounding errors. I had read that this problem is caused by the CPU voltage being a little too low, so I believe it's a motherboard-related issue: the board doesn't have the correct settings (core voltage, for example). The latest BIOS reports: "Improved System Stability", which perhaps means a change in those settings, but they are probably not perfect yet.
However, I am disappointed with AMD. It's easy to say "my CPU can hit 4.9 GHz" when this can only be done on a single core. It's normal for the frequency to be lower when all cores are in use, but 3.7 GHz is way too low (my previous CPU, not AMD, can keep all cores at 4.3 GHz WITHOUT a single instability event).
It would be better if AMD gave us the best settings to maximize performance while maintaining stability (core voltage, for example). Better still, since I believe MB makers are struggling to find settings that keep the system stable (RMAs are countless), AMD should work with them to fix this permanently. IMHO the company will quickly lose trust and reputation if things remain as they are, and after the notable effort made to create the new Ryzen architecture that would be a huge shame.
OK, it looks like mine (5950X) has become stable (so far). Here is what I did:
1) Go to BIOS and 'Load Optimized Defaults' (note that CPB is Enabled by default and I leave it enabled);
2) PSS Support -> Disabled;
3) Global C-state Control -> Disabled;
4) Power Supply Idle Control -> Typical Current Idle;
5) Power Down Enabled -> Disabled (for DRAM);
6) Gear Down Mode -> Disabled (for DRAM);
7) Set XMP to Enabled for the DRAM (because my DRAM supports XMP 2.0).
I have been testing the system for 5 days (24h/day of calculations, on all 16 cores) and so far I haven't had a WHEA 18 error. I hope this helps other people.
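For anyone who wants to check their own board against the list above, here is a small Python sketch that holds those settings as data and reports which ones still differ. It assumes you note your current BIOS values by hand; nothing here reads the BIOS. The names `TARGET_SETTINGS` and `settings_to_change` and the example values are made up for illustration, and menu names on other boards may differ.

```python
from typing import Dict

# Settings from the list above, transcribed as data.
# Names follow the post; other BIOS versions may label them differently.
TARGET_SETTINGS: Dict[str, str] = {
    "CPB": "Enabled",
    "PSS Support": "Disabled",
    "Global C-state Control": "Disabled",
    "Power Supply Idle Control": "Typical Current Idle",
    "Power Down Enabled": "Disabled",
    "Gear Down Mode": "Disabled",
    "XMP": "Enabled",
}


def settings_to_change(current: Dict[str, str]) -> Dict[str, str]:
    """Return target values for settings that are missing or differ in `current`."""
    return {
        name: wanted
        for name, wanted in TARGET_SETTINGS.items()
        if current.get(name, "").strip().lower() != wanted.lower()
    }


if __name__ == "__main__":
    # Hypothetical values noted by hand from a BIOS screen.
    my_bios = {
        "CPB": "Enabled",
        "PSS Support": "Enabled",
        "Global C-state Control": "Auto",
        "XMP": "Enabled",
    }
    for name, wanted in settings_to_change(my_bios).items():
        print(f"{name}: set to {wanted}")
```

Comparison ignores case and surrounding spaces, so "disabled" and "Disabled " count as the same value.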
My 2 cents,