Have you tried reducing the memory speed, say to DDR4-2133, to see if that is more stable?
Try an even higher CL if need be.
Disabling just C6 (not the global C-states completely) seems to have "solved" it for me. At least I haven't seen this issue in two weeks. This has nothing to do with RAM stability it seems, just a glitchy platform at the moment.
Another crash. I'm still in the RMA process (for about 3 months already, I guess). About a month ago AMD support promised the CPU replacement would be ready in early November and that they would contact me. I haven't heard anything from them since then.
nov 11 09:14:49 ryzen kernel: mce: [Hardware Error]: CPU 10: Machine Check: 0 Bank 0: baa0000000060185
nov 11 09:14:49 ryzen kernel: mce: [Hardware Error]: TSC 0 MISC d012000101000000 SYND 2d030000 IPID b000000000
nov 11 09:14:49 ryzen kernel: mce: [Hardware Error]: PROCESSOR 2:800f11 TIME 1510391667 SOCKET 0 APIC a microcode 8001129
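For anyone who wants to compare their own log against the one above, here is a minimal Python sketch that pulls the status word out of the first `mce:` line and decodes only the top architectural flag bits (63 down to 57) and the low 16-bit error code. The function names `decode_status` and `parse_mce_line` are made up for this example, and the vendor-specific bits are deliberately left alone, so treat it as a rough reading aid rather than a full decoder.

```python
import re

# Architectural MCi_STATUS flag bits (bit position -> short name).
# Vendor-specific bits are intentionally not decoded in this sketch.
STATUS_FLAGS = {
    63: "VAL",    # register holds a valid error
    62: "OVER",   # an earlier error was overwritten
    61: "UC",     # uncorrected error
    60: "EN",     # error reporting was enabled
    59: "MISCV",  # MISC register contents are valid
    58: "ADDRV",  # ADDR register contents are valid
    57: "PCC",    # processor context corrupt
}

# Matches the "CPU n: Machine Check: x Bank b: <hex status>" part of the log line.
LINE_RE = re.compile(r"CPU (\d+): Machine Check: \d+ Bank (\d+): ([0-9a-fA-F]+)")


def decode_status(status):
    """Split a 64-bit status word into named flags and the low 16-bit error code."""
    flags = {name: bool((status >> bit) & 1) for bit, name in STATUS_FLAGS.items()}
    return {"flags": flags, "error_code": status & 0xFFFF}


def parse_mce_line(line):
    """Return cpu, bank and decoded status for one mce log line, or None if no match."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    cpu, bank, status_hex = match.groups()
    result = decode_status(int(status_hex, 16))
    result.update(cpu=int(cpu), bank=int(bank))
    return result


if __name__ == "__main__":
    sample = "mce: [Hardware Error]: CPU 10: Machine Check: 0 Bank 0: baa0000000060185"
    print(parse_mce_line(sample))
```

For the status word in the log above, this reports VAL, UC, EN, MISCV and PCC set, OVER and ADDRV clear, and an error code of 0x0185, i.e. an uncorrected error with no valid address recorded.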
Has anyone tried processor.max_cstate=5 at boot to see if it is a suitable workaround? I have an x370pro, which only has a global C-state disable, no individual disable.
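If you do try the boot parameter, a small Python sketch like the one below can at least confirm that it reached the kernel command line and show which idle states the kernel exposes for cpu0 through the standard cpuidle sysfs directory. The helper names `cmdline_param` and `idle_states` are invented for this example. Note the assumption: whether the kernel's state numbering lines up with the "C6" that some BIOSes let you disable is not established anywhere in this thread, so this only shows what the kernel sees.

```python
from pathlib import Path


def cmdline_param(cmdline, name):
    """Return the value of name=value from a kernel command line, or None."""
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if key == name and sep:
            return value
    return None


def idle_states(cpuidle_dir):
    """List (index, name, disabled) for each stateN directory under cpuidle_dir."""
    states = []
    for state in Path(cpuidle_dir).glob("state[0-9]*"):
        index = int(state.name[len("state"):])
        name = (state / "name").read_text().strip()
        disable_file = state / "disable"
        disabled = disable_file.exists() and disable_file.read_text().strip() == "1"
        states.append((index, name, disabled))
    return sorted(states)


if __name__ == "__main__":
    cmdline_file = Path("/proc/cmdline")
    if cmdline_file.exists():
        value = cmdline_param(cmdline_file.read_text(), "processor.max_cstate")
        print("processor.max_cstate =", value)
    cpuidle = Path("/sys/devices/system/cpu/cpu0/cpuidle")
    if cpuidle.is_dir():
        for index, name, disabled in idle_states(cpuidle):
            print(index, name, "disabled" if disabled else "enabled")
```

Running it before and after adding the parameter gives a quick before/after comparison to post here, which is more useful than "seems stable so far".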
Well, I have a long history of system building (especially with AMD), and really the parts should work as long as the RAM/CPU is on the mobo compatibility list, the PSU has enough watts, etc.
There should be no forced reboots, freezing, etc., and all this crap such as "try this little timing change" shouldn't be necessary with good parts unless you are OCing or in some other non-default situation.
I have the problem too and have no need for a shop. The only difference is I have no spare Ryzen parts and would have to buy different brands of those (where possible) to find out which one the problem is coming from or try to RMA each part until it works (and there's no guarantee that will work).
It's a shame that this seems to have no known cause, or at least no complete one, so people are still suggesting things. However, I have yet to go through all 11 pages, and it takes an unknown amount of time to test each change.
Some of this reply is to the 1st and 2nd pages, but I also want to voice my opinion on it. It's kind of a crappy problem when you're the one with it.
... read back one page and see people have already tried replacing parts, RMAing, etc. so that's fun that it wasn't "fixed" except for what seems to be power management lol. Total garbage. Watch the Ryzen 1st gen motherboards get no more updates too, just kidding 😃
Getting random reboots and freezes, sometimes a couple or a few a day, while on low load. I've only had the PC for 1 week. Can't find any solutions on the web or confirmed cases of resolution...
OS: Arch Linux
Kernel: x86_64 Linux 4.12.8-2-ARCH
CPU: AMD Ryzen 7 1700 Eight-Core @ 16x 3GHz [30.0°C]
GPU: AMD/ATI Curacao PRO [Radeon R7 370 / R9 270/370 OEM]
RAM: 2 x 16GB G.Skill Flare X "AMD compatible"
MOBO: MSI Mortar Arctic mATX
Let me highlight something for you 😃
This isn't a compatibility problem when it happens with memory sticks recognized as compatible with the mobo. That means the mobo manufacturer documentation is wrong or you are.
Friend, it might be anything. Such statements don't help.
I have RMAed it and already got a replacement, with much headache in communication. It seems to work now. So everyone who sees that error log: RMA it.