I am seeing instability on a system running Ubuntu 18.04 on an EPYC 7351P.
The motherboard is a SuperMicro H11SSL.
I have installed the latest GA release of Ubuntu 18.04.
Even though many boards say that EPYC is not vulnerable to the same defects as Ryzen, I am experiencing instability.
The server, when left idle, logs strange messages.
Yesterday, the computer gradually crashed after I ran operations with apt and apt-get.
Let me ask again: are EPYC processors vulnerable to the weakness that seems to affect Ryzen? Can someone speak to this case?
Thanks a lot, hj
================Update 2019-09===================
Let me put this in context.
First, I thank all the commenters who took the time to read this and ask follow-up questions. It is much appreciated.
Second, the problem is currently solved, or at least circumvented, in that it no longer reproduces.
In my analysis, I first spotted this kernel message in the log:
Day 1 00:50:27 hostname1 kernel: [2099837.618882] general protection fault: 0000 [#1] SMP NOPTI
You can find it in the attached paste1.log.
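For anyone who wants to check their own logs for the same symptom, here is a minimal Python sketch that scans syslog-style lines for "general protection fault" entries. It assumes the line layout shown above; the function name find_gpf is made up for illustration, and other kernels or log formats may need a different pattern.

```python
import re

# Pattern based on the single log line quoted in this post (an assumption:
# other kernels or syslog configurations may format the line differently).
GPF_PATTERN = re.compile(
    r"kernel: \[\s*(\d+\.\d+)\] general protection fault: \w+ \[#(\d+)\]"
)


def find_gpf(lines):
    """Return (seconds_since_boot, oops_counter) for each matching line."""
    hits = []
    for line in lines:
        match = GPF_PATTERN.search(line)
        if match:
            hits.append((float(match.group(1)), int(match.group(2))))
    return hits


sample = [
    "Day 1 00:50:27 hostname1 kernel: [2099837.618882] "
    "general protection fault: 0000 [#1] SMP NOPTI",
    "Day 1 00:50:28 hostname1 systemd[1]: unrelated message",
]
hits = find_gpf(sample)
```

To scan a real file, pass an open file object instead of the sample list, for example find_gpf(open("paste1.log")).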
Then, pursuing the analysis, I had regular crashes related to running fstrim on an SSD, which is a known issue; see for instance
systemd - nvme fstrim causing crash on linux, disabling with systemctl doesn't help - Unix & Linux S...
The fstrim run would block the kernel in an uninterruptible state. See the attached fstrim log file.
I worked around the problem by disabling the periodic fstrim run. I know this is not ideal, but investigating further was too difficult for me.
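I did not record exactly how the periodic trim was scheduled, so the following is only a sketch: it assumes the trim is driven by the systemd unit fstrim.timer and asks systemctl whether that unit is enabled. The function name fstrim_timer_state is made up for illustration. It only reads the state and changes nothing.

```python
import shutil
import subprocess


def fstrim_timer_state(unit="fstrim.timer"):
    """Return the output of `systemctl is-enabled <unit>`, or "unknown".

    Assumption: the periodic trim is scheduled through a systemd timer.
    If it is run from cron instead, this check says nothing useful.
    """
    if shutil.which("systemctl") is None:
        return "unknown"
    try:
        result = subprocess.run(
            ["systemctl", "is-enabled", unit],
            stdout=subprocess.PIPE,
            stderr=subprocess.DEVNULL,
            universal_newlines=True,
            timeout=10,
        )
    except (OSError, subprocess.TimeoutExpired):
        return "unknown"
    # Typical answers are "enabled", "disabled" or "masked".
    return result.stdout.strip() or "unknown"


state = fstrim_timer_state()
print("fstrim.timer state:", state)
```

If the state is "enabled", the timer can be switched off with "systemctl disable --now fstrim.timer" as root, although the thread linked above reports a case where that alone did not help.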
Since the problem has not reproduced since then, I guess that was the culprit.
So this has nothing to do with AMD or Ryzen, as I could see that the issue also happens on Intel processors.
I have not recently checked whether the issue is resolved in newer kernels.
To conclude, I believe we can close the thread.
Thank you again for your suggestions and comments.
hjohanns