Server Gurus Discussions

soulreaperli
Journeyman III

Need help with hardware errors on server

Hello everyone,

I am seeing hardware errors on my server after installing Linux (Ubuntu or CentOS). The hardware configuration is as follows:

- CPU: EPYC 7R32

- Motherboard: MZ72-HB0

- Memory: DDR4 32GB ECC 2666

- SSD: Samsung PM9A1 2TB M.2

- Cooling: AMD SP3 custom cooling

- Power Supply: Haiyun 1000W

The errors are intermittent. Here are some of the messages that appear in the system logs:

```
Message from syslogd@server3 at May 31 10:49:44 ... kernel:[Hardware Error]: Corrected error, no action required.
Message from syslogd@server3 at May 31 10:49:44 ... kernel:[Hardware Error]: CPU:0 (17:31:0) MC27_STATUS[Over|CE|MiscV|-|-|-|-|SyndV|-]: 0xd82000000002080b
Message from syslogd@server3 at May 31 10:49:44 ... kernel:[Hardware Error]: Power, Interrupts, etc. Extended Error Code: 2
Message from syslogd@server3 at May 31 10:49:44 ... kernel:[Hardware Error]: Power, Interrupts, etc. Error: Error on GMI link.
Message from syslogd@server3 at May 31 10:49:44 ... kernel:[Hardware Error]: cache level: L3/GEN, mem/io: IO, mem-tx: GEN, part-proc: SRC (no timeout) ...
```
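For anyone reading the log, here is a minimal sketch of how the `MC27_STATUS` value above can be split into the fields the kernel prints. The bit positions and the label tables are my assumptions, based on the common x86 machine-check status layout and the strings the Linux EDAC decoder appears to use; they should be checked against AMD's processor documentation before being relied on. The function names `decode_mca_status` and `decode_bus_error` are hypothetical helpers, not part of any tool.

```python
# Assumed bit layout of a 64-bit MCi_STATUS register (verify against AMD docs).
def decode_mca_status(status: int) -> dict:
    """Split a raw status value into flags and error-code fields."""
    def bit(n: int) -> bool:
        return bool((status >> n) & 1)

    return {
        "valid": bit(63),            # register holds a valid error
        "overflow": bit(62),         # "Over": another error arrived before this one was cleared
        "uncorrected": bit(61),      # False corresponds to "CE" in the log
        "misc_valid": bit(59),       # "MiscV"
        "addr_valid": bit(58),       # "AddrV"
        "context_corrupt": bit(57),  # "PCC"
        "syndrome_valid": bit(53),   # "SyndV" (assumed AMD-specific position)
        "extended_error_code": (status >> 16) & 0x3F,
        "error_code": status & 0xFFFF,
    }


# Label tables assumed to mirror the abbreviations seen in the kernel message.
CACHE_LEVEL = ["RESV", "L1", "L2", "L3/GEN"]
MEM_IO = ["MEM", "RESV", "IO", "GEN"]
PARTICIPATION = ["SRC", "RES", "OBS", "GEN"]
MEM_TX = ["GEN", "RD", "WR", "DRD", "DWR", "IRD", "PRF", "EV", "SNP"]


def decode_bus_error(error_code: int) -> dict:
    """Decode the low 16 bits, assuming the bus-error format 0000 1PPT RRRR IILL."""
    if (error_code & 0xF800) != 0x0800:
        raise ValueError("not a bus-error code under the assumed format")
    rrrr = (error_code >> 4) & 0xF
    return {
        "cache_level": CACHE_LEVEL[error_code & 0x3],
        "mem_io": MEM_IO[(error_code >> 2) & 0x3],
        "mem_tx": MEM_TX[rrrr] if rrrr < len(MEM_TX) else f"RESV({rrrr})",
        "part_proc": PARTICIPATION[(error_code >> 9) & 0x3],
        "timed_out": bool((error_code >> 8) & 1),
    }


if __name__ == "__main__":
    fields = decode_mca_status(0xD82000000002080B)
    print(fields)
    print(decode_bus_error(fields["error_code"]))
```

Under these assumptions the decoded flags line up with the `Over|CE|MiscV|...|SyndV` string, the extended error code of 2, and the `L3/GEN`, `IO`, `GEN`, `SRC (no timeout)` fields in the messages above.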

I would appreciate any insight into what might be causing these errors and how to resolve them, as I am unsure what the problem is or what steps to take. Thank you in advance for your help.

Best regards
