
Threadripper 3970X and Linux hardware error

Question asked by oddskancke on Jan 18, 2020
Latest reply on Jan 25, 2020 by horen

I just bought a Threadripper 3970X together with an ASRock TRX40 Taichi motherboard. This system is my upgrade from a Threadripper 1950X, which has served me very well. I run Fedora 31 on both systems.

While trying to get GPU passthrough working on the new system, I received the following Linux kernel messages (image attached):

feb. 15 22:26:48 oddstr kernel: mce: [Hardware Error]: Machine check events logged
feb. 15 22:26:48 oddstr kernel: [Hardware Error]: Deferred error, no action required.
feb. 15 22:26:48 oddstr kernel: [Hardware Error]: CPU:2 (17:31:0) MC22_STATUS[-|-|MiscV|-|-|-|SyndV|Deferred|-|-]: 0x982010000001010b
feb. 15 22:26:48 oddstr kernel: [Hardware Error]: IPID: 0x0000001813d17000, Syndrome: 0x000000004b00000c
feb. 15 22:26:48 oddstr kernel: [Hardware Error]: Northbridge IO Unit Ext. Error Code: 1, PCIE error.
feb. 15 22:26:48 oddstr kernel: [Hardware Error]: cache level: L3/GEN, tx: GEN, mem-tx: GEN
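The flags on the MC22_STATUS line can be cross-checked against the raw value 0x982010000001010b. Below is a minimal sketch in Python, assuming the bit positions used by the Linux kernel's MCI_STATUS_* definitions and the field names its AMD MCE decoder prints. The function name `decode_status` and the lookup tables are made up for illustration, and the cache level / tx / mem-tx fields are only meaningful when the low 16-bit error code is a memory-type code (0x01xx), as it is here. It only reproduces what the kernel already printed; it does not say whether the hardware is faulty.

```python
# Bit positions assumed from the Linux kernel's MCI_STATUS_* definitions.
FLAG_BITS = {
    "Val": 63, "Over": 62, "UC": 61, "En": 60, "MiscV": 59, "AddrV": 58,
    "PCC": 57, "TCC": 55, "SyndV": 53, "Deferred": 44, "Poison": 43,
}

# Name tables mirroring the strings seen in the kernel log (illustrative).
LL_NAMES = ["RESV", "L1", "L2", "L3/GEN"]   # bits 1:0 of the error code
TT_NAMES = ["INSN", "DATA", "GEN", "RESV"]  # bits 3:2 of the error code
R4_NAMES = ["GEN", "RD", "WR", "DRD", "DWR", "IRD", "PRF", "EV", "SNP"]  # bits 7:4


def decode_status(status):
    """Split a 64-bit MCA status value into flags and error-code fields."""
    flags = [name for name, bit in FLAG_BITS.items() if (status >> bit) & 1]
    error_code = status & 0xFFFF
    r4 = (error_code >> 4) & 0xF
    return {
        "flags": flags,
        "ext_error_code": (status >> 16) & 0x3F,  # extended error code, bits 21:16
        "error_code": error_code,
        "cache_level": LL_NAMES[error_code & 0x3],
        "tx": TT_NAMES[(error_code >> 2) & 0x3],
        "mem_tx": R4_NAMES[r4] if r4 < len(R4_NAMES) else "-",
    }


if __name__ == "__main__":
    for key, value in decode_status(0x982010000001010B).items():
        print(f"{key}: {value}")
```

For the value above this gives the flags Val, En, MiscV, SyndV and Deferred, an extended error code of 1, and L3/GEN, GEN, GEN for the three error-code fields, which agrees with the log lines.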


I have been unable to find information on what this actually means. Does anyone have enough insight into the MCA_XX registers to tell me whether my 3970X is broken? If not, suggestions on where I can ask are also welcome.

Best regards,

Odd Skancke
