xblack
Journeyman III

Ryzen 7 2700X hardware errors on Linux

Hi,

I have a Ryzen 7 2700X on a Gigabyte AORUS X470 Gaming Ultra with CMK32GX4M2B3000C15 RAM in banks 1 and 2.

Every now and then I get the error below, along with sporadic full system freezes that require a manual reset.

Jul 27 09:27:09 marcopc kernel: mce: [Hardware Error]: Machine check events logged
Jul 27 09:27:09 marcopc kernel: mce: [Hardware Error]: CPU 3: Machine Check: 0 Bank 3: 9820000000000150
Jul 27 09:27:09 marcopc kernel: mce: [Hardware Error]: TSC 0 MISC d012000100000000 SYND 2a000503 IPID 300b000000000
Jul 27 09:27:09 marcopc kernel: mce: [Hardware Error]: PROCESSOR 2:800f82 TIME 1532676424 SOCKET 0 APIC 3 microcode 8008206

The errors show up mostly during cross-compilation or when running the ryzen-test utility (https://github.com/suaefar/ryzen-test):

[  941.702984] mce: [Hardware Error]: Machine check events logged
[  941.702988] [Hardware Error]: Corrected error, no action required.
[  941.702994] [Hardware Error]: CPU:3 (17:8:2) MC3_STATUS[-|CE|MiscV|-|-|-|-|SyndV|-]: 0x9820000000000150
[  941.702999] [Hardware Error]: IPID: 0x000300b000000000, Syndrome: 0x000000002a000503
[  941.703002] [Hardware Error]: Decode Unit Extended Error Code: 0
[  941.703004] [Hardware Error]: Decode Unit Error: uop cache tag parity error.
[  941.703006] [Hardware Error]: cache level: RESV, tx: INSN, mem-tx: IRD
[ 1255.717410] mce: [Hardware Error]: Machine check events logged
[ 1255.717413] [Hardware Error]: Corrected error, no action required.
[ 1255.717418] [Hardware Error]: CPU:5 (17:8:2) MC3_STATUS[-|CE|MiscV|-|-|-|-|SyndV|-]: 0x9820000000000150
[ 1255.717422] [Hardware Error]: IPID: 0x000300b000000000, Syndrome: 0x000000002a000503
[ 1255.717424] [Hardware Error]: Decode Unit Extended Error Code: 0
[ 1255.717425] [Hardware Error]: Decode Unit Error: uop cache tag parity error.
[ 1255.717427] [Hardware Error]: cache level: RESV, tx: INSN, mem-tx: IRD
[ 1255.717430] mce: [Hardware Error]: Machine check events logged
[ 1255.717430] [Hardware Error]: Corrected error, no action required.
[ 1255.717432] [Hardware Error]: CPU:13 (17:8:2) MC3_STATUS[Over|CE|MiscV|-|-|-|-|SyndV|-]: 0xd820000000000150
[ 1255.717434] [Hardware Error]: IPID: 0x000300b000000000, Syndrome: 0x000000002a000503
[ 1255.717436] [Hardware Error]: Decode Unit Extended Error Code: 0
[ 1255.717437] [Hardware Error]: Decode Unit Error: uop cache tag parity error.
[ 1255.717438] [Hardware Error]: cache level: RESV, tx: INSN, mem-tx: IRD

Most of the time the error causes no particular problem and is corrected automatically, as the logs show, but sometimes one of the processes gets tainted or the PC freezes completely.
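
For anyone who wants to read the raw MC3_STATUS values in the logs above by hand, here is a minimal Python sketch. It assumes the bit positions used by the Linux kernel's AMD MCE decoder (Val=63, Over=62, UC=61, En=60, MiscV=59, AddrV=58, PCC=57, SyndV=53, extended error code in bits 21:16) and covers only that subset of bits. The function name `decode_status` is made up for illustration.

```python
# Subset of MCA status bits, as assumed from the Linux kernel's AMD decoder.
STATUS_FLAGS = {
    63: "Val",
    62: "Over",
    61: "UC",
    60: "En",
    59: "MiscV",
    58: "AddrV",
    57: "PCC",
    53: "SyndV",
}

# Lookup tables for the low 16-bit error code in "memory error" format
# (0000 0001 RRRR TTLL).
LL = ["RESV", "L1", "L2", "L3/GEN"]
TT = ["INSN", "DATA", "GEN", "RESV"]
RRRR = ["GEN", "RD", "WR", "DRD", "DWR", "IRD", "PRF", "EV", "SNP"]


def decode_status(status: int) -> dict:
    """Split a 64-bit MCi_STATUS value into the fields printed in dmesg."""
    flags = [name for bit, name in STATUS_FLAGS.items() if (status >> bit) & 1]
    code = status & 0xFFFF
    result = {
        "flags": flags,
        "uncorrected": "UC" in flags,
        "ext_error_code": (status >> 16) & 0x3F,
    }
    # Cache level / transaction fields only apply to memory-format codes.
    if (code & 0xFF00) == 0x0100:
        rrrr = (code >> 4) & 0xF
        result["cache_level"] = LL[code & 0x3]
        result["tx"] = TT[(code >> 2) & 0x3]
        result["mem_tx"] = RRRR[rrrr] if rrrr < len(RRRR) else "RESV"
    return result


# The two values that appear in the logs above.
for value in (0x9820000000000150, 0xD820000000000150):
    print(hex(value), decode_status(value))
```

The only difference between the two values is the Over bit, which indicates that a further error arrived before the previous one was read out of the bank.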

The system currently runs Arch Linux with kernel 4.17.9-1 and CPU microcode patch_level=0x08008206.

Is anyone else having this issue?

Did replacing the CPU solve it?

I asked AMD, but after my first email with the information above all I got was an RMA approval, with no further information.

Thanks

Marco

7 Replies
vkfu
Journeyman III

I see the same errors on my Ryzen 5 2600X in an ASRock X470 Taichi. Every few hours, starting about 10 minutes after boot, I see these errors on a mostly idle machine. They first appeared a couple of days ago, but today I had frequent system hangs along with the errors. I swapped in a Ryzen 3 1200 several hours ago and have seen no errors since. I am using Ubuntu 18.04 with a 4.15.0-30 kernel.

blsqr
Journeyman III

I am also having this issue (Ubuntu 18.04, kernel 4.15.0-34). Did you manage to find a solution?

In my case, however, the log entries do not coincide with the freezes; they are often many hours apart. Is it the same for you?

evanrinehart
Journeyman III

I just purchased a Ryzen 7 2700X and am having similar issues: freeze-ups and spurious hardware errors. Specs:

MSI B450 motherboard

Linux 4.19-rc5 configured with AMD support

Latest kernel firmware package (20180913)

Seemingly at random, I get hardware errors such as:

[ 6558.679190] [Hardware Error]: Corrected error, no action required.
[ 6558.679192] [Hardware Error]: CPU:13 (17:8:2) MC3_STATUS[Over|CE|MiscV|-|-|-|-|SyndV|-]: 0xd820000000000150
[ 6558.679194] [Hardware Error]: IPID: 0x000300b000000000, Syndrome: 0x000000002a000503
[ 6558.679196] [Hardware Error]: Decode Unit Extended Error Code: 0
[ 6558.679197] [Hardware Error]: Decode Unit Error: uop cache tag parity error.
[ 6558.679198] [Hardware Error]: cache level: RESV, tx: INSN, mem-tx: IRD

And after one night I found it frozen and in need of a reset.

Before I updated the kernel, the errors and freeze-ups occurred much more frequently. I could not get through an entire kernel build without a system freeze. Now it seems to happen about once a day.

Latest BIOS update, no overclocking, no special power mode settings in the BIOS, yet.

registrirai
Journeyman III

I have the very same problem!

/var/log/syslog:

[ 316.200910] mce: [Hardware Error]: Machine check events logged
[ 316.200913] [Hardware Error]: Corrected error, no action required.
[ 316.200918] [Hardware Error]: CPU:15 (17:8:2) MC3_STATUS[-|CE|MiscV|-|-|-|-|SyndV|-]: 0x9820000000000150
[ 316.200922] [Hardware Error]: IPID: 0x000300b000000000, Syndrome: 0x000000002a000503
[ 316.200925] [Hardware Error]: Decode Unit Extended Error Code: 0
[ 316.200926] [Hardware Error]: Decode Unit Error: uop cache tag parity error.
[ 316.200928] [Hardware Error]: cache level: RESV, tx: INSN, mem-tx: IRD

OS:

Linux 4.15.0-46-generic #49-Ubuntu SMP UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

/proc/cpuinfo:

processor : 0
vendor_id : AuthenticAMD
cpu family : 23
model : 8
model name : AMD Ryzen 7 2700X Eight-Core Processor
stepping : 2
microcode : 0x8008202
cpu MHz : 2186.927
cache size : 512 KB
physical id : 0
siblings : 16
core id : 0
cpu cores : 8
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb hw_pstate sme ssbd vmmcall fsgsbase bmi1 avx2 smep bmi2 rdseed adx smap clflushopt sha_ni xsaveopt xsavec xgetbv1 xsaves clzero irperf xsaveerptr arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif overflow_recov succor smca
bugs : sysret_ss_attrs null_seg spectre_v1 spectre_v2 spec_store_bypass
bogomips : 7385.13
TLB size : 2560 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 43 bits physical, 48 bits virtual
power management: ts ttp tm hwpstate cpb eff_freq_ro [13] [14]
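
Two posts in this thread report different microcode revisions on the same CPU model (0x08008206 in the first post, 0x8008202 here), so it may be worth checking which one a machine is running. A minimal sketch, assuming the usual `microcode : 0x...` line format of /proc/cpuinfo shown above; the function name `microcode_revisions` is hypothetical, and nothing in this thread establishes that either revision fixes the errors.

```python
import re

# Matches lines such as "microcode : 0x8008202" (tab or spaces before the colon).
MICROCODE_LINE = re.compile(r"^microcode\s*:\s*(0x[0-9a-fA-F]+)", re.MULTILINE)


def microcode_revisions(cpuinfo_text: str) -> set:
    """Return the set of microcode revisions found in /proc/cpuinfo text."""
    return {int(match, 16) for match in MICROCODE_LINE.findall(cpuinfo_text)}


# Example with an excerpt from the dump above; on a real machine, pass
# open("/proc/cpuinfo").read() instead.
sample = "model name : AMD Ryzen 7 2700X Eight-Core Processor\nmicrocode : 0x8008202\n"
for revision in sorted(microcode_revisions(sample)):
    print(f"microcode revision: {revision:#010x}")
```

Normally all logical CPUs report the same revision, so the returned set should contain a single value.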

Has anyone found a fix for this?

tuxuser
Journeyman III

Same problem:

HP Elitebook 745 G5

CPU   AMD Ryzen 5 PRO 2500U w/ Radeon Vega Mobile Gfx

KERNEL   5.1.0-rc6-1

Every 5 minutes:

kernel: [ 1873.781218] mce: [Hardware Error]: Machine check events logged
kernel: [ 1873.781223] [Hardware Error]: Corrected error, no action required.
kernel: [ 1873.781235] [Hardware Error]: CPU:3 (17:11:0) MC1_STATUS[Over|CE|MiscV|-|-|-|SyndV|-|-|-]: 0xd8200000000a0151
kernel: [ 1873.781243] [Hardware Error]: IPID: 0x000100b000000000, Syndrome: 0x000000004a000000
kernel: [ 1873.781250] [Hardware Error]: Instruction Fetch Unit Ext. Error Code: 10, L1 BTB Multi-Match Error.
kernel: [ 1873.781257] [Hardware Error]: cache level: L1, tx: INSN, mem-tx: IRD
kernel: [ 1873.781265] mce: [Hardware Error]: Machine check events logged
kernel: [ 1873.781266] [Hardware Error]: Corrected error, no action required.
kernel: [ 1873.781272] [Hardware Error]: CPU:6 (17:11:0) MC1_STATUS[Over|CE|MiscV|-|-|-|SyndV|-|-|-]: 0xd8200000000a0151
kernel: [ 1873.781279] [Hardware Error]: IPID: 0x000100b000000000, Syndrome: 0x000000004a000000
kernel: [ 1873.781285] [Hardware Error]: Instruction Fetch Unit Ext. Error Code: 10, L1 BTB Multi-Match Error.
kernel: [ 1873.781290] [Hardware Error]: cache level: L1, tx: INSN, mem-tx: IRD
kernel: [ 1873.781393] [Hardware Error]: Corrected error, no action required.
kernel: [ 1873.781408] [Hardware Error]: CPU:5 (17:11:0) MC1_STATUS[Over|CE|MiscV|-|-|-|SyndV|-|-|-]: 0xd8200000000a0151
kernel: [ 1873.781420] [Hardware Error]: IPID: 0x000100b000000000, Syndrome: 0x000000004a000000
kernel: [ 1873.781428] [Hardware Error]: Instruction Fetch Unit Ext. Error Code: 10, L1 BTB Multi-Match Error.
kernel: [ 1873.781436] [Hardware Error]: cache level: L1, tx: INSN, mem-tx: IRD
kernel: [ 1873.781445] [Hardware Error]: Corrected error, no action required.
kernel: [ 1873.781453] [Hardware Error]: CPU:0 (17:11:0) MC1_STATUS[Over|CE|MiscV|-|-|-|SyndV|-|-|-]: 0xd8200000000a0151
kernel: [ 1873.781462] [Hardware Error]: IPID: 0x000100b000000000, Syndrome: 0x000000004a000000
kernel: [ 1873.781470] [Hardware Error]: Instruction Fetch Unit Ext. Error Code: 10, L1 BTB Multi-Match Error.
kernel: [ 1873.781477] [Hardware Error]: cache level: L1, tx: INSN, mem-tx: IRD
kernel: [ 1873.781485] [Hardware Error]: Corrected error, no action required.
kernel: [ 1873.781492] [Hardware Error]: CPU:4 (17:11:0) MC1_STATUS[Over|CE|MiscV|-|-|-|SyndV|-|-|-]: 0xd8200000000a0151
kernel: [ 1873.781502] [Hardware Error]: IPID: 0x000100b000000000, Syndrome: 0x000000004a000000
kernel: [ 1873.781509] [Hardware Error]: Instruction Fetch Unit Ext. Error Code: 10, L1 BTB Multi-Match Error.
kernel: [ 1873.781516] [Hardware Error]: cache level: L1, tx: INSN, mem-tx: IRD
kernel: [ 2185.078264] mce_notify_irq: 3 callbacks suppressed
kernel: [ 2185.078267] mce: [Hardware Error]: Machine check events logged
kernel: [ 2185.078271] [Hardware Error]: Corrected error, no action required.
kernel: [ 2185.078289] [Hardware Error]: CPU:3 (17:11:0) MC1_STATUS[Over|CE|MiscV|-|-|-|SyndV|-|-|-]: 0xd8200000000a0151
kernel: [ 2185.078300] [Hardware Error]: IPID: 0x000100b000000000, Syndrome: 0x000000004a000000
kernel: [ 2185.078309] [Hardware Error]: Instruction Fetch Unit Ext. Error Code: 10, L1 BTB Multi-Match Error.
kernel: [ 2185.078317] [Hardware Error]: cache level: L1, tx: INSN, mem-tx: IRD
kernel: [ 2185.078325] mce: [Hardware Error]: Machine check events logged
kernel: [ 2185.078326] [Hardware Error]: Corrected error, no action required.
kernel: [ 2185.078333] [Hardware Error]: CPU:0 (17:11:0) MC1_STATUS[Over|CE|MiscV|-|-|-|SyndV|-|-|-]: 0xd8200000000a0151
kernel: [ 2185.078343] [Hardware Error]: IPID: 0x000100b000000000, Syndrome: 0x000000004a000000
kernel: [ 2185.078350] [Hardware Error]: Instruction Fetch Unit Ext. Error Code: 10, L1 BTB Multi-Match Error.
kernel: [ 2185.078357] [Hardware Error]: cache level: L1, tx: INSN, mem-tx: IRD
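
To check the "every 5 minutes" pattern without counting by eye, the decoded lines can be tallied per CPU and grouped into bursts. This is a rough sketch that assumes the dmesg-style line format shown above; the name `summarize` and the one-second burst gap are arbitrary choices for illustration.

```python
import re
from collections import Counter

# Captures the timestamp, CPU number and bank from lines like
# "[ 1873.781235] [Hardware Error]: CPU:3 (17:11:0) MC1_STATUS[...]: 0x..."
STATUS_LINE = re.compile(
    r"\[\s*(\d+\.\d+)\]\s+\[Hardware Error\]: CPU:(\d+) .*?MC(\d+)_STATUS"
)


def summarize(log_text: str, gap: float = 1.0):
    """Count errors per CPU and list the start time of each burst.

    Events less than `gap` seconds after the previous event are treated
    as part of the same burst.
    """
    per_cpu = Counter()
    burst_starts = []
    last_time = None
    for line in log_text.splitlines():
        match = STATUS_LINE.search(line)
        if not match:
            continue
        timestamp = float(match.group(1))
        per_cpu[int(match.group(2))] += 1
        if last_time is None or timestamp - last_time > gap:
            burst_starts.append(timestamp)
        last_time = timestamp
    return per_cpu, burst_starts


# Three lines taken from the log above.
sample = """\
kernel: [ 1873.781235] [Hardware Error]: CPU:3 (17:11:0) MC1_STATUS[Over|CE|MiscV|-|-|-|SyndV|-|-|-]: 0xd8200000000a0151
kernel: [ 1873.781272] [Hardware Error]: CPU:6 (17:11:0) MC1_STATUS[Over|CE|MiscV|-|-|-|SyndV|-|-|-]: 0xd8200000000a0151
kernel: [ 2185.078289] [Hardware Error]: CPU:3 (17:11:0) MC1_STATUS[Over|CE|MiscV|-|-|-|SyndV|-|-|-]: 0xd8200000000a0151
"""
per_cpu, burst_starts = summarize(sample)
print("errors per CPU:", dict(per_cpu))
print("seconds between bursts:",
      [round(b - a, 1) for a, b in zip(burst_starts, burst_starts[1:])])
```

On a live system the input could come from `dmesg` or `journalctl -k` output saved to a file. For the two bursts in the log above, the gap works out to roughly 311 seconds, which fits the five-minute interval.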

bambovc
Journeyman III

Same problem:

HP Elitebook 735 G5

CPU   AMD Ryzen 5 PRO 2500U w/ Radeon Vega Mobile Gfx

The main part of the error report:

Instruction Fetch Unit Extended Error Code: 10
Instruction Fetch Unit Error: L1 BTB multi-match error
cache level: L1, tx: INSN, mem-tx: IRD

Has anyone solved this problem?

bambovc
Journeyman III

I solved my problem!

I updated the BIOS to the latest version, and that fixed it!
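
For anyone wanting to compare their installed BIOS against the vendor's latest release before updating, the version can usually be read from Linux without rebooting. A minimal sketch, assuming the standard sysfs DMI directory /sys/class/dmi/id is present; the helper name `read_dmi` is made up, and fields that are missing or unreadable are skipped.

```python
from pathlib import Path

# Standard Linux sysfs location for DMI/SMBIOS strings.
DMI_DIR = "/sys/class/dmi/id"
FIELDS = ("board_vendor", "board_name", "bios_vendor", "bios_version", "bios_date")


def read_dmi(base: str = DMI_DIR) -> dict:
    """Return the board and BIOS identification strings that can be read."""
    info = {}
    for field in FIELDS:
        try:
            info[field] = (Path(base) / field).read_text().strip()
        except OSError:
            # Field missing or not readable on this system; skip it.
            pass
    return info


for name, value in read_dmi().items():
    print(f"{name}: {value}")
```

The printed `bios_version` and `bios_date` can then be compared against the download page for the board or laptop model.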
