

AduZob
Journeyman III

Fans constantly at high speed on new AMD Epyc server

Hi,

I recently purchased the following unit: “System ASUS RS720A-E11-RS12E/10G | 2U / 12-Bay | GPU”, equipped with dual EPYC 7713 CPUs.

I installed Ubuntu 22.04 on the system and it boots fine, but whenever the machine is on, the fans run at high (possibly maximum) speed.

When I check the IPMI hardware monitor in the BIOS, it reads 7020 RPM for four fans, along with the following temperatures:

CPU1 temperature: 43 °C, CPU2 temperature: 36 °C, TR1 temperature: 22 °C.

I tried upgrading to the latest kernel version (6.3.7) but this did not solve the issue.

I installed lm_sensors, and the output of sensors is given below. Many of the sensors report an ALARM. The TSI temperature readings are nonsensical (e.g. +3892314.0°C), and CPUTIN reads +127.5°C, which seems alarmingly high.

Any advice would be greatly appreciated. For the time being, I can’t even figure out whether (i) the readings are correct and there is a hardware problem, or (ii) the readings are nonsense and there is a software issue.

------------------------------------------------------------------------------------------------------------

k10temp-pci-00c3
Adapter: PCI adapter
Tctl:         +44.8°C  
Tccd1:        +37.5°C  
Tccd2:        +37.8°C  
Tccd3:        +38.2°C  
Tccd4:        +38.5°C  
Tccd5:        +35.2°C  
Tccd6:        +38.0°C  
Tccd7:        +38.8°C  
Tccd8:        +38.0°C  

nvme-pci-c100
Adapter: PCI adapter
Composite:    +23.9°C  (low  = -20.1°C, high = +89.8°C)
                       (crit = +94.8°C)

k10temp-pci-00cb
Adapter: PCI adapter
Tctl:         +41.5°C  
Tccd1:        +37.0°C  
Tccd2:        +35.5°C  
Tccd3:        +34.8°C  
Tccd4:        +36.8°C  
Tccd5:        +36.5°C  
Tccd6:        +35.0°C  
Tccd7:        +36.0°C  
Tccd8:        +36.0°C  

nct6793-isa-0290
Adapter: ISA adapter
in0:                     2.04 V  (min =  +0.00 V, max =  +1.74 V)  ALARM
in1:                   160.00 mV (min =  +0.00 V, max =  +0.00 V)  ALARM
in2:                     3.33 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in3:                     3.31 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in4:                   296.00 mV (min =  +0.00 V, max =  +0.00 V)  ALARM
in5:                   120.00 mV (min =  +0.00 V, max =  +0.00 V)  ALARM
in6:                   168.00 mV (min =  +0.00 V, max =  +0.00 V)  ALARM
in7:                     3.33 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in8:                     3.33 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in9:                     0.00 V  (min =  +0.00 V, max =  +0.00 V)
in10:                  160.00 mV (min =  +0.00 V, max =  +0.00 V)  ALARM
in11:                  168.00 mV (min =  +0.00 V, max =  +0.00 V)  ALARM
in12:                  168.00 mV (min =  +0.00 V, max =  +0.00 V)  ALARM
in13:                  160.00 mV (min =  +0.00 V, max =  +0.00 V)  ALARM
in14:                  184.00 mV (min =  +0.00 V, max =  +0.00 V)  ALARM
fan1:                     0 RPM  (min =    0 RPM)
fan2:                     0 RPM  (min =    0 RPM)
SYSTIN:                +107.0°C  (high =  +0.0°C, hyst =  +0.0°C)  ALARM  sensor = thermistor
CPUTIN:                +127.5°C  (high = +80.0°C, hyst = +75.0°C)  ALARM  sensor = CPU diode
AUXTIN0:                +94.0°C    sensor = thermistor
AUXTIN1:               +107.0°C    sensor = thermistor
AUXTIN2:               +105.0°C    sensor = thermistor
AUXTIN3:               +105.0°C    sensor = thermistor
PCH_CHIP_CPU_MAX_TEMP:   +0.0°C  
PCH_CHIP_TEMP:           +0.0°C  
PCH_CPU_TEMP:            +0.0°C  
PCH_MCH_TEMP:            +0.0°C  
TSI2_TEMP:             +3892314.0°C  
TSI3_TEMP:             +3892314.0°C  
TSI4_TEMP:             +3892314.0°C  
TSI5_TEMP:             +3892314.0°C  
TSI6_TEMP:             +3892314.0°C  
TSI7_TEMP:             +3892314.0°C  
intrusion0:            ALARM
intrusion1:            ALARM
beep_enable:           disabled
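
One rough way to approach the "correct readings vs. nonsense readings" question is to run a sanity filter over the sensors output and see which chips the implausible values come from. The Python sketch below is only an illustration: the function name classify_temps and the 1 to 90 °C plausibility window are assumptions made for this example, not vendor limits, and the filter only looks at the first temperature on each line.

```python
import re

# Matches lines such as "Tctl:         +44.8°C" or
# "CPUTIN:   +127.5°C  (high = +80.0°C, ...)". Only the first value is captured.
TEMP_LINE = re.compile(r"^\s*([^:]+):\s*([+-]?\d+(?:\.\d+)?)\s*°C")

# Assumed plausibility window for a lightly loaded, air-cooled server.
# These bounds are illustrative guesses, not limits from any datasheet.
MIN_PLAUSIBLE_C = 1.0
MAX_PLAUSIBLE_C = 90.0


def classify_temps(sensors_text):
    """Split temperature readings into (plausible, suspect) lists.

    Each list holds (label, value_in_celsius) tuples in input order.
    Voltage, fan, and continuation lines are ignored.
    """
    plausible, suspect = [], []
    for line in sensors_text.splitlines():
        match = TEMP_LINE.match(line)
        if not match:
            continue
        label = match.group(1).strip()
        value = float(match.group(2))
        if MIN_PLAUSIBLE_C <= value <= MAX_PLAUSIBLE_C:
            plausible.append((label, value))
        else:
            suspect.append((label, value))
    return plausible, suspect
```

Applied to the output above with these assumed bounds, every k10temp and nvme reading lands in the plausible list, while every temperature reported by nct6793-isa-0290 (SYSTIN, CPUTIN, AUXTIN0 to AUXTIN3, the PCH_* values, and TSI2_TEMP to TSI7_TEMP) lands in the suspect list. That pattern points at how that one chip is being read rather than at the CPUs themselves, though it does not prove it.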

 
