I have a Threadripper 2990WX installed on an MSI X399 SLI PLUS, cooled by a Noctua NH-U14S TR4, with 32GB of G.Skill 3600MT/s CL19. The motherboard's VRMs are actively cooled by a 3300rpm 60mm fan (that happens not to be too noisy). I use the latest (A60) BIOS for the motherboard.
In the BIOS, I have set "Precision Boost Overdrive" to "Manual", with a maximum socket power of 350W, a TDC of 250A and an EDC of 350A. I have also applied a negative voltage offset of 0.08V, because PBO was applying too much voltage: my CPU is rock-stable at a manual 3.5 GHz and 1.06V, whereas PBO was applying 1.16-1.18V. The problem I describe below also happens when no voltage offset is in use.
When I run benchmarks, I observe two seemingly contradictory behaviors:
- MPrime: the total package consumption jumps to 350W and all the cores boost to 3.45 GHz (1.11V). After 30 seconds, Tdie reaches 68C and the processor throttles to 550 MHz for half a second, then continues with the benchmark. Throttling happens every 4 to 5 seconds. The VRM temps are 81C.
- My own Python and NumPy code, which mixes AVX and regular integer instructions, on all 64 threads: the total package consumption jumps to the same 350W and all the cores boost to 3.6 GHz (1.18V), but the temperature stabilizes at 62C. The VRM temps are 75C.
Where is that energy going? Why is MPrime, consuming 350W of total package power, somehow producing more heat than my code, which also consumes 350W? Because I'm running Linux, I don't have access to Ryzen Master. Instead, I use rapl-read-ryzen, which reads Zen's power consumption MSRs. The readings seem to be what PBO uses (even if they may be incorrect), as they are properly capped at 350W, as set in the BIOS.
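For context, here is a minimal Python sketch of how a package-power figure can be derived from those MSRs. It is not taken from rapl-read-ryzen itself. The MSR addresses and the bit layout of the unit register are assumptions based on my understanding of AMD's public documentation for Zen, and the function names are hypothetical. Reading `/dev/cpu/N/msr` requires root and the `msr` kernel module.

```python
import os
import time

# Assumed Zen MSR addresses (not verified against rapl-read-ryzen's source).
MSR_RAPL_PWR_UNIT = 0xC0010299    # energy unit is assumed to sit in bits 12:8
MSR_PKG_ENERGY_STAT = 0xC001029B  # 32-bit wrapping package energy counter


def read_msr(cpu, address):
    """Read one 64-bit MSR through the Linux msr driver (needs root)."""
    fd = os.open(f"/dev/cpu/{cpu}/msr", os.O_RDONLY)
    try:
        return int.from_bytes(os.pread(fd, 8, address), "little")
    finally:
        os.close(fd)


def energy_unit_joules(pwr_unit_raw):
    """Decode the energy unit: 1 / 2**ESU joules per counter tick."""
    esu = (pwr_unit_raw >> 8) & 0x1F
    return 1.0 / (1 << esu)


def average_power_watts(raw_start, raw_end, pwr_unit_raw, seconds):
    """Average power between two counter samples, tolerating one wrap."""
    ticks = (raw_end - raw_start) & 0xFFFFFFFF
    return ticks * energy_unit_joules(pwr_unit_raw) / seconds


def sample_package_power(cpu=0, interval=1.0):
    """Sample the package energy counter twice and return average watts."""
    unit = read_msr(cpu, MSR_RAPL_PWR_UNIT)
    start = read_msr(cpu, MSR_PKG_ENERGY_STAT) & 0xFFFFFFFF
    time.sleep(interval)
    end = read_msr(cpu, MSR_PKG_ENERGY_STAT) & 0xFFFFFFFF
    return average_power_watts(start, end, unit, interval)
```

Calling `sample_package_power()` as root would give a value averaged over one second. Such a figure is an energy estimate reported by the CPU over the sampling window, not a direct measurement of the heat at the die at any instant.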
Another question is why PBO allows the CPU to reach throttling temperatures. Isn't it supposed to slowly decrease the frequency as the temperature approaches 68C?
Thank you for your advice.