I want to finally confirm if this is a known issue or not.
The problem I have been looking at recently is that the AMD EPYC 7551P uses approximately 10% more power in a configuration that can use the processor's Max Core Boost Frequency than in a configuration that cannot, regardless of whether any core actually runs at the Max Core Boost Frequency.
For notation, let's write a configuration as (a,b,c,d), where a is the number of active cores (= not in the deep C2 C-state) in the first NUMA node, b the number in the second NUMA node, and so on.
If I'm understanding the documentation correctly, the Max Core Boost Frequency can be used by the cores of a NUMA node if three or fewer cores in that NUMA node are outside of the C2 C-state.
For example, in the configuration (3,3,4,0), the Max Core Boost Frequency of 3 GHz for that processor could be utilized in the first, second and fourth NUMA nodes, but not in the third, where more than three cores are active.
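To make the rule concrete, here is a minimal Python sketch of how I read it. The limit of three active cores is my understanding of the documentation, not a confirmed fact, and the name `boost_eligible_nodes` is just something I made up for illustration.

```python
# Assumption (my reading of the documentation): a NUMA node can use the
# Max Core Boost Frequency if at most 3 of its cores are outside C2.
BOOST_CORE_LIMIT = 3

def boost_eligible_nodes(config):
    """Return the 1-based indices of NUMA nodes that could use max boost.

    config is a tuple (a, b, c, d) of active-core counts per NUMA node.
    """
    return [
        node
        for node, active in enumerate(config, start=1)
        if active <= BOOST_CORE_LIMIT
    ]

print(boost_eligible_nodes((3, 3, 4, 0)))  # [1, 2, 4]
```

Note that under this reading a node with zero active cores counts as eligible, which is why the fourth node shows up for (3,3,4,0).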
Executing the NPB Embarrassingly Parallel benchmark in an (8,8,8,0) configuration, where the Max Core Boost Frequency could theoretically be used in the fourth NUMA node (but isn't used in practice), consumes about 9% more energy than executing the benchmark at the same core count in a (6,6,6,6) configuration, where the Max Core Boost Frequency cannot be used, with no deterioration in performance in the (6,6,6,6) configuration.
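For clarity, this is roughly how I decide that two configurations form a fair pair. It is only a sketch under the same assumed limit of three active cores per NUMA node; `can_boost` and `is_fair_pair` are hypothetical helper names.

```python
def can_boost(config, limit=3):
    """True if at least one NUMA node has `limit` or fewer active cores.

    The limit of 3 is my reading of the documentation (an assumption).
    """
    return any(active <= limit for active in config)

def is_fair_pair(boost_config, no_boost_config):
    """Same total core count, and only the first config is boost-capable."""
    return (
        sum(boost_config) == sum(no_boost_config)
        and can_boost(boost_config)
        and not can_boost(no_boost_config)
    )

print(is_fair_pair((8, 8, 8, 0), (6, 6, 6, 6)))  # True
```

Both configurations above use 24 active cores, so the only difference the check sees is whether some node falls under the boost limit.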
table.png compares configurations that can (at least theoretically) utilize the Max Core Boost Frequency with configurations that cannot.
The columns are labeled:
Configuration | Alternative Configuration | Difference in Energy Consumption | Difference in the Runtime of the Benchmark | Overall reduction of Energy Consumption