With Ryzen, each 4-core CCX (core complex) has its own separate 8 MB L3 cache.
Windows loves moving threads from core to core in an effort, I assume, to keep each core equally loaded.
As the L3 caches of the core complexes are connected via the Infinity Fabric bus, moving a thread to a core in the other CCX also means its cached data has to follow it across this bus.
The Infinity Fabric bus runs at the memory controller's speed and is shared by the memory controller, PCIe, etc.
Hence moving a thread between CCXs puts unnecessary load on the Infinity Fabric bus and adds latency, leading to less than optimal performance.
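To make the idea of keeping a program on one CCX concrete, here is a rough Python sketch that builds an affinity bitmask covering a single CCX. It rests on assumptions that are mine and not confirmed above: an 8-core part with two 4-core CCXs, SMT enabled, and logical CPUs numbered so that CPUs 0-7 belong to the first CCX and 8-15 to the second. The function name `ccx_affinity_mask` is made up for illustration. Check the actual topology on your own system before relying on it.

```python
def ccx_affinity_mask(ccx_index, cores_per_ccx=4, threads_per_core=2):
    """Return a bitmask with one bit set per logical CPU of the given CCX.

    Assumption: logical CPUs are numbered consecutively, CCX by CCX,
    so CCX 0 owns CPUs 0..7 and CCX 1 owns CPUs 8..15 with SMT on.
    """
    logical_per_ccx = cores_per_ccx * threads_per_core
    first_cpu = ccx_index * logical_per_ccx
    mask = 0
    for cpu in range(first_cpu, first_cpu + logical_per_ccx):
        mask |= 1 << cpu  # set the bit for this logical CPU
    return mask


if __name__ == "__main__":
    print(hex(ccx_affinity_mask(0)))  # 0xff
    print(hex(ccx_affinity_mask(1)))  # 0xff00
```

A mask like this is the kind of value the Windows SetProcessAffinityMask call or the affinity setting in Task Manager works with, so a process restricted to it would no longer be bounced across the Infinity Fabric to the other CCX.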
SMT is an issue with older software optimized for Intel Hyper-Threading:
It seems to misinterpret things and treat virtual (logical) cores as physical cores, each complete with its own full L1, L2, etc. cache.
As not all of this software is going to be updated, a good workaround is to avoid SMT until there are more related threads than cores per CCX.
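The workaround above can be sketched in a few lines of Python. This is only an illustration under my own assumptions, not how any of the tools mentioned here actually work: I assume logical CPUs 2k and 2k+1 are the two SMT threads of physical core k, and that cores 0-3 sit in the first CCX and 4-7 in the second. The name `pick_logical_cpus` is hypothetical.

```python
def pick_logical_cpus(n_threads, ccx_index=0, cores_per_ccx=4, threads_per_core=2):
    """Pick logical CPUs for n_threads related threads inside one CCX.

    Each physical core gets one thread first; SMT siblings are only
    used once there are more threads than cores in the CCX.
    Assumption: logical CPU = core * threads_per_core + sibling.
    """
    first_core = ccx_index * cores_per_ccx
    cores = range(first_core, first_core + cores_per_ccx)
    # Preference order: sibling 0 of every core, then sibling 1 of every core.
    order = [core * threads_per_core + sibling
             for sibling in range(threads_per_core)
             for core in cores]
    if n_threads < 0 or n_threads > len(order):
        raise ValueError("thread count does not fit in one CCX")
    return sorted(order[:n_threads])


if __name__ == "__main__":
    print(pick_logical_cpus(4))  # [0, 2, 4, 6]
    print(pick_logical_cpus(6))  # [0, 1, 2, 3, 4, 6]
```

With four or fewer threads, every thread gets a physical core to itself; only the fifth thread onward lands on an SMT sibling, which is the point where sharing L1 and L2 becomes unavoidable anyway.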
The developer of Project Mercury seems to have written a really nice app that does a good job of solving the above issues and is really light on resources, despite the strange name choice and lack of advertising skills:
Other apps with similar capability:
AMD Ryzen Processor Optimization added to Cacheman 10.10:
Bitsum's Process Lasso:
While I can understand the above logic, I have no background in software development beyond some programs written in BASIC in the '80s. I believe this or a similar app, perhaps with prewritten profiles similar to what Nvidia does with its drivers, should be part of the Ryzen software package.
I post this here in the hope that someone of some import, and without a 'not invented here' complex, sees it.