
Stop thread hopping between CCXs and unnecessary SMT for Ryzen gaming and app performance:

Question asked by logic on Apr 25, 2017
Latest reply on Jan 18, 2019 by svenbent

With Ryzen there is a separate 8 MB L3 cache per 4-core CCX (Core Complex).
Each L3 cache is faster than RAM, but the two L3 caches are linked to each other at roughly the speed of RAM. (It looks like half the rated RAM speed, since DDR transfers data twice per clock.)

So whenever a thread hops from one CCX to the other, it loses its cached data and has to wait for that data to be moved across, at the speed of RAM...


Hence the faster the RAM is running, the faster the benchmark, because Windoze 10 loves to move threads from one CCX to the other willy-nilly..!
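To put rough numbers on that, here is a tiny sketch of the arithmetic. It assumes the link between the CCXs is clocked at the memory clock, which fits the "half the speed of RAM" observation above; the function name `fabric_clock_mhz` is made up for illustration and is not from any AMD tool.

```python
def fabric_clock_mhz(ddr_rate_mts: int) -> float:
    """Memory clock in MHz for a DDR data rate in MT/s.

    DDR moves data twice per clock, so the clock is half the rated speed.
    Assumption: the CCX-to-CCX link runs at this same clock.
    """
    return ddr_rate_mts / 2


# Compare a few common DDR4 kits.
for rate in (2133, 2666, 3200):
    print(f"DDR4-{rate}: link clock about {fabric_clock_mhz(rate):.1f} MHz")
```

Under that assumption, going from DDR4-2133 to DDR4-3200 raises the clock that cross-CCX traffic depends on by about 50%, which would explain why memory speed shows up so strongly in benchmarks.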

(This seems to be less of a problem in Win 7, apparently because its scheduler is more optimised for the old Intel Core 2 Quads, and thus NUMA aware, so it doesn't move threads around like a hyped-up game of 'pass the parcel'!)

A better option is to keep threads from being moved from one CCX to the other as much as possible.
This should be built into the Windoze 10 scheduler, but isn't yet..!?


The next thing to avoid is SMT as much as possible:

As I understand it, Windoze and apps/games don't properly see 'one core and cache, capable of two threads', but two complete cores/caches.

Hence it's a good idea to avoid SMT until an app/game running on a CCX needs more than 4 threads.
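As a rough sketch of that idea, the snippet below picks which logical CPUs on one CCX to hand to an app for a given thread count. It assumes SMT siblings are numbered as adjacent pairs (logical CPUs 0 and 1 share physical core 0, 2 and 3 share core 1, and so on) and 4 cores per CCX; check your own machine's layout before relying on it. `cpus_for_threads` is a hypothetical helper, not part of any of the tools mentioned in this thread.

```python
def cpus_for_threads(n_threads: int, first_cpu: int = 0, cores: int = 4) -> list[int]:
    """Pick logical CPUs on one CCX for n_threads threads.

    One logical CPU per physical core is used first; SMT siblings are
    only added once every physical core already has a thread.
    Assumption: siblings are adjacent (first_cpu, first_cpu + 1 = core 0).
    """
    primaries = [first_cpu + 2 * core for core in range(cores)]
    siblings = [cpu + 1 for cpu in primaries]
    order = primaries + siblings
    if not 0 < n_threads <= len(order):
        raise ValueError("thread count does not fit on one CCX")
    return sorted(order[:n_threads])


print(cpus_for_threads(4))               # four threads: no SMT needed
print(cpus_for_threads(5))               # fifth thread spills onto a sibling
print(cpus_for_threads(2, first_cpu=8))  # two threads on the second CCX
```

On Linux the resulting list could be passed to `os.sched_setaffinity`; on Windows you would need a tool like the ones listed further down.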


The rules seem to be as follows:


  1. Keep threads from hopping from one CCX to the other, and try to keep Windows/OS on one CCX and the app or game on the other(s).
  2. Keep to one thread per physical core until you run out of cores on a CCX, i.e. avoid SMT until you need to run more than 4 threads per CCX/app.
  3. Disable core parking. (Part of AMD's balanced power plan?)
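For anyone who wants to try rules 1 and 2 by hand, the sketch below works out the affinity bitmasks involved. It assumes an 8-core, 2-CCX Ryzen where logical CPUs 0-7 belong to the first CCX and 8-15 to the second, with SMT siblings as adjacent pairs; that layout is an assumption, so verify it on your own system. `ccx_mask` is an illustrative name only.

```python
def ccx_mask(ccx_index: int, cores_per_ccx: int = 4, use_smt: bool = False) -> int:
    """Affinity bitmask covering one CCX.

    With use_smt=False only the first logical CPU of each physical core
    is set (rule 2); with use_smt=True both siblings are set.
    Assumption: 2 logical CPUs per core, numbered CCX by CCX.
    """
    base = ccx_index * cores_per_ccx * 2
    mask = 0
    for core in range(cores_per_ccx):
        mask |= 1 << (base + 2 * core)
        if use_smt:
            mask |= 1 << (base + 2 * core + 1)
    return mask


os_mask = ccx_mask(0, use_smt=True)  # background/OS work: all of CCX 0
game_mask = ccx_mask(1)              # game: one thread per core on CCX 1
print(f"OS/background mask: {os_mask:X}")
print(f"Game mask:          {game_mask:X}")
```

A hexadecimal mask like these is what the Windows `start /affinity` command expects, and the apps below automate the same kind of assignment. Note that an affinity mask only restricts where a process may run; it does not cover rule 3 (core parking), which is a power plan setting.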


Here are 3 apps that will do that for you, of which Project Mercury seems the most automated and lightweight.
I have seen rumours of 50 fps increases in certain older games from using Project Mercury, but that needs testing and verifying.


Project Mercury: Thread affinities to CCXs, SMT optimizations etc. Very lightweight/efficient.


AMD Ryzen Processor Optimization added to Cacheman 10.10:


Bitsum's Process Lasso: Optimize and automate process CPU affinities:


If I were AMD I would be having a good look at these apps, benchmarking like hell with them, and perhaps speaking to the devs to get their heads together and get an 'official' app onto every Ryzen computer out there, as well as out to all the review sites!

I want AMD to succeed and stick it to Intel! Especially with the 12 and 16 core X399 machines.