nec_v20

How to easily get a lot more performance from your CPU than overclocking would give you, without overclocking

Discussion created by nec_v20 on Feb 9, 2019
Latest reply on Apr 13, 2019 by ajlueke

Before I start, you have to ask yourself one question, "Where does a CPU get its performance from?".

 

If your answer is, "From the clockspeed of course", or "From the number of cores of course", then I would contend that you would be wrong.

 

You see, it really doesn't matter how many things are doing nothing, or how fast those things are doing naff-all. This is the problem you will mostly be confronted with when gaming and/or streaming, and it is the reason why games will not run optimally, no matter how much you overclock or how many cores are added to the system.

 

Let's begin at the beginning. Windows of the NT family (which is all that has survived) is a pre-emptive multitasking Operating System. What this means is that all the processes running on the system (and a game would be a process - or perhaps have a number of process threads) get a certain time-slice of attention before they are frozen and another process will get attention.

 

Previous versions of Windows worked on the basis of cooperative multitasking, whereby a process would hold the CPU's attention until it relinquished it. The problem with this is that when a process hangs, it never hands the CPU back, and the system is left in a hung state with no way to terminate the process.

 

To get back to the question of where the CPU gets its speed from, the answer is from the data, and the more fluidly and consistently the CPU can get the data, the more speedily it can execute that data.

 

I hear you say, "OK Captain obvious, so what's your effing point?".

 

As CPUs got faster and faster, and storage and the data bus lagged ever further behind, a way had to be found to circumvent this imbalance. Increasing the speed of storage was not, and even now in the days of NVMe M.2 is not, a really viable option.

 

Thus the concept of Cache was born.

 

Cache is basically an area of RAM built into the CPU which runs at close to the same clock speed as the CPU. Whilst the CPU is working on data in the Cache, the Cache itself is loading the data the CPU is most likely going to need next. When needing data, the CPU will always look in the Cache first, and if it doesn't find what it needs there, it will request it from comparatively very slow memory or, not finding it there, have to fall back on the positively slothlike storage.

 

So the CPU is happily chomping and digesting the data it gets from the Cache - problem solved?

 

Well, remember when I said that Windows was a pre-emptive Operating System? What this means is that other processes get a time-slice of CPU attention, and of course those processes will load data into the Cache. The Cache is set-associative, meaning it can hold disparate data from different sources at the same time. This does not, however, mean that a new process will not overwrite data vital to another process - like a game which is running. When that process gets its time-slice again, it can find that its Cache lines have been evicted and will have to go and retrieve the data again from memory or storage - slowing down that process very, very significantly.
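To make the eviction problem a bit more concrete, here is a toy simulation. It is only a sketch under loud assumptions: the Cache is modelled as a small fully associative LRU cache (real CPU caches are set-associative with hardware replacement policies), and the names `LRUCache` and `game_hit_rate`, as well as all the sizes, are made up for illustration. It compares the hit rate of a "game" that has the Cache to itself with one that shares it with another process between time-slices.

```python
from collections import OrderedDict


class LRUCache:
    """A tiny fully associative cache with least-recently-used eviction."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()

    def access(self, address):
        """Return True on a hit, False on a miss (the line is then loaded)."""
        if address in self.lines:
            self.lines.move_to_end(address)  # mark as most recently used
            return True
        if len(self.lines) >= self.capacity:
            self.lines.popitem(last=False)  # evict the least recently used line
        self.lines[address] = None
        return False


def game_hit_rate(slices, game_set, other_set, capacity, share_core):
    """Hit rate seen by the game over a number of time-slices.

    If share_core is True, another process touches its own data
    between every two time-slices of the game.
    """
    cache = LRUCache(capacity)
    hits = total = 0
    for _ in range(slices):
        for address in game_set:
            hits += cache.access(address)
            total += 1
        if share_core:
            for address in other_set:
                cache.access(address)
    return hits / total


# Hypothetical working sets: each fits in the cache alone, but not both together.
game = [("game", i) for i in range(48)]
other = [("other", i) for i in range(48)]

alone = game_hit_rate(20, game, other, capacity=64, share_core=False)
shared = game_hit_rate(20, game, other, capacity=64, share_core=True)
```

In this deliberately extreme setup the game hits the Cache 95% of the time when it is alone, and never when the other process keeps pushing its data out. Real hardware will not be this black and white, but the direction of the effect is the point.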

 

Of course Micro$haft would not be Micro$haft if it didn't actually go out of its way to kick you in the nuts whilst you were gaming.

 

The Operating System has a mechanism known as "Task Switching" whereby it will take a process waiting for its time-slice on one core and assign it to a completely different core. What this means is that the process will then have absolutely NO data in the new core's own Cache and will have to go back to either memory or storage to fill it up again, which is incredibly wasteful of CPU time.

 

So as you can see, the performance of a game is not so much dependent on the speed of the CPU or the GPU, but rather on how often the CPU can score a Cache hit as opposed to a Cache miss. And no amount of overclocking can get around this fundamental obstacle.

 

So that's it? You are screwed? You are at the mercy of random chance and a lucky streak of Cache hits for performance?

 

Luckily no, and there is something built into the NT family of Operating Systems which guarantees that you will have that lucky streak of Cache hits nearly all the time - thus immeasurably increasing the performance of your CPU.

 

This is called "Affinity", and you can set it if you open the Task Manager, right-click on a process and then click on "Set Affinity".

What this does is allow you to dictate which processes can use which Cores of your CPU.
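Under the hood, Windows stores a process's affinity as a bitmask in which bit N stands for CPU N. The little sketch below only shows that encoding; the helper names `affinity_mask` and `cpus_from_mask` are my own for illustration, and it does not actually change the affinity of anything.

```python
def affinity_mask(cpus):
    """Build an affinity bitmask from a list of CPU indices (bit N = CPU N)."""
    mask = 0
    for cpu in cpus:
        if cpu < 0:
            raise ValueError("CPU indices cannot be negative")
        mask |= 1 << cpu
    return mask


def cpus_from_mask(mask):
    """Decode an affinity bitmask back into a sorted list of CPU indices."""
    return [i for i in range(mask.bit_length()) if (mask >> i) & 1]


system_mask = affinity_mask([0, 1])       # 0b00000011
game_mask = affinity_mask([2, 3, 4, 5])   # 0b00111100

# Two masks that share no bits describe sets of cores that do not overlap.
no_overlap = (system_mask & game_mask) == 0
```

A mask of 3 therefore means "CPU 0 and CPU 1 only", and a mask of 60 (0x3C) means "CPU 2 to CPU 5 only".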

 

Ah, so problem solved?

 

Yeah, you wish. The thing is that you have to set the Affinity of every single process individually, which is a very time-consuming grind, and then, to top it all off, you lose all the settings you just made as soon as you reboot.

 

There is, however, a utility you can download for free called Process Lasso:

https://bitsum.com/

It is one of the few utilities I have found which allows you to manipulate the Affinity of processes easily and very quickly.

Do NOT mess about with the priority of any processes - this is a recipe for disaster.

 

With Process Lasso, you can click on the top running process, then go down to the bottom process and, holding down the Shift key, click on that, then right-click and set the affinity for all the processes at once.

 

Because these are the system processes they only need two cores - so you would assign them to CPU 0 and CPU 1.

If you then load the game, you can assign the game to as many Cores as you want (except of course CPU 0 and CPU 1).

Nearly all games will not use more than four Cores so assigning more Cores to the game can actually be counterproductive - see task switching above.

 

The thing is, though, that because the game now has exclusive rights to the Cores you assigned, it will also have exclusive rights to the Cache associated with those Cores and cannot be interrupted by other processes.

You want to play a game and stream?

 

Easy, assign the game to its Cores and then give the streaming software some other Cores from the ones you have left and have not yet assigned.
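The split described above (system processes on the first two Cores, then the game, then the streaming software) can be sketched as a small planning function. This is only an illustration: `plan_cores` and its defaults are hypothetical names and numbers of mine, and it just works out which Cores go where without touching any real process.

```python
def plan_cores(total_cores, system=2, game=4, stream=0):
    """Split core indices 0..total_cores-1 into non-overlapping groups.

    The system processes get the lowest-numbered cores, then the game,
    then the streaming software; whatever is left stays unassigned.
    """
    if min(system, game, stream) < 0:
        raise ValueError("core counts cannot be negative")
    if system + game + stream > total_cores:
        raise ValueError("not enough cores for this plan")

    cores = list(range(total_cores))
    plan = {}
    start = 0
    for name, count in (("system", system), ("game", game), ("stream", stream)):
        plan[name] = cores[start:start + count]
        start += count
    plan["unassigned"] = cores[start:]
    return plan


# Eight cores, game only: CPU 6 and CPU 7 are left over.
game_only = plan_cores(8)

# Eight cores, game plus streaming: the streamer takes the two leftover cores.
game_and_stream = plan_cores(8, stream=2)
```

Because every Core index appears in exactly one group, the game, the streaming software and the system processes never end up fighting over the same Core.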

I helped build and then configure a Ryzen 2700X system for someone who is a member of the Discord Server that I am a mod on, and he wanted to play Fortnite.

 

Leaving the system as it was, out of the box, with only Ryzen Master installed, we ran Fortnite and the result was 90-135 FPS.

 

I then got rid of SMT (meaning the 2700X ran on eight Cores/eight Threads) and ran Fortnite again, and the result was a bit better: 105-150 FPS.

Then, using Process Lasso, I assigned all running processes to CPU0 and CPU1 and assigned Fortnite to CPU2, CPU3, CPU4 and CPU5 - this left two Cores, CPU6 and CPU7, untouched.

The result was a steady >200-250 FPS, which is on par with what you would get with an i9-9900K overclocked to 5GHz.

He was using the stock AMD cooler for the Ryzen 2700X and not once did it spin up to the point of becoming audible.
