I have an X399M board with a Threadripper 1900X, paired with a Radeon VII.
I noticed that dxgkrnl always runs on Thread 0, no matter what. However, after checking some documentation I realized that the large PCIe slots on this mainboard are connected to NUMA node 1.
Is it somehow possible to run dxgkrnl.sys on Thread 8 instead (Core 0 on NUMA node 1)?
Try posting your technical question at the AMD Developer Forum, starting from here: https://community.amd.com/t5/newcomers-start-here/bd-p/newcomer-forum
I asked there; no update since.
Let's compare the situation to a NIC and NVMe drives.
The NVMe drives are connected to NUMA node 0, so they work well by default; no changes needed.
The NIC allowed me to configure its RSS queues like this:
Set-NetAdapterRss -Name "Ethernet" -BaseProcessorNumber 8
Set-NetAdapterRss -Name "Ethernet" -MaxProcessors 4
Set-NetAdapterRss -Name "Ethernet" -NumberOfReceiveQueues 4
Set-NetAdapterRss -Name "Ethernet" -MaxProcessorNumber 14
Set-NetAdapterRss -Name "Ethernet" -Profile "ClosestStatic"
Set-NetAdapterRss -Name "Ethernet" -NumaNode 1
This simply means the driver uses cores 8, 10, 12 and 14 to process data, with memory and other resources associated with NUMA node 1, so everything in this regard is kept on the same piece of silicon.
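A quick way to confirm the settings took effect is to read them back; Get-NetAdapterRss prints the processor set, NUMA node, and indirection table the driver is actually using:

Get-NetAdapterRss -Name "Ethernet"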
The GPU, on the other hand, is also connected to NUMA node 1; however, dxgkrnl permanently runs on Thread 0, Core 0, NUMA node 0.
So the answer is "yes, it can":
https://docs.microsoft.com/en-us/windows-hardware/drivers/kernel/interrupt-affinity-and-priority
Just be careful to assign an existing core to the driver.
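For reference, the linked page describes a per-device "Affinity Policy" registry key. Below is a minimal sketch of that setup in PowerShell; the device instance path is a made-up example (look yours up in Device Manager under Details > Device instance path), and a reboot is required before the new policy takes effect:

# Hypothetical device instance path; replace with your GPU's actual one
$instance = "PCI\VEN_1002&DEV_66AF&SUBSYS_081E1002&REV_C1\4&2283F625&0&0019"
$key = "HKLM:\SYSTEM\CurrentControlSet\Enum\$instance\Device Parameters\Interrupt Management\Affinity Policy"
New-Item -Path $key -Force | Out-Null
# DevicePolicy 4 = IrqPolicySpecifiedProcessors (honor the override mask below)
Set-ItemProperty -Path $key -Name "DevicePolicy" -Value 4 -Type DWord
# AssignmentSetOverride is a KAFFINITY bitmask; bit 8 = logical processor 8
# (0x0100, stored little-endian as bytes 0x00, 0x01)
Set-ItemProperty -Path $key -Name "AssignmentSetOverride" -Value ([byte[]](0x00, 0x01)) -Type Binary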
Are there benefits to this configuration?
On a system with multiple NUMA nodes, the time required to process DPC calls decreases. In certain scenarios this may help reduce input lag.
Also, when the GPU is used for multimedia decoding and playback, the load is spread more evenly across CPU cores.
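To check the effect on your own machine (my suggestion, not something covered in this thread), you can capture a kernel trace with xperf from the Windows Performance Toolkit before and after the change and compare DPC times:

# Start a kernel trace that records DPC and ISR events (elevated prompt)
xperf -on PROC_THREAD+LOADER+DPC+INTERRUPT
# ...run the usual GPU workload for a minute or two...
# Stop the trace and merge it into an ETL file
xperf -d trace.etl
# Summarize DPC/ISR time per module; look for dxgkrnl.sys in the output
xperf -i trace.etl -o dpc_summary.txt -a dpcisr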