I have many servers with AMD Opteron(tm) 6376 processors. I installed QEMU/KVM virtual machines on these servers, but Windows 7 performs very poorly in the virtual machines. How can I improve it? Thank you!
To inspect the processor's current frequency, install cpupower with your distro's package manager and run cpupower frequency-info. If the frequency is low, you can raise it.
cpupower is a suite of command-line tools that lets you select the CPU's power governor, set its minimum and maximum frequencies, monitor its frequency, and so on.
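If cpupower is not installed yet, the same information can usually be read straight from the kernel's cpufreq sysfs files. The following is only a minimal sketch: the function name show_cpu_freqs and its optional root-directory argument are made up for illustration, and it assumes a cpufreq driver is loaded so that the scaling_cur_freq files exist.

```shell
#!/bin/sh
# Print the current frequency of every CPU in MHz.
# show_cpu_freqs is a hypothetical helper; the optional argument only
# exists so the function can be pointed at a different directory tree.
show_cpu_freqs() {
    local root="${1:-/sys/devices/system/cpu}"
    local f khz cpu
    for f in "$root"/cpu[0-9]*/cpufreq/scaling_cur_freq; do
        [ -r "$f" ] || continue              # glob did not match: no cpufreq driver
        khz=$(cat "$f")                      # the kernel reports kHz
        cpu=${f%/cpufreq/scaling_cur_freq}   # strip the trailing path components
        cpu=${cpu##*/}                       # keep only "cpuN"
        echo "$cpu: $((khz / 1000)) MHz"
    done
}

show_cpu_freqs
```

Run it a few times while the VM is under load; if most cores sit at 1400 MHz, the governor is probably holding the clock down.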
You can also change the governor to increase or decrease the frequency.
A break-down of the governors:
conservative = scales the CPU frequency to load, stepping up and down gradually
ondemand = scales the CPU frequency to load, jumping quickly to a high frequency when load rises
userspace = runs the CPU at a specific, user-given frequency
performance = runs the CPU at the max frequency
powersave = runs the CPU at the min frequency
schedutil = scales the CPU frequency using the kernel scheduler's CPU utilization data
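Not every governor in this list is available on every system; which ones you get depends on the kernel and the cpufreq driver. A rough sketch for checking, assuming the standard cpufreq sysfs layout (show_governors and its directory argument are hypothetical names used only for this example):

```shell
#!/bin/sh
# Show the active governor and the governors the kernel offers for one CPU.
# show_governors is a hypothetical helper; it defaults to cpu0.
show_governors() {
    local dir="${1:-/sys/devices/system/cpu/cpu0/cpufreq}"
    if [ ! -r "$dir/scaling_governor" ]; then
        echo "cpufreq not available in $dir" >&2
        return 1
    fi
    echo "current: $(cat "$dir/scaling_governor")"
    echo "available: $(cat "$dir/scaling_available_governors")"
}

show_governors || true
```

If schedutil does not appear in the "available" line, the options that rely on it will not work on that kernel.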
My current cpupower configuration sets the minimum frequency to the second-lowest supported step and uses the schedutil governor:
cpupower frequency-set -g schedutil -d 1700MHz
Unfortunately, your CPU has only two "available frequency steps":
available frequency steps: 2.30 GHz, 1.40 GHz
(1) You could force the cpu to run at a constant clock of 2.30 GHz. If doing so, monitor the temperature.
sudo cpupower frequency-set -g userspace -f 2300MHz
(2) You could set the minimum frequency to the base clock and allow the schedutil governor to up-clock when needed. If doing so, monitor the temperature.
sudo cpupower frequency-set -g schedutil -d 2300MHz
(3) You could allow the performance governor to run the CPU at its maximum speed. (However, the performance governor is known to fail and down-clock some CPUs well below their maximum.)
sudo cpupower frequency-set -g performance
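Whichever option you pick, it is worth checking that the cores really stay at the top step. The sketch below compares each core's current frequency with the maximum the driver reports and lists the cores running below it. It assumes the standard cpufreq sysfs files are present; report_downclocked is a made-up name for this example.

```shell
#!/bin/sh
# List CPUs whose current frequency is below the driver-reported maximum.
# report_downclocked is a hypothetical helper; the optional argument lets
# it read from a directory other than the real sysfs tree.
report_downclocked() {
    local root="${1:-/sys/devices/system/cpu}"
    local d cur max cpu
    for d in "$root"/cpu[0-9]*/cpufreq; do
        [ -r "$d/scaling_cur_freq" ] && [ -r "$d/cpuinfo_max_freq" ] || continue
        cur=$(cat "$d/scaling_cur_freq")     # kHz
        max=$(cat "$d/cpuinfo_max_freq")     # kHz
        if [ "$cur" -lt "$max" ]; then
            cpu=${d%/cpufreq}                # drop the trailing "/cpufreq"
            cpu=${cpu##*/}                   # keep only "cpuN"
            echo "$cpu: $((cur / 1000)) MHz (max $((max / 1000)) MHz)"
        fi
    done
}

report_downclocked
```

No output means every core was at its reported maximum at the moment of the check. A single reading is only a snapshot, so repeat it under load.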
One thing you might try is benchmarking with Node Interleaving in the BIOS set to Auto versus disabled. When node interleaving is on (Auto), memory is striped across all processors, which averages out the latency. When it's off, each CPU sees its locally attached memory with low latency and memory attached to the other sockets with higher latency.
On my Opteron 63xx setup, running with node interleaving disabled and keeping thread affinities pinned to the cores of a single processor die gives much better latency (higher FPS in games, for example). The downside appears when the application's memory footprint extends beyond the RAM attached to that processor die/MCM: then the latency is awful in relative terms.
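To pin threads to a single die you first need to know which cores belong to which NUMA node. A minimal sketch that reads this from sysfs, assuming the kernel exposes /sys/devices/system/node (it normally does on multi-node systems when node interleaving is disabled); list_numa_nodes is a made-up helper name:

```shell
#!/bin/sh
# Print the CPU list of every NUMA node the kernel exposes.
# list_numa_nodes is a hypothetical helper; the optional argument lets
# it read from a directory other than the real sysfs tree.
list_numa_nodes() {
    local root="${1:-/sys/devices/system/node}"
    local d node
    for d in "$root"/node[0-9]*; do
        [ -r "$d/cpulist" ] || continue      # not a NUMA node directory
        node=${d##*/}                        # keep only "nodeN"
        echo "$node: CPUs $(cat "$d/cpulist")"
    done
}

list_numa_nodes
```

With the core list in hand, and assuming numactl is installed, a process can be confined to one node's cores and memory with something like numactl --cpunodebind=0 --membind=0 followed by the command to run. Keep the guest's RAM smaller than what is attached to that node, or you run into the remote-latency penalty described above.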