15% Performance Increase by unsetting DISPLAY
I've noticed a very odd performance difference in OpenCL GPU kernels with the 2.2 release of the StreamSDK, depending on the value of the DISPLAY environment variable.
Kernel performance improves when the DISPLAY environment variable is unset compared to when it is set to the local X session. I can reproduce this across multiple programs, including the StreamSDK examples. Below is a demonstration using the MatrixMulImage sample, run for 1000 iterations with the GFlops rating averaged across all of them. The measurement is done twice: first with DISPLAY unset, then with it set to the local X session.
With DISPLAY unset, we average 105 GFlops; with it set, we only get 89 GFlops. The effect appears to be restricted to GPU kernels: no statistically significant difference was observed when running on the CPU.
Can anybody explain this?
% env -u DISPLAY ./MatrixMulImage -i 1000 --device gpu -t | awk '/GFlops/ { SUM += $4; COUNT++;} END {print SUM/COUNT;}'
105.421
% env DISPLAY=:0 ./MatrixMulImage -i 1000 --device gpu -t | awk '/GFlops/ { SUM += $4; COUNT++;} END {print SUM/COUNT;}'
89.4114
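For what it's worth, the awk averaging above can be reproduced in a few lines of Python, which also makes it easy to compute the spread of the per-iteration readings and check that the 105-vs-89 gap is larger than the run-to-run noise. The sample lines below are illustrative placeholders, not real MatrixMulImage output; the real tool's format may differ.

```python
import statistics

def average_gflops(lines):
    """Average the 4th whitespace-separated field of every line
    mentioning 'GFlops', mirroring the awk one-liner above."""
    values = [float(line.split()[3]) for line in lines if "GFlops" in line]
    return statistics.mean(values), statistics.pstdev(values)

# Illustrative lines with the GFlops value as the 4th field
# (assumed layout, not verified against MatrixMulImage output).
sample = [
    "MatrixMulImage GFlops rating: 105.2",
    "MatrixMulImage GFlops rating: 106.0",
    "MatrixMulImage GFlops rating: 104.9",
]
mean, sd = average_gflops(sample)
print(f"mean={mean:.2f} sd={sd:.2f}")
```

If the standard deviation within each run is well under the ~16 GFlops difference between the two configurations, the DISPLAY effect is clearly not noise.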