I don't know the answer to your question on KMP_AFFINITY, though the Intel Fortran manual also lists the -par-affinity compiler option as related functionality that might be a different way to accomplish this task.
One additional thing I'd suggest trying is the Open64 compiler. This compiler supports the O64_OMP_SET_AFFINITY and O64_OMP_AFFINITY_MAP environment variables. The latter lets you provide a specific binding of threads to CPU cores. Both are described at: http://developer.amd.com/cpu/open64/onlinehelp/pages/x86_open64_help.htm#Environment-Variables-1
In my experiments, the Open64 compiler performed better than the Intel compiler on the SPEC OMP2001 benchmark suite. As Mike suggested in an earlier post, you can use the Open64 compiler, which is a high-performance, production-quality code generation tool designed for high-performance parallel computing workloads. It is also open source. I ran this experiment using all the available cores on an AMD Magny-Cours machine.
You can download the compiler from: developer.amd.com
In my experiments, Open64 seemed to have the better overall geomean for SPEC OMP2001. Can you provide details of the suite you are experimenting with? Can you also let me know the number of processors (sockets) on the system? I used 2 processors, i.e. 24 cores in total. So I would suggest you try Open64 on Magny-Cours, and please let me know what your experience is.
As suggested above, please use the O64_OMP_SET_AFFINITY and O64_OMP_AFFINITY_MAP environment variables to bind the threads and memory to the appropriate cores. This should help you get the best performance.
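A minimal sketch of setting these variables from a launcher script, in case it helps. It assumes that O64_OMP_SET_AFFINITY accepts TRUE and that O64_OMP_AFFINITY_MAP takes a comma-separated list of core IDs, one per thread; please check the help page linked earlier in this thread for the exact syntax. The helper names and `./my_omp_app` are hypothetical placeholders, not part of Open64.

```python
import os
import subprocess


def build_affinity_env(core_ids):
    """Return a copy of the environment with the Open64 affinity variables set.

    Assumption: entry i of O64_OMP_AFFINITY_MAP is the core for OpenMP thread i,
    written as a comma-separated list. Verify against the Open64 documentation.
    """
    core_ids = list(core_ids)
    env = dict(os.environ)
    env["OMP_NUM_THREADS"] = str(len(core_ids))        # one thread per listed core
    env["O64_OMP_SET_AFFINITY"] = "TRUE"               # assumed value for "enabled"
    env["O64_OMP_AFFINITY_MAP"] = ",".join(str(c) for c in core_ids)
    return env


def run_bound(command, core_ids):
    """Run `command` (a list of arguments) with the affinity environment applied."""
    return subprocess.run(command, env=build_affinity_env(core_ids), check=True)


if __name__ == "__main__":
    # Example: 24 threads on cores 0..23, as on a 2-socket Magny-Cours system.
    env = build_affinity_env(range(24))
    print(env["O64_OMP_AFFINITY_MAP"])
    # run_bound(["./my_omp_app"], range(24))  # hypothetical Open64-built binary
```

The same thing can be done by exporting the three variables in the shell before starting the program; the script only makes it easier to try different core orderings.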
For the Intel compiler, as far as I know, there is no support for binding threads to cores on non-Intel processors. So with the Intel compiler, I would suggest you run as many threads as there are cores on your system. The OS will then handle thread and memory placement, and in most cases you should get good results. You can then compare this with Open64 performance (with Open64 you have the option of using O64_OMP_SET_AFFINITY and O64_OMP_AFFINITY_MAP to get the best results). I would appreciate it if you could share the outcome of your experiment.
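For the unbound run, a small sketch of matching the thread count to the core count. OMP_NUM_THREADS is the standard OpenMP variable; the helper name and `./my_omp_app` are hypothetical. Note that `os.cpu_count()` reports logical CPUs, which I am assuming equals the physical core count on this kind of machine.

```python
import os
import subprocess


def unbound_thread_env(num_cores=None):
    """Return a copy of the environment with one OpenMP thread per core.

    No affinity variables are set, so thread placement is left to the OS.
    """
    if num_cores is None:
        num_cores = os.cpu_count() or 1   # cpu_count() may return None
    env = dict(os.environ)
    env["OMP_NUM_THREADS"] = str(num_cores)
    return env


if __name__ == "__main__":
    env = unbound_thread_env()
    print("OMP_NUM_THREADS =", env["OMP_NUM_THREADS"])
    # subprocess.run(["./my_omp_app"], env=env, check=True)  # hypothetical binary
```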