
how to lock a thread during a multi-threaded code execution

Question asked by don_peppe_di_prata on Mar 7, 2013
Latest reply on Mar 12, 2013 by yurtesen

I am using the GAUSS programming language by Aptech on a cluster with 32 cores per node, see

 

http://products.amd.com/pages/opteroncpudetail.aspx?id=648&AspxAutoDetectCookieSupport=1

 

and 128 GB of RAM, running 64-bit Linux. I am running some 32-thread programs that I wrote myself in GAUSS's handy and straightforward syntax.

 

During execution, I have noticed that about 85% of the CPU time is spent in user mode and 15% in system mode. I suspect this is due to threads being moved between cores, i.e. thread migration. Because of this, scaling from 16 to 32 threads does not generally improve performance: the CPU times level off at the 16-thread figures even when I execute one of my 32-thread programs. Only in about 5% of the experiments did it mysteriously happen that the system CPU share dropped to 0% and performance rose to almost linear scaling.

 

Therefore, it was suggested to me that I use the syntax

 

KMP_AFFINITY=proclist=[0-15],explicit

 

to set this environment variable so that each thread is pinned to its own core. Unfortunately, KMP_AFFINITY is Intel-specific syntax, while I am using an Opteron.
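For what it is worth, here is the kind of vendor-neutral setting I have been looking at as a possible replacement. This is only a sketch: I do not know which threading library GAUSS uses internally, and `./my_gauss_job` is a placeholder for the actual command line, not a real program name.

```shell
# If the runtime is built on GNU OpenMP (libgomp), the GCC counterpart
# of Intel's KMP_AFFINITY is the GOMP_CPU_AFFINITY variable:
export GOMP_CPU_AFFINITY="0-15"

# Independently of the threading library, any Linux process can be
# restricted to a set of cores with taskset:
taskset -c 0-15 ./my_gauss_job

# On a NUMA machine such as a multi-socket Opteron node, numactl can
# additionally bind the memory allocations to the local node:
numactl --cpunodebind=0 --membind=0 ./my_gauss_job
```

Whether any of these actually takes effect presumably depends on how GAUSS creates its worker threads, which is why I am asking here.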

 

Therefore, my question is: what is the equivalent syntax for Opteron-based systems?

 

Finally, does anyone have an explanation for the mystery I described above, i.e. several runs of the same code on the same cluster node in which performance is much better in a few cases than in the rest? In those few cases, the user CPU time is near 100% while the system time drops to 0%.

Ideally, that is the kind of execution I would like to achieve every time, with a view to higher scalability.
