wrong placement of instances

Question asked by albert.solernou on Jun 13, 2012
Latest reply on Jun 19, 2012 by albert.solernou


I am benchmarking a hybrid MPI-OpenMP code that we developed, on several platforms and with several compilers. When I use opencc (4.5.1-1 AMD patched, both compiled from source and pre-compiled), the threads get placed on the same core, which leads to poor performance. This happens on a two-socket machine with Opteron 6128 CPUs (16 cores in total) and OpenMPI (versions 1.4.4 and 1.6), running an up-to-date Ubuntu server.


However, this issue does not happen with any other compiler I tried. Specifically, I tested GNU's gcc 4.6, Intel's icc 12.0, and even the community-developed Open64 opencc 5.0.


I have attached a sample code that fails to place the instances correctly, as well as two htop snapshots that show the placement of the instances (wrk) when running this code using two threads and two processes.
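
For anyone who wants to check the placement without htop, here is a minimal sketch that reads the same information from /proc. It is not part of the attached sample; it assumes a Linux system, and the helper names `parse_last_cpu` and `thread_placement` are made up for illustration. For each thread of a process it reports the core it last ran on and the set of cores it is allowed to run on.

```python
import os


def parse_last_cpu(stat_line):
    """Return the CPU a task last ran on, from one /proc/<pid>/task/<tid>/stat line."""
    # The command name (field 2) is parenthesised and may contain spaces,
    # so only the text after the final ')' is split on whitespace.
    fields = stat_line.rsplit(")", 1)[1].split()
    # fields[0] is field 3 (state), so field 39 ('processor') is at index 36.
    return int(fields[39 - 3])


def thread_placement(pid):
    """Map each thread ID of `pid` to (last CPU, set of allowed CPUs)."""
    placement = {}
    for tid in sorted(int(t) for t in os.listdir(f"/proc/{pid}/task")):
        with open(f"/proc/{pid}/task/{tid}/stat") as f:
            last_cpu = parse_last_cpu(f.read())
        # On Linux the affinity call accepts a thread ID as well as a PID.
        placement[tid] = (last_cpu, os.sched_getaffinity(tid))
    return placement


if __name__ == "__main__":
    # Demonstration on the current process; in practice, pass the PID of
    # one of the running instances (wrk) instead of os.getpid().
    for tid, (cpu, allowed) in thread_placement(os.getpid()).items():
        print(f"tid {tid}: last ran on cpu {cpu}, allowed on {sorted(allowed)}")
```

If every thread of an instance reports the same single allowed core, the threads are being bound there; if the allowed set is wide but the last-used core is the same, the placement is coming from the scheduler rather than from an explicit binding.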


AMD staff recommended your compiler to me for best results, and we would obviously love to publish the best results with it.

Do you have any advice?