Hi! I use GetDP with MUMPS (4.10) and ACML (5.3.1) instead of another BLAS (ATLAS, OpenBLAS). I use the "gfortran64_fma4_mp" version of ACML. When I set the environment variable OMP_NUM_THREADS=4 (or more) on a system with an FX-6300, or OMP_NUM_THREADS=3 (or 4) on an A8-5600K, computing performance drops dramatically. What's wrong? Or is this normal?
The ACML 5.3 series added new threading logic that appears to bind threads suboptimally on certain systems. Can you try this experiment with an older release, such as 4.4 from the archive page?
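
To make the comparison between releases repeatable, something like the following could time the same run at several OMP_NUM_THREADS values. This is only a minimal sketch: the helper name `time_with_threads` is made up for illustration, and the solver command shown in the comment is a placeholder, not a real GetDP invocation. It assumes the solver can be started from the command line and reads OMP_NUM_THREADS from its environment.

```python
import os
import subprocess
import time


def time_with_threads(cmd, thread_counts, repeats=1):
    """Run cmd once per thread count; return {threads: best wall time in seconds}.

    cmd is an argument list such as ["my_solver", "case_file"] (hypothetical).
    """
    timings = {}
    for n in thread_counts:
        # Copy the current environment and override only OMP_NUM_THREADS.
        env = dict(os.environ, OMP_NUM_THREADS=str(n))
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            subprocess.run(cmd, env=env, check=True)
            best = min(best, time.perf_counter() - start)
        timings[n] = best
    return timings


# Example call (replace the placeholder command with your own solver run):
# for n, t in time_with_threads(["my_solver", "case_file"], [1, 2, 3, 4, 6]).items():
#     print(f"OMP_NUM_THREADS={n}: {t:.2f} s")
```

Running it once with the program linked against ACML 5.3.1 and once against 4.4 should show whether the slowdown at the higher thread counts is specific to the 5.3 series.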