ACML 5.1: same speed for zgemm with and without FMA4?

Question asked by andersartig on Feb 13, 2012
We are using a program that makes lots of zgemm calls (up to 80% of the program's running time).


I'm testing the program on Bulldozer CPUs with ACML 5.1, once with FMA4 and once without FMA4.

Both versions were compiled with Intel Fortran Compiler 12.


I cannot find any major difference in the runtimes. Is this expected? Or should FMA4 speed up zgemm?


