ACML speed concerns

Discussion created by on Jul 30, 2007
Latest reply on Aug 9, 2007 by chipf
I have a simple C code that solves the Navier-Stokes equations on a 1x1 grid using finite elements. It uses the LAPACK routine {S,D}GESV to solve a linear system of equations. I compiled it against the ACML library, and I'm finding it to be twice as slow as the reference LAPACK/BLAS implementation.

I've tried both the single- and double-precision routines, with both the 32-bit ACML (compiled against gcc 4.1 on an Athlon 64 X2 5200+) and the 64-bit ACML (compiled against gcc 3.4 on an Opteron 252). In every instance the reference implementation is at least twice as fast.

I'm just using ACML as a drop-in replacement. Is there something else I should be doing?
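
For anyone trying to reproduce a comparison like this, a minimal sketch of a timing harness that isolates the solve step might look like the following. Everything in it is an assumption for illustration: `solve_dense` is a hypothetical stand-in (plain Gaussian elimination with partial pivoting on a column-major matrix), not ACML or reference LAPACK code, and `time_solve` and the test matrix are made up for the sketch. In a real test the call to `solve_dense` would be swapped for {S,D}GESV from whichever library is linked.

```c
#include <stdlib.h>
#include <time.h>

static double abs_val(double v) { return v < 0.0 ? -v : v; }

/* Hypothetical stand-in for a DGESV-style solve: solves A x = b in place.
   A is n x n in column-major order and is overwritten; b is overwritten
   with x. Returns 0 on success, 1 if a zero pivot is encountered. */
static int solve_dense(int n, double *a, double *b)
{
    for (int k = 0; k < n; k++) {
        /* Partial pivoting: pick the largest entry in column k. */
        int p = k;
        for (int i = k + 1; i < n; i++)
            if (abs_val(a[i + k * n]) > abs_val(a[p + k * n]))
                p = i;
        if (a[p + k * n] == 0.0)
            return 1;
        if (p != k) {
            for (int j = 0; j < n; j++) {
                double t = a[k + j * n];
                a[k + j * n] = a[p + j * n];
                a[p + j * n] = t;
            }
            double t = b[k];
            b[k] = b[p];
            b[p] = t;
        }
        /* Eliminate the entries below the pivot. */
        for (int i = k + 1; i < n; i++) {
            double m = a[i + k * n] / a[k + k * n];
            for (int j = k; j < n; j++)
                a[i + j * n] -= m * a[k + j * n];
            b[i] -= m * b[k];
        }
    }
    /* Back substitution on the upper-triangular system. */
    for (int i = n - 1; i >= 0; i--) {
        double s = b[i];
        for (int j = i + 1; j < n; j++)
            s -= a[i + j * n] * b[j];
        b[i] = s / a[i + i * n];
    }
    return 0;
}

/* Builds a diagonally dominant n x n system whose exact solution is all
   ones, times only the solve call, and stores the largest deviation from
   1.0 in *max_err. Returns CPU seconds, or -1.0 on failure. */
static double time_solve(int n, double *max_err)
{
    double *a = malloc((size_t)n * n * sizeof *a);
    double *b = malloc((size_t)n * sizeof *b);
    if (a == NULL || b == NULL) {
        free(a);
        free(b);
        return -1.0;
    }
    for (int j = 0; j < n; j++)
        for (int i = 0; i < n; i++)
            a[i + j * n] = (i == j) ? 2.0 * n : 1.0;
    for (int i = 0; i < n; i++)
        b[i] = 3.0 * n - 1.0;   /* row sum, so x = (1, ..., 1) */

    clock_t t0 = clock();
    int info = solve_dense(n, a, b);
    clock_t t1 = clock();

    *max_err = 0.0;
    for (int i = 0; i < n; i++) {
        double e = abs_val(b[i] - 1.0);
        if (e > *max_err)
            *max_err = e;
    }
    free(a);
    free(b);
    return info == 0 ? (double)(t1 - t0) / CLOCKS_PER_SEC : -1.0;
}
```

Calling `time_solve(n, &err)` for a range of sizes, once per library build, would show whether the gap is the same at every problem size or only appears for small systems. For very small `n` a single solve can fall below the resolution of `clock()`, so repeating the solve in a loop and averaging is probably needed there.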