I am seeing some interesting behavior on the new Intel chips versus Barcelonas, with both MKL and ACML, for the DGEMV kernel.
Problem size is 1010, OpenMP on 8 cores.
CPU       MFlop/s   BLAS lib
opt2356   838       ACML 4.1
E5530     4435      ACML 4.1
opt2356   858       MKL
E5530     3743      MKL
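
For anyone wanting to reproduce figures like these, here is a rough sketch of how an MFlop/s number for DGEMV can be derived from a timing. It assumes the usual count of 2*m*n flops for y = A*x and uses a plain-Python reference product, not ACML or MKL, so the rate it prints says nothing about either library. The function names are made up for illustration.

```python
import time

def dgemv_flops(m, n):
    """Flop count for y = A*x with an m-by-n matrix (assumed: one multiply and one add per element)."""
    return 2 * m * n

def naive_dgemv(a, x):
    """Reference matrix-vector product on plain Python lists (not a tuned BLAS)."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]

def measure_mflops(n, repeats=3):
    """Time naive_dgemv on an n-by-n problem and return the best rate in MFlop/s."""
    a = [[1.0] * n for _ in range(n)]
    x = [1.0] * n
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        naive_dgemv(a, x)
        best = min(best, time.perf_counter() - start)
    return dgemv_flops(n, n) / best / 1e6

if __name__ == "__main__":
    # 1010 matches the problem size quoted in the post.
    print(f"{measure_mflops(1010):.1f} MFlop/s (pure Python, illustration only)")
```

The same formula applies to a timing taken around a real dgemv() call; only the timed function changes.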
The strange thing is that the DGEMM() and DDOT() kernels run at about the same speed on both systems, with both BLAS libraries. ACML has issues with dgemm() on the Intel and MKL has issues with dgemm() on the AMD, no surprise.
I expected the triple-channel memory bandwidth of the Intel to show a 50% improvement in ddot() and similar kernels, but I am not seeing one.
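
The 50% expectation can be written out as simple arithmetic. This is a sketch under stated assumptions: ddot is treated as purely bandwidth bound (2 flops per 16 bytes read), per-channel speed is taken as equal on both platforms, and the 10 GB/s figure is a hypothetical placeholder, not a measurement. If the vectors really have length 1010, the two operands total only about 16 KB, so they may be served from cache rather than main memory, in which case the channel count would not be the limiting factor.

```python
def ddot_bytes(n, word=8):
    """Bytes read by one ddot of length n: two vectors of 8-byte doubles."""
    return 2 * n * word

def ddot_flops(n):
    """Flops in one ddot of length n: one multiply and one add per element."""
    return 2 * n

def bandwidth_bound_mflops(bandwidth_gb_s, n=1010):
    """Upper bound on ddot MFlop/s if every operand streams from memory (assumption)."""
    intensity = ddot_flops(n) / ddot_bytes(n)  # flops per byte, 0.125 for doubles
    return bandwidth_gb_s * 1e9 * intensity / 1e6

def channel_speedup(new_channels, old_channels):
    """Ideal speedup from extra memory channels at equal per-channel speed (assumption)."""
    return new_channels / old_channels

if __name__ == "__main__":
    print("working set for n=1010:", ddot_bytes(1010), "bytes")
    print("ideal triple vs dual channel speedup:", channel_speedup(3, 2))
    # 10.0 GB/s is a hypothetical bandwidth used only to show the formula.
    print("bound at 10 GB/s:", bandwidth_bound_mflops(10.0), "MFlop/s")
```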
I do like the improved DGEMV() performance of the new Intel platform, and I wish I had tested it on a Shanghai. I also like how ACML is getting the same performance bumps in DGEMV() as MKL. Portability is nice, I must say.
Any comments would be appreciated.