4 Replies Latest reply on Mar 28, 2013 10:00 AM by steabert

    Issue with ACML dcopy on AMD Opteron 6276




      I've been having an odd problem with the dcopy function in the ACML BLAS library. When copying a large number of elements (~10**9), dcopy fails partway through the copy. I tested this with the included small Fortran program. The odd thing is that everything works fine with my system BLAS, whereas MKL fails at exactly the same point as ACML. When I run the code on an Intel processor, it also runs fine.


      I used ACML 5.2.0 on an AMD Opteron 6276 running Linux 3.2.0-23 x86_64, and compiled as follows: gfortran test_dcopy.f /opt/acml5.2.0/gfortran64/lib/libacml.a


      kind regards,


        • Issue with ACML dcopy on AMD Opteron 6276

          Update: the problem seems to have been introduced in version 4.3.0. The new features section for 4.3.0 mentions: "Level 1 BLAS routines have been tuned for AMD Istanbul processors. Routines affected include xDOT, xCOPY, xAXPY, and xSCAL routines."


          My small test case works with dcopy from ACML 4.2.0 but fails with 4.3.0.

            • Re: Issue with ACML dcopy on AMD Opteron 6276

              I think this will be considered a bug. The problem is caused by an intermediate computation that multiplies N by the element size. Whenever this product overflows a 32-bit integer, the routine fails. The problem affects all of the Level 1 copy routines, at different values of N depending on the size of an array element.
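
To make the arithmetic concrete, here is a minimal C sketch of the overflow condition described above. It assumes the intermediate is a signed 32-bit byte count, which the thread does not state explicitly. The function names are hypothetical and are not part of ACML.

```c
#include <assert.h>
#include <stdint.h>

/* Largest N for which N * elem_size still fits in a signed 32-bit integer.
   Assumption: the library's intermediate value is a signed 32-bit byte count. */
int64_t max_safe_n(int64_t elem_size)
{
    assert(elem_size > 0);
    return INT32_MAX / elem_size;
}

/* Returns 1 if the byte count n * elem_size exceeds the signed 32-bit range,
   0 otherwise. The product is formed in 64-bit arithmetic so that the check
   itself cannot overflow for realistic array sizes. */
int byte_count_overflows(int64_t n, int64_t elem_size)
{
    assert(n >= 0 && elem_size > 0);
    return n * elem_size > (int64_t)INT32_MAX;
}
```

Under that assumption, max_safe_n(8) gives 268435455 for 8-byte double precision elements, well below the ~10**9 elements in the original report. The corresponding limits would be 536870911 for 4-byte elements and 134217727 for 16-byte elements, which matches the remark that each copy routine fails at a different N.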


              The test does work if you build and link with the 64-bit integer library, which may be considered a workaround.
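
Another possible user-side approach, not proposed in this thread, is to split one large copy into several smaller calls so that each call's N times the element size stays within 32 bits. The C sketch below illustrates the idea only. The callback type copy_fn and the helpers plain_copy and chunked_copy are hypothetical stand-ins, not the ACML interface, and a real call to dcopy would also pass the stride arguments.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback standing in for a unit-stride, BLAS-style copy. */
typedef void (*copy_fn)(int n, const double *x, double *y);

/* Reference callback: a plain element-by-element copy loop. */
void plain_copy(int n, const double *x, double *y)
{
    for (int i = 0; i < n; i++) {
        y[i] = x[i];
    }
}

/* Copy n doubles from x to y using calls of at most max_chunk elements each,
   so that no single call receives a length above the chosen limit. */
void chunked_copy(size_t n, const double *x, double *y,
                  int max_chunk, copy_fn copy)
{
    assert(max_chunk > 0);
    size_t done = 0;
    while (done < n) {
        size_t left = n - done;
        int len = left < (size_t)max_chunk ? (int)left : max_chunk;
        copy(len, x + done, y + done);   /* offsets are computed in size_t */
        done += (size_t)len;
    }
}
```

The pointer offsets are computed by the caller in size_t, so only the per-call length has to fit the 32-bit limit. A chunk size such as 2**27 elements would keep an 8-byte byte count at 2**30, assuming the signed 32-bit limit discussed above.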


              For all of the BLAS routines, at some point arrays become too large for 32-bit address computation, and it is necessary to use the 64-bit integer libraries. We can change the copy routines to raise the size at which this occurs. As you point out, it used to work!


              This will be resolved in our upcoming 5.3 release.
