2 Replies Latest reply on Jan 31, 2008 8:22 PM by perrin@msli.com

    ACML error in ZGELSD

    perrin@msli.com
      ZGELSD does not work for m,n > 60; ZGELSS works fine

      When trying to use ZGELSD (SVD least squares, divide and conquer), I get errors for all problem sizes with m > 60:
      ** ACML error: on entry to DLASD8 parameter number 2 had an illegal value

      However, when I call ZGELSS (SVD least squares) from the exact same C code, it works fine (I've tried up to m=1000 and compared the results against MATLAB; everything matches to ~1e-15).

      I'm using ACML 4.0.1 gfortran64_mp_int64 and gcc 4.2.2 on Linux (quad opteron)

      I've compiled all the code in the examples and performance directories, and those report no errors, and the OpenMP scaling is working well.

      Are there any known issues with ZGELSD?

      Perrin Meyer
      perrin@MSLI.com
        • ACML error in ZGELSD
          chipf
          We tried a simple test with m > 60 and we can't duplicate the error return. We will need more details about how the application is calling ZGELSD, including all of the parameters. In the worst case, we might need the same data. Can you extract a repro case and post it? Or if you prefer you can email it to the ACML technical support email.
          • ACML error in ZGELSD
            perrin@msli.com
            Did you try calling it from C99 code?

            Here is how I compile the code, the error message, and the C99 code itself. This code works fine using AMD ZGELSS, NETLIB LAPACK ZGELSD, and Intel MKL ZGELSD.
            Thanks,

            Perrin


            /home/perrin/gcc-4.2.2/install/bin/gcc -Wall -std=c99 -O2 -m64 -march=native -fopenmp --save-temps -o lsd_for_amd lsdforamd.c /home/perrin/LAPACK/AMD/ACML4/gfortran64_mp_int64/lib/libacml_mp.a -Xlinker -rpath -Xlinker /home/perrin/gcc-4.2.2/install/lib64/ -lgomp -lgfortran -lm

            [perrin@whisper LAPACK_AMD_TEST]$ ./lsd_for_amd
            *** glibc detected *** ./lsd_for_amd: double free or corruption (out): 0x00002aaaab454010 ***
            ======= Backtrace: =========
            /lib64/libc.so.6[0x35c866a94e]
            /lib64/libc.so.6(__libc_free+0x6e)[0x35c866ae7e]
            ./lsd_for_amd[0x401bdd]
            ./lsd_for_amd[0x4018ca]
            /lib64/libc.so.6(__libc_start_main+0xdc)[0x35c861c4cc]
            ./lsd_for_amd[0x4016d9]
            ======= Memory map: ========
            00400000-00a30000 r-xp 00000000 08:0a 16663171 /home/perrin/GIT/MAPP3DFT/soundfield/LAPACK_AMD_TEST/lsd_for_amd
            00b30000-00b31000 rw-p 00630000 08:0a 16663171 /home/perrin/GIT/MAPP3DFT/soundfield/LAPACK_AMD_TEST/lsd_for_amd
            00b31000-00b8a000 rw-p 00b31000 00:00 0 [heap]
            40000000-40001000 ---p 40000000 00:00 0
            40001000-40a01000 rw-p 40001000 00:00 0
            40a01000-40a02000 ---p 40a01000 00:00 0
            40a02000-41402000 rw-p 40a02000 00:00 0
            41402000-41403000 ---p 41402000 00:00 0
            41403000-41e03000 rw-p 41403000 00:00 0
            35c7900000-35c791a000 r-xp 00000000 08:05 33198 /lib64/ld-2.3.5.so
            35c7a19000-35c7a1a000 r--p 00019000 08:05 33198 /lib64/ld-2.3.5.so
            35c7a1a000-35c7a1b000 rw-p 0001a000 08:05 33198 /lib64/ld-2.3.5.so
            35c8600000-35c872e000 r-xp 00000000 08:05 33208 /lib64/libc-2.3.5.so
            35c872e000-35c882d000 ---p 0012e000 08:05 33208 /lib64/libc-2.3.5.so
            35c882d000-35c8831000 r--p 0012d000 08:05 33208 /lib64/libc-2.3.5.so
            35c8831000-35c8833000 rw-p 00131000 08:05 33208 /lib64/libc-2.3.5.so
            35c8833000-35c8837000 rw-p 35c8833000 00:00 0
            35c8900000-35c8983000 r-xp 00000000 08:05 33226 /lib64/libm-2.3.5.so
            35c8983000-35c8a83000 ---p 00083000 08:05 33226 /lib64/libm-2.3.5.so
            35c8a83000-35c8a84000 r--p 00083000 08:05 33226 /lib64/libm-2.3.5.so
            35c8a84000-35c8a85000 rw-p 00084000 08:05 33226 /lib64/libm-2.3.5.so
            35c9100000-35c910f000 r-xp 00000000 08:05 33246 /lib64/libpthread-2.3.5.so
            35c910f000-35c920f000 ---p 0000f000 08:05 33246 /lib64/libpthread-2.3.5.so
            35c920f000-35c9210000 r--p 0000f000 08:05 33246 /lib64/libpthread-2.3.5.so
            35c9210000-35c9211000 rw-p 00010000 08:05 33246 /lib64/libpthread-2.3.5.so
            35c9211000-35c9215000 rw-p 35c9211000 00:00 0
            35cd800000-35cd809000 r-xp 00000000 08:05 33248 /lib64/librt-2.3.5.so
            35cd809000-35cd908000 ---p 00009000 08:05 33248 /lib64/librt-2.3.5.so
            35cd908000-35cd909000 r--p 00008000 08:05 33248 /lib64/librt-2.3.5.so
            35cd909000-35cd90a000 rw-p 00009000 08:05 33248 /lib64/librt-2.3.5.so
            35cd90a000-35cd91a000 rw-p 35cd90a000 00:00 0
            2aaaaaaab000-2aaaaaaad000 rw-p 2aaaaaaab000 00:00 0
            2aaaaaaad000-2aaaaaab4000 r-xp 00000000 08:0a 16549079 /home/perrin/gcc-4.2.2/install/lib64/libgomp.so.1.0.0
            2aaaaaab4000-2aaaaabb3000 ---p 00007000 08:0a 16549079 /home/perrin/gcc-4.2.2/install/lib64/libgomp.so.1.0.0
            2aaaaabb3000-2aaaaabb4000 rw-p 00006000 08:0a 16549079 /home/perrin/gcc-4.2.2/install/lib64/libgomp.so.1.0.0
            2aaaaabb4000-2aaaaac6d000 r-xp 00000000 08:0a 16549074 /home/perrin/gcc-4.2.2/install/lib64/libgfortran.so.2.0.0
            2aaaaac6d000-2aaaaad6c000 ---p 000b9000 08:0a 16549074 /home/perrin/gcc-4.2.2/install/lib64/libgfortran.so.2.0.0
            2aaaaad6c000-2aaaaad6e000 rw-p 000b8000 08:0a 16549074 /home/perrin/gcc-4.2.2/install/lib64/libgfortran.so.2.0.0
            2aaaaada1000-2aaaab4fc000 rw-p 2aaaaada1000 00:00 0
            2aaaab500000-2aaaab521000 rw-p 2aaaab500000 00:00 0
            2aaaab521000-2aaaab600000 ---p 2aaaab521000 00:00 0
            2aaaab600000-2aaaab621000 rw-p 2aaaab600000 00:00 0
            2aaaab621000-2aaaab700000 ---p 2aaaab621000 00:00 0
            2aaaab700000-2aaaab70d000 r-xp 00000000 08:0a 16549048 /home/perrin/gcc-4.2.2/install/lib64/libgcc_s.so.1
            2aaaab70d000-2aaaab80c000 ---p 0000d000 08:0a 16549048 /home/perrin/gcc-4.2.2/install/lib64/libgcc_s.so.1
            2aaaab80c000-2aaaab80d000 rw-p 0000c000 08:0a 16549048 /home/perrin/gcc-4.2.2/install/lib64/libgcc_s.so.1
            7fffffc9d000-7fffffcba000 rw-p 7fffffc9d000 00:00 0 [stack]
            ffffffffff600000-ffffffffffe00000 ---p 00000000 00:00 0 [vdso]
            Aborted

            --------------------------------------------------------------C99 code -----------------------------------------------------------------
            #include <stdio.h>
            #include <stdlib.h>
            #include <math.h>
            #include <complex.h>

            /* #include "acml.h" */
            /* rather than include this, I'm cherry-picking and changing the function prototype to C99 _Complex so -Wall will shut up */

            /* AMD ACML seems to have issues with zgelsd - zgelss works fine, so far */
            extern void zgelsd(int m, int n, int nrhs, double _Complex *a, int lda, double _Complex *b, int ldb, double *s, double rcond, int *rank, int *info);



            int main(void) {

                int i,j;

                int m = 700;
                int n = 600;

                /* m-by-n matrix, column-major, 1-based access through A(i,j) */
                double _Complex * restrict A;
                A = malloc(m * n * sizeof(double _Complex));
                #define A(itok,jtok) A[ ( ((itok)-1) + ( ((jtok)-1)*m ))]

                /* right-hand side on entry; the first n entries hold the solution on exit */
                double _Complex * restrict b;
                b = malloc(m * sizeof(double _Complex));
                #define b(itok) b[ ( (itok)-1) ]

                for ( i=1 ; i<=m ; i++ ) {
                    for ( j=1 ; j<=n ; j++ ) {
                        A(i,j) = ((double)rand() / (double)RAND_MAX) + I * ((double)rand() / (double)RAND_MAX);
                    }
                }

                for ( i=1 ; i<=m ; i++ ) {
                    b(i) = ((double)rand() / (double)RAND_MAX) + I * ((double)rand() / (double)RAND_MAX);
                }

                /* singular values of A */
                double * restrict s;
                s = malloc(n * sizeof(double));
                #define s(itok) s[ ( (itok)-1) ]

                int rank;
                int info;

                // call LAPACK least-squares by SVD
                zgelsd(m,n,1,A,m,b,m,s,-1.0,&rank,&info);

                /* copy the solution out of b */
                double _Complex * restrict x;
                x = malloc(n * sizeof(double _Complex));
                #define x(itok) x[ ( (itok)-1) ]

                for ( j=1 ; j<=n ; j++ ) {
                    x(j) = b(j);
                }

                return 0;
            }
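The `A(i,j)` macro in the listing maps Fortran-style 1-based indices onto a flat column-major buffer whose leading dimension is m, which is why m is also passed as `lda` and `ldb`. A minimal sketch of the same mapping written as a function (the name `colmajor_index` is made up for illustration and is not part of ACML):

```c
#include <assert.h>
#include <stddef.h>

/* Offset of the 1-based element (i, j) in a column-major array with
   leading dimension lda, i.e. the number of rows stored per column.
   Mirrors the A(i,j) macro: (i-1) + (j-1)*lda. */
size_t colmajor_index(size_t i, size_t j, size_t lda)
{
    assert(i >= 1 && i <= lda && j >= 1);
    return (i - 1) + (j - 1) * lda;
}
```

With m = 700 and n = 600 as in the listing, element (1,2) sits at offset 700, directly after the 700 entries of the first column, and element (700,600) is the last entry of the buffer.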