1 Reply Latest reply on Nov 3, 2009 12:16 PM by chipf

    acml 4.3.0 + openmp = Segmentation fault (core dumped)

      gfortran, ifort

      I use gfortran 4.4.1 (and ifort 11.0) with ACML 4.3.0 on Ubuntu 9.10 32-bit (and Scientific Linux 5.3 32-bit). I compile this code:



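            program eigenlapack
            PARAMETER        (n=1000)
            PARAMETER        (lwork=8*n)
            integer           m, info, iwork(5*n), IL, IU, ifail(n),i,j
            REAL              w(n), work(lwork), a(n,n), VL, VU, z(n,n)

      !     IL, IU (and VL, VU) are not referenced when RANGE = 'A'
            IL = 1
            IU = 5

            do 10 i=1,n
              do 20 j=1,n
         20 continue
         10 continue

      !     default(firstprivate) gives every thread its own copy of
      !     each variable referenced below, including the n-by-n arrays
      !$omp parallel default(firstprivate)
      !     ABSTOL is REAL for SSYEVX, so pass a single-precision zero
            CALL SSYEVX ('N', 'A',  'U', n, a, n, VL, VU, IL, IU, 0.0e0,
           1              m, w, z, n, work, lwork,  iwork, ifail, info)

            write(*,'(3x,a)') 'Number of eigenvalues:'
            write(*,'(3x,I100)') m
            write(*,'(3x,a)') 'Eigenvalues:'
            write(11,'(10000f12.2)') (w(i),i=1,n)
      !$omp end parallel
            end program eigenlapack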




      If I set "export OMP_NUM_THREADS=1", it works correctly. But if I use more than one thread (OMP_NUM_THREADS=2), it fails:


      Segmentation fault (core dumped)


      It works for n = 416 or less.

      The serial version of the code works correctly.



      What can I do?

        • acml 4.3.0 + openmp = Segmentation fault (core dumped)

          The bug is easily reproduced.  It also happens on a 64-bit OS, and when linking with either the single-threaded or the OpenMP ACML library. I used ACML 4.3.0 gfortran64 and gcc/gfortran 4.4.1 to reproduce the problem.

          GDB quickly reveals that the segfault occurs at the OpenMP parallel directive, which immediately looks like a stack size issue.  A quick Google search turned up similar issues.

          I resolved the problem like this:  Change the declaration for the REAL variables to:

          REAL,save::     w(n), work(lwork), a(n,n), VL, VU, z(n,n)

          I think this moves the larger arrays off the stack and into static storage.  You could also solve the problem by increasing the available stack size.

          Such a simple change, but it fixed the problem for me.
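
          For the stack-size route, a minimal sketch follows. It assumes a POSIX shell and an OpenMP runtime that honours the OpenMP 3.0 environment variable OMP_STACKSIZE; the executable name ./eigenlapack and the 64 MB figure are illustrative assumptions, not values from this thread. For scale, a and z each hold n*n = 1,000,000 single-precision values at n = 1000, roughly 4 MB apiece.

```shell
#!/bin/sh
# Sketch: run the reproducer with larger stacks.
# PROG is a hypothetical name for the compiled executable, and 64 MB is an
# illustrative size rather than a tuned value.
PROG=${PROG:-./eigenlapack}

# Stack limit for the initial thread, in kilobytes. The request can be
# refused if it exceeds the hard limit, so report that and carry on.
ulimit -s 65536 || echo "could not raise the stack limit" >&2

# Stack size for the threads created by the OpenMP runtime (OpenMP 3.0).
export OMP_STACKSIZE=64M
export OMP_NUM_THREADS=2

# Run the program only if it has actually been built.
if [ -x "$PROG" ]; then
    "$PROG"
else
    echo "executable $PROG not found" >&2
fi
```

          ulimit -s governs the initial thread, while OMP_STACKSIZE applies to the threads the OpenMP runtime creates, so both may need raising.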