3 Replies Latest reply on Mar 28, 2012 12:06 AM by jkmcrobe

    segmentation fault with multithreaded library

    jkmcrobe

      When I compile the ACML LAPACK examples and link against the single-threaded library (libacml), everything works great.  But when I build the examples against the multithreaded library (libacml_mp), I get a segmentation fault after the program ends.  For example, when I compile the dgesdd example using the following:

       

      gcc -m64 -I/home/james/acml5.1.0/gfortran64/include dgesdd_c_example.c -L/home/james/acml5.1.0/gfortran64/lib -lacml -lgfortran -static -lm

       

      and run a.out, I get this:

       

      ACML example: SVD of a matrix A using dgesdd
      --------------------------------------------

      Matrix A:
      -0.5700  -1.2800  -0.3900   0.2500
      -1.9300   1.0800  -0.3100  -2.1400
       2.3000   0.2400   0.4000  -0.3500
      -1.9300   0.6400  -0.6600   0.0800

      Singular values of matrix A:
       3.9147   2.2959   1.1184   0.3237

       

      But when I compile dgesdd using the multithreaded library, like this:

       

      gcc -fopenmp -m64 -I/home/james/acml5.1.0/gfortran64_mp/include dgesdd_c_example.c -L/home/james/acml5.1.0/gfortran64_mp/lib -lacml_mp -lgfortran -static -lm

       

      and run a.out I get:

       

      ACML example: SVD of a matrix A using dgesdd
      --------------------------------------------

      Matrix A:
      -0.5700  -1.2800  -0.3900   0.2500
      -1.9300   1.0800  -0.3100  -2.1400
       2.3000   0.2400   0.4000  -0.3500
      -1.9300   0.6400  -0.6600   0.0800

      Singular values of matrix A:
       3.9147   2.2959   1.1184   0.3237
      Segmentation fault

       

      The program prints the correct answer, then segfaults on exit.  My system:

       

      OS: Ubuntu 11.10 64-bit

      CPU: Intel Core i7-2600

      gcc (Ubuntu/Linaro 4.6.1-9ubuntu3) 4.6.1

      GNU Fortran (Ubuntu/Linaro 4.6.1-9ubuntu3) 4.6.1

        • Re: segmentation fault with multithreaded library
          chipf

          It looks like this is easy to duplicate.  The examples use dynamic linking, whereas you are forcing static linking with -static.  With static linking, the segfault occurs.  To verify that you are seeing the same issue, run the example under gdb and, when it reports the segfault, use the bt command to look at the stack trace.  On my system, this shows a jump to address zero, with calls to destroy_unit_mutex and close_unit_1 on the stack.
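The gdb check described above can also be run non-interactively. This is only a sketch: it assumes the statically linked binary is ./a.out in the current directory, and backtrace_on_crash is a hypothetical helper name, not part of ACML or gdb. The -batch and -ex options are standard gdb command-line options.

```shell
# Hypothetical helper: run a binary under gdb in batch mode and print the
# backtrace once it stops (here, on the SIGSEGV raised during exit).
backtrace_on_crash() {
    gdb -batch -ex run -ex bt "$1"
}

# Only attempt it when the statically linked a.out is actually present.
if [ -x ./a.out ]; then
    backtrace_on_crash ./a.out
fi
```

If the trace shows destroy_unit_mutex and close_unit_1 above a frame at address zero, it is most likely the same issue.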

           

          A quick workaround is to remove -static.  With dynamic linking, you'll need to make sure LD_LIBRARY_PATH includes the ACML lib directory so that the shared library is found at run time.
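As a rough sketch of that workaround, the following reuses the multithreaded compile line from the original post with -static removed. The install prefix is taken from that post and is an assumption about your system; ACML_MP is just an illustrative variable name.

```shell
# Assumption: ACML is unpacked under this prefix, as in the original post.
ACML_MP=/home/james/acml5.1.0/gfortran64_mp

# Let the runtime loader find libacml_mp.so; keep any existing entries.
export LD_LIBRARY_PATH="$ACML_MP/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"

# Same compile line as in the original post, minus -static.
if [ -f dgesdd_c_example.c ]; then
    gcc -fopenmp -m64 -I"$ACML_MP/include" dgesdd_c_example.c \
        -L"$ACML_MP/lib" -lacml_mp -lgfortran -lm

    # Check which ACML shared library the loader resolves, then run.
    ldd ./a.out | grep acml
    ./a.out
fi
```

If the ldd line prints "not found" for libacml_mp.so, LD_LIBRARY_PATH is not pointing at the right directory.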

           

          We will have to debug this issue with static linking.

            • Re: segmentation fault with multithreaded library
              jkmcrobe

              Sweet, thanks, I'll give that a shot.

              • Re: segmentation fault with multithreaded library
                jkmcrobe

                I get the same.  From gdb:

                 

                [Thread debugging using libthread_db enabled]
                ACML example: SVD of a matrix A using dgesdd
                --------------------------------------------

                Matrix A:
                -0.5700  -1.2800  -0.3900   0.2500
                -1.9300   1.0800  -0.3100  -2.1400
                 2.3000   0.2400   0.4000  -0.3500
                -1.9300   0.6400  -0.6600   0.0800
                [New Thread 0x7ffff7ffc700 (LWP 2091)]
                [New Thread 0x7ffff77fb700 (LWP 2092)]
                [New Thread 0x7ffff6ffa700 (LWP 2093)]
                [New Thread 0x7ffff67f9700 (LWP 2094)]
                [New Thread 0x7ffff5ff8700 (LWP 2095)]
                [New Thread 0x7ffff57f7700 (LWP 2096)]
                [New Thread 0x7ffff4ff6700 (LWP 2097)]

                Singular values of matrix A:
                 3.9147   2.2959   1.1184   0.3237

                Program received signal SIGSEGV, Segmentation fault.
                0x0000000000000000 in ?? ()
                (gdb) bt
                #0  0x0000000000000000 in ?? ()
                #1  0x00000000005bff0a in destroy_unit_mutex ()
                #2  0x00000000005c0d40 in close_unit_1 ()
                #3  0x00000000005c0e0a in _gfortrani_close_units ()
                #4  0x0000000000401729 in cleanup ()
                #5  0x00000000005ead77 in __libc_csu_fini ()
                #6  0x00000000005f22a1 in __run_exit_handlers ()
                #7  0x00000000005f2323 in exit ()
                #8  0x00000000005ea561 in __libc_start_main ()
                #9  0x0000000000402631 in _start ()

                 

                Thanks for your help.  I'll just link dynamically.
