3 Replies Latest reply on Jul 11, 2012 2:36 AM by yurtesen

    hit crash: semaphore.cpp:87: sem_wait() failed



           Currently I am trying to convert a project from CUDA to OpenCL. I am using AMD APP SDK 2.7 on a multi-core machine.

      Most unit test cases work fine, but recently, when running some long benchmarks, I keep hitting the crash below.


      ../../../thread/semaphore.cpp:87: sem_wait() failed

        0x00002b0618a8b265 in raise () from /lib64/libc.so.6

      #4  0x00002b0618a8cd10 in abort () from /lib64/libc.so.6

      #5  0x00002b0619a6e029 in ?? () from /remote/terascale/OpenCL/AMDAPP/AMD-APP-SDK-v2.7-RC-lnx64/lib/x86_64/libamdocl64.so

      #6  0x00002b0619a6d35b in ?? () from /remote/terascale/OpenCL/AMDAPP/AMD-APP-SDK-v2.7-RC-lnx64/lib/x86_64/libamdocl64.so

      #7  0x00002b0619a6d076 in ?? () from /remote/terascale/OpenCL/AMDAPP/AMD-APP-SDK-v2.7-RC-lnx64/lib/x86_64/libamdocl64.so

      #8  0x00002b0619a60009 in ?? () from /remote/terascale/OpenCL/AMDAPP/AMD-APP-SDK-v2.7-RC-lnx64/lib/x86_64/libamdocl64.so

      #9  0x00002b0619a3a299 in clEnqueueReadBuffer () from /remote/terascale/OpenCL/AMDAPP/AMD-APP-SDK-v2.7-RC-lnx64/lib/x86_64/libamdocl64.so


      ../../../thread/semaphore.cpp:87: sem_wait() failed

      ../../../thread/semaphore.cpp:87: sem_wait() failed

      ../../../thread/semaphore.cpp:87: sem_wait() failed

      ../../../thread/semaphore.cpp:87: sem_wait() failed

      ../../../thread/semaphore.cpp:87: sem_wait() failed

      ../../../thread/semaphore.cpp:87: sem_wait() failed

      ../../../thread/semaphore.cpp:87: sem_wait() failed

      ../../../thread/semaphore.cpp:87: sem_wait() failed

      ../../../thread/semaphore.cpp:87: sem_wait() failed



      These benchmarks mostly execute some kernel functions many times. For example, if I reduce the repeat threshold from 100000 to 20000,

      the crash does not occur. Has anyone encountered this issue before? I suspect it is an SDK issue.
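
      To narrow down where between 20000 and 100000 repetitions the failure starts, a bisection over the repeat count may help. This is only a rough sketch: it assumes the crash is deterministic for a given count (it may not be), and `./benchmark` with a `--repeat` flag is a hypothetical placeholder for the real benchmark command line.

```python
import signal
import subprocess


def smallest_failing_count(fails, lo, hi):
    """Binary search for the smallest count in (lo, hi] for which fails(count) is True.

    Assumes fails(lo) is False, fails(hi) is True, and that failure is
    monotonic in the count. These are assumptions, not known facts.
    """
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if fails(mid):
            hi = mid   # failure already happens at mid, search lower
        else:
            lo = mid   # mid still passes, search higher
    return hi


def aborts(repeat_count, cmd=("./benchmark", "--repeat")):
    """Run the (hypothetical) benchmark and report whether it died via abort()."""
    proc = subprocess.run([*cmd, str(repeat_count)])
    # On POSIX, a process killed by a signal has returncode == -signal_number.
    return proc.returncode == -signal.SIGABRT
```

      With those placeholders filled in, `smallest_failing_count(aborts, 20000, 100000)` would report the first repeat count that aborts, which is useful to include in a bug report.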



        • Re: hit crash: semaphore.cpp:87: sem_wait() failed



          Is this old thread helpful?




          Re: sem_wait() failed with APP 2.6


          Mike Delorme (Newbie)

          Google seems to have cached a copy of a discussion on this error, however I can't find it on the AMD forums either.  The topic title was APP 2.6 regression: creating queue fails with "sem_wait() failed".  It was posted on Jan. 9, 2012.  The last update that I can find reads as follows:

          I've got some further information on this: I think it is some interaction between the AMD OpenCL implementation and the NVIDIA OpenGL library. Even though I'm not doing any GL interop, libamdocl64.so is linked against libGL.so.1, and on my system that by default picks up NVIDIA's libGL.so.1. If I use LD_LIBRARY_PATH to point things at a random build of Mesa's libGL.so.1 I happened to have lying around, all is well.

          I did notice that APP 2.5 had occasional random hangs and segfaults on shutdown inside libnvidia-tls. I guess 2.6 has turned occasional conflicts into a totally reproducible conflict :-/

          Your machine wouldn't happen to have had the NVIDIA drivers installed on it at one point, would it?
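
          One way to check which libGL.so.1 the AMD runtime actually resolves is to look at the `ldd` output for libamdocl64.so. Below is a minimal Python sketch, assuming `ldd` is available on the machine; the helper names `resolved_libs` and `libgl_for` are made up for illustration.

```python
import subprocess


def resolved_libs(ldd_output, name_prefix="libGL.so"):
    """Return {soname: path} for ldd lines whose soname starts with name_prefix.

    A path of None means ldd reported the library as "not found".
    """
    found = {}
    for line in ldd_output.splitlines():
        parts = line.split()
        # Typical line: "libGL.so.1 => /usr/lib64/libGL.so.1 (0x00002b...)"
        if len(parts) >= 3 and parts[1] == "=>" and parts[0].startswith(name_prefix):
            found[parts[0]] = None if parts[2] == "not" else parts[2]
    return found


def libgl_for(library_path):
    """Run ldd on a shared library and return its libGL resolution."""
    out = subprocess.run(
        ["ldd", library_path], capture_output=True, text=True, check=True
    ).stdout
    return resolved_libs(out)
```

          Calling `libgl_for(...)` with the full path of libamdocl64.so from the backtrace should show whether libGL.so.1 comes from an NVIDIA or a Mesa location on that system.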



            • Re: hit crash: semaphore.cpp:87: sem_wait() failed

              The NVIDIA drivers are also installed on this machine,

              but the crash I encounter is not the same as: creating queue fails with "sem_wait() failed"

              and the crash can only be reproduced by long-running benchmarks that call kernel functions repeatedly.


              Actually, I also encounter the creating queue fails with "sem_wait() failed" error when I try to use gdb.

                • Re: hit crash: semaphore.cpp:87: sem_wait() failed

                  I would still put the original libGL files in place, replacing the NVIDIA-installed ones, before blaming the SDK.

                  If you use ELRepo to install the package, it will install the files in /usr/lib64/nvidia and this won't be a problem.

                  The problem is that the NVIDIA driver overwrites the /usr/lib64/libGL.* files. If you are installing the driver manually, I think you can tell it to install those files to /usr/lib64/nvidia (I think there is a command-line switch you can give to the installer).

                  Alternatively, you can delete the /usr/lib64/libGL* files and reinstall mesa-gl (which will put back the original libGL files). If you are using your NVIDIA card to drive the display, this disables 3D support, but you can still use CUDA.
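
                  Before deleting anything under /usr/lib64, the LD_LIBRARY_PATH approach quoted earlier in this thread can be tried as a non-destructive test. Here is a small Python launcher sketch; the directory holding a Mesa libGL.so.1 and the benchmark command are assumptions you would need to fill in for your own system.

```python
import os
import subprocess


def env_with_lib_dir(lib_dir, base_env=None):
    """Return a copy of the environment with lib_dir first in LD_LIBRARY_PATH."""
    env = dict(os.environ if base_env is None else base_env)
    current = env.get("LD_LIBRARY_PATH", "")
    # Prepend so the dynamic loader searches lib_dir before the existing entries.
    env["LD_LIBRARY_PATH"] = lib_dir + (os.pathsep + current if current else "")
    return env


def run_with_lib_dir(cmd, lib_dir):
    """Run cmd (a list of arguments) with lib_dir searched first; return its exit code."""
    return subprocess.run(cmd, env=env_with_lib_dir(lib_dir)).returncode
```

                  If the benchmark stops aborting when launched this way, that would point at the libGL conflict rather than at the SDK itself.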