2 Replies Latest reply on Jun 18, 2014 2:41 AM by pinform

    Is OpenCL.dll/amdocl.dll bugged?

    cadorino

      Hi,

      I'm working on a .NET abstraction layer for OpenCL programming in F# (FSCL; FSCLFramework on Twitter). I'm using the AMD APP SDK 2.9 on an A10 APU with an HD 7990.
      During heavy testing (millions of saxpy kernel runs with various input sizes and memory flags), I found that my application was crashing randomly in release mode. I investigated with windbg, and the cause turned out to be an error in critical section locking/unlocking.

      Surprisingly, windbg reports the same exception even for a very simple program that only retrieves the list of available platforms:

       

       

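      cl_uint count = 0;
      cl_platform_id plat_ids[100];

      /* First call: ask only for the number of available platforms. */
      clGetPlatformIDs(0, NULL, &count);
      /* Second call: fetch the platform IDs themselves. */
      clGetPlatformIDs(count, plat_ids, NULL);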

      Here's what windbg says:

      VERIFIER STOP 00000209: pid 0x16F8: critical section over-released or corrupted

        0504E2E0 : Critical section address
        FFFFFFFF : Lock count
        00000000 : Expected least significant bit
        05048DC0 : Critical section debug info address

      SYMBOL_NAME:  amdocl!aclWriteToMem+839218

       

      STACK_TEXT:
      0041f60c 5f2a86f3 00000209 5f293db0 0504e2e0 verifier!VerifierStopMessage+0x1f8
      0041f648 5f2a8e93 0504e2e0 00000001 0041f6bc verifier!AVrfpVerifyCriticalSectionOwner+0xca
      0041f668 6446f3a8 0504e2e0 63c349e9 76f4cee9 verifier!AVrfpRtlLeaveCriticalSection+0x6c
      WARNING: Stack unwind information not available. Following frames may be wrong.
      0041f6a0 63bf5fe6 0041f6bc 00000000 76f4cee9 amdocl!aclWriteToMem+0x839218
      0041f714 63bf32d6 00000000 63bdd7a1 00000000 amdocl!clRetainSampler+0x24f76
      0041f73c 63bc6e91 0041f770 653310b1 00000000 amdocl!clRetainSampler+0x22266
      0041f744 653310b1 00000000 00000000 0041f768 amdocl!clIcdGetPlatformIDsKHR+0x11
      0041f770 65333b1b 0041f784 00000000 00000103 OpenCL+0x10b1
      0041fb98 7758c057 6533eda8 00000000 00000000 OpenCL!clWaitForEvents+0x14b
      0041fbb4 7706d60e 6533eda8 65333a70 00000000 ntdll!RtlRunOnceExecuteOnce+0x33
      0041fbcc 65333a64 6533eda8 65333a70 00000000 kernel32!InitOnceExecuteOnce+0x17
      0041fbf4 0018197d 00000000 00000000 0041fc0c OpenCL!clWaitForEvents+0x94
      0041fda4 00181d19 00000001 05067290 0505ee20 Saxpy!main+0x6d
      0041fde4 7705338a 7efde000 0041fe30 77589f72 Saxpy!__tmainCRTStartup+0xfd
      0041fdf0 77589f72 7efde000 74755169 00000000 kernel32!BaseThreadInitThunk+0xe
      0041fe30 77589f45 00181d81 7efde000 00000000 ntdll!__RtlUserThreadStart+0x70
      0041fe48 00000000 00181d81 7efde000 00000000 ntdll!_RtlUserThreadStart+0x1b
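
      To make clear what kind of error the verifier is flagging, here is a minimal sketch of an "over-release": a lock that gets released once more than it was acquired. It is only an analogy. It uses a POSIX error-checking mutex as a stand-in for a Windows critical section, and the function name over_release_demo is hypothetical. It is not the code path inside amdocl.dll, which I cannot see.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <pthread.h>

/* Hypothetical helper: releases a lock once more than it was acquired
 * and returns the result of that extra release. An error-checking mutex
 * is used so that the extra unlock is reported rather than being
 * undefined behaviour. */
int over_release_demo(void)
{
    pthread_mutexattr_t attr;
    pthread_mutex_t lock;
    int rc;

    pthread_mutexattr_init(&attr);
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    pthread_mutex_init(&lock, &attr);
    pthread_mutexattr_destroy(&attr);

    pthread_mutex_lock(&lock);
    pthread_mutex_unlock(&lock);      /* balanced release: succeeds */
    rc = pthread_mutex_unlock(&lock); /* extra release: the "over-release" */

    pthread_mutex_destroy(&lock);
    return rc;
}
```

      With an error-checking mutex the extra unlock fails with EPERM instead of silently corrupting the lock state. The top frames of the trace (AVrfpRtlLeaveCriticalSection followed by AVrfpVerifyCriticalSectionOwner) suggest the verifier applies a comparable check when the critical section is left.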

       

      A full dump is available here: http://www.gabrielecocco.it/dumpFile.dmp (about 60 MB).

      The problem appears to be inside the OpenCL implementation itself. What do you think?

       

      Gabriele