6 Replies Latest reply on Jun 20, 2011 5:28 PM by Steveyoungs

    cl::Program::build() crashes for x86. Fine for GPU.

    CaptainN
      The OpenCL run-time build for x86 crashes with a floating-point division error.

      The project contains a collection of OpenCL kernels stored as const char * arrays. Each const char * holds one or more kernels as a C string. At run time the strings are collected into cl::Program::Sources and compiled into a program with cl::Program::build(). If the target device is the GPU, everything works fine.

      If the target device is the CPU, Program::build() crashes somewhere inside the OpenCL run-time. To narrow the problem down, I reduced it to a single kernel that compiles for the GPU but crashes when the target device is the CPU. If I move this kernel into a test environment such as the HelloOpenCl sample, it compiles for the CPU fine! Kernel Analyzer 1.8 also generates x86 assembly with no problem.

      Program::build() crashes with the following message: Unhandled exception at 0x… in project.exe: 0xC000008E: Floating-point division by zero.

      The stack trace starts at mydll!cl::Program::build(…) and descends into amdocl.dll, where it eventually crashes (no symbol names, of course, as I don't have the PDBs).

      System: HD5870, Catalyst 11.5, SDK 2.4.

      CPU: Intel Xeon (I don't think it is relevant here).

      Could you please advise how I can resolve the problem?

      P.S. I have tried trimming the kernel that crashes on the CPU (it is not a big kernel), and at some point it compiles, but by then it is useless to me. Again, it crashes only when the target device is the CPU. With the GPU as the target device everything is OK.

        • cl::Program::build() crashes for x86. Fine for GPU.
          rick.weber

          Try using this environment variable and see what happens:

          AMD_OCL_BUILD_OPTIONS_APPEND="-g -O0"

          If it still crashes, then the problem isn't in the optimization.

          Your problem is very odd, since dividing by zero is well defined in floating point (you get inf).
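The "you get inf" behaviour applies while floating-point exceptions are masked, which is the default state of the floating-point environment. The exception code 0xC000008E can only be raised when the divide-by-zero exception has been unmasked somewhere in the process. A minimal, portable sketch of the default (masked) case using the standard `<cfenv>` header follows; the function name `divide_by_zero_is_quiet_inf` is made up for illustration and is not part of any OpenCL API.

```cpp
#include <cfenv>
#include <cmath>

// Illustrative helper (not from the thread): divides 1.0f by zero in the
// default floating-point environment, where exceptions are masked.
// Returns true if the result is +inf and the divide-by-zero flag was raised.
bool divide_by_zero_is_quiet_inf() {
    std::feclearexcept(FE_ALL_EXCEPT);      // start with clean status flags
    volatile float zero = 0.0f;             // volatile: prevents constant folding
    volatile float result = 1.0f / zero;    // masked: no trap, yields +inf
    const bool flagged = std::fetestexcept(FE_DIVBYZERO) != 0;
    return std::isinf(result) && result > 0.0f && flagged;
}
```

With the exception masked, the division only sets a status flag and execution continues. A trap like the one reported here requires that flag's mask bit to have been cleared first.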

          • cl::Program::build() crashes for x86. Fine for GPU.
            CaptainN

            rick.weber !!!

            This is it! Passing these options directly to cl::Program's build, as in program.build(devices, "-g -O0"), also makes the problem disappear.
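Since the crash only occurs for CPU targets, the workaround can be limited to CPU builds so that GPU builds keep full optimization. A small sketch of that idea follows; `TargetDevice` and `build_options_for` are hypothetical names introduced here, and a real application would derive the device category from CL_DEVICE_TYPE.

```cpp
#include <string>

// Hypothetical device categories; a real application would query
// CL_DEVICE_TYPE from the OpenCL device instead.
enum class TargetDevice { Cpu, Gpu };

// Returns the option string to pass to cl::Program::build().
// "-g -O0" is the workaround reported in this thread for CPU builds;
// GPU builds keep the caller's options unchanged.
std::string build_options_for(TargetDevice target, const std::string& base = "") {
    if (target != TargetDevice::Cpu) return base;
    const std::string workaround = "-g -O0";
    return base.empty() ? workaround : base + " " + workaround;
}
```

The result would then be passed along the lines of program.build(devices, build_options_for(TargetDevice::Cpu).c_str()). Note that disabling optimization is likely to slow the CPU kernels, so this is a stopgap rather than a fix.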

            AMD, please let me know how I can help to get the problem located and fixed in the next release. I can generate a crash dump if that will help.

            Respect.

              • cl::Program::build() crashes for x86. Fine for GPU.
                himanshu.gautam

                Sorry for the late reply.

                Can you post a test case showing this issue here? You can also file a ticket.

                • cl::Program::build() crashes for x86. Fine for GPU.
                  Steveyoungs

                  CaptainN,

                  Did this problem ever get reported to, or resolved by, AMD?

                  I think I've just hit the same problem [Windows 7 x64, AMD APP SDK 2.4, Firepro v8800]

                  Unhandled exception at 0x0f5d18bb in XXX.exe: 0xC000008E: Floating-point division by zero.

                  0F5D18B1  fdivr       dword ptr [esi+4]
                  0F5D18B4  lea         eax,[esp+118h]
                  0F5D18BB  fstp        dword ptr [esi+4]

                  Register ST0 has the value 0 so I suspect the FDIVR instruction is the cause.

                  I explicitly trap floating-point division by zero in our code:

                      #include <float.h>    // _controlfp, _EM_*, _MCW_EM (MSVC CRT)

                      unsigned int flags = _controlfp(0, 0);    // get current control word
                      flags &= ~(_EM_OVERFLOW | _EM_ZERODIVIDE);    // clear mask bits: unmask (enable) these exceptions
                      _controlfp(flags, _MCW_EM);    // set control word

                  Like you I've tried to narrow down my kernel but without any great success.

                  I eventually narrowed it down to the point where adding this line to my kernel

                  distances[point_index] = distance;

                  would cause the floating point division by zero exception to be generated.

                  [ where: __global float* distances, size_t point_index, float distance]

                   

                  Also like you, it only fails when the device is CL_DEVICE_TYPE_CPU, and if I try to compile my kernel in a simple test program it compiles correctly, even for CL_DEVICE_TYPE_CPU.

                   

                  Steve.

                    • cl::Program::build() crashes for x86. Fine for GPU.
                      himanshu.gautam

                      steveyoungs,

                      It is difficult to tell why the problem is happening.

                      Please post a test case and your system information: CPU, GPU, SDK, driver, OS.

                        • cl::Program::build() crashes for x86. Fine for GPU.
                          Steveyoungs

                          System information is easy:

                          Intel i7 930

                          Windows 7 x64

                          AMD APP SDK 2.4

                          Firepro v8800

                          driver v8.85

                          [An NVIDIA GPU and driver and the Intel OpenCL CPU drivers are also installed]

                          Although since the problem affected compiling the kernel source to a CPU device, I suspect the GPU and driver are not significant.

                          As I alluded to in my earlier post, I'm unable to make a simple test case yet - I only get the error when I compile the kernel in our full application. Even then, seemingly insignificant changes to the kernel source can make the problem go away. When I compile the exact same kernel in a simple test program it compiles correctly.

                          If I don't hear back from CaptainN, or it is not already logged, I'll open a ticket. Do you know the best way to open a ticket?