Inside the project there is a collection of OpenCL kernels stored as an array of const char *. Each const char * holds one or more kernels as a C string. At runtime the const char * strings are collected into cl::Program::Sources and compiled into a program using OpenCL's Program::build(). If the target device is set to GPU, everything works nicely.
If the target device is set to CPU, Program::build() crashes somewhere in the OpenCL runtime. In an attempt to narrow down the problem, I have reduced it to a single kernel which compiles for GPU but crashes when the target device is CPU. If this kernel is taken to a test environment such as the HelloOpenCl sample, it compiles for CPU fine! Kernel Analyzer 1.8 generates x86 assembly with no problem.
Program::build() crashes with the following message: Unhandled exception at 0x… in project.exe: 0xC000008E: Floating-point division by zero.
The stack trace starts from mydll!cl::Program::build(… and dives into amdocl.dll, where it eventually crashes (no names, of course, as I don't have PDBs).
System: HD5870, Catalyst 11.5, SDK 2.4.
CPU: Intel Xeon (I don’t think it is relevant here).
Could you please advise how I can resolve the problem?
P.S. I have tried to trim the kernel which crashes on CPU (it is not a big kernel, though), and at some point it compiles (but by then it is useless for me). Again, it crashes only when the target device is CPU. If the target device is GPU, everything is OK.
Try using this environment variable and see what happens:
AMD_OCL_BUILD_OPTIONS_APPEND="-g -O0"
If it still crashes, then the problem isn't in the optimization.
Your problem is very odd, since dividing by zero is well defined in floating point (you get inf).
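To illustrate the point, here is a minimal sketch in standard C++ (nothing OpenCL-specific; `divide_and_check` and `DivisionResult` are made-up names for this example). It assumes floating-point exceptions are masked, which is the usual default: dividing a nonzero value by zero then just yields inf and sets a status flag. It should only turn into a hardware exception such as 0xC000008E if something in the process has unmasked the divide-by-zero trap.

```cpp
#include <cfenv>
#include <cmath>

// Result of one runtime division: the quotient plus whether the
// FE_DIVBYZERO status flag was raised while computing it.
struct DivisionResult {
    float quotient;
    bool divide_by_zero_flagged;
};

// Hypothetical helper for illustration; not part of any OpenCL API.
// Assumes floating-point exceptions are masked (the usual default).
DivisionResult divide_and_check(float numerator, float denominator) {
    // volatile keeps the compiler from folding the division at compile time.
    volatile float n = numerator;
    volatile float d = denominator;

    std::feclearexcept(FE_ALL_EXCEPT);   // start from clean status flags
    volatile float q = n / d;            // the division happens at run time
    bool flagged = std::fetestexcept(FE_DIVBYZERO) != 0;

    DivisionResult result = { q, flagged };
    return result;
}
```

Calling `divide_and_check(1.0f, 0.0f)` with the default floating-point environment should give an infinite quotient with the flag set, and no crash.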
rick.weber !!!
This is it! Passing this parameter to cl::Program's build as an option, as in program.build(devices, "-g -O0"), also makes the problem disappear.
AMD, please let me know how I can help to have the problem located and fixed in the next release. I can generate a crash dump if it will help.
Respect.
Sorry for the late reply.
Can you post a testcase showing this issue here? You can also file a ticket.
CaptainN,
Did this problem ever get reported to, or resolved by, AMD?
I think I've just hit the same problem [Windows 7 x64, AMD APP SDK 2.4, Firepro v8800]
Unhandled exception at 0x0f5d18bb in XXX.exe: 0xC000008E: Floating-point division by zero.
0F5D18B1 fdivr dword ptr [esi+4]
0F5D18B4 lea eax,[esp+118h]
0F5D18BB fstp dword ptr [esi+4]
Register ST0 has the value 0, so I suspect the FDIVR instruction is the cause.
I explicitly trap floating-point division by zero in our code:
unsigned int flags = _controlfp(0, 0); // get current control word
flags &= ~(_EM_OVERFLOW | _EM_ZERODIVIDE); // enable required exceptions
_controlfp(flags, _MCW_EM); // set control word
Like you I've tried to narrow down my kernel but without any great success.
I eventually narrowed it down to the point where adding this line to my kernel
distances[point_index] = distance;
would cause the floating point division by zero exception to be generated.
[ where: __global float* distances, size_t point_index, float distance]
Also like you, it only fails when the device is CL_DEVICE_TYPE_CPU, and if I try to compile my kernel in a simple test program it compiles correctly, even for CL_DEVICE_TYPE_CPU.
Steve.
steveyoungs,
It is difficult to tell why the problem might be happening.
Please post a testcase and system information: CPU, GPU, SDK, driver, OS.
System information is easy:
Intel i7 930
Windows 7 x64
AMD APP SDK 2.4
Firepro v8800
driver v8.85
[An NVIDIA GPU and driver and the Intel OpenCL CPU drivers are also installed.]
However, since the problem affects compiling the kernel source for a CPU device, I suspect the GPU and driver are not significant.
As I alluded to in my earlier post, I'm unable to make a simple test case yet - I only get the error when I compile the kernel in our full application. Even then, seemingly insignificant changes to the kernel source can make the problem go away. When I compile the exact same kernel in a simple test program it compiles correctly.
If I don't hear back from CaptainN, or it is not already logged, I'll open a ticket. Do you know the best way to open a ticket?