Try using this environment variable and see what happens:
If it still crashes, then the problem isn't in the optimization.
Your problem is very odd, since dividing by zero is well defined in floating point (you get inf).
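To illustrate that point, here is a minimal sketch. It assumes an IEEE 754 platform with the default floating-point environment, i.e. exceptions masked. The helper name divide is made up for this example and is not from anyone's kernel or host code.

```cpp
#include <cmath>

// Hypothetical helper: performs the division at run time.
// The volatile locals keep the compiler from folding the expression
// at compile time, so the hardware actually executes the divide.
float divide(float numerator, float denominator) {
    volatile float n = numerator;
    volatile float d = denominator;
    return n / d;
}
```

With exceptions masked, divide(1.0f, 0.0f) should return +inf, divide(-1.0f, 0.0f) should return -inf, and divide(0.0f, 0.0f) should return NaN, all without crashing. A crash only becomes possible once something unmasks the divide-by-zero exception.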
This is it! Passing this parameter to cl::Program's build as an option, as in program.build(devices, "-g -O0"), also makes the problem disappear.
AMD, please let me know how I can help locate the problem and get it fixed in the next release. I can generate a crash dump if that will help.
Sorry for the late reply.
Can you post a test case showing this issue here? You can also file a ticket.
Did this problem ever get reported to, or resolved by, AMD?
I think I've just hit the same problem [Windows 7 x64, AMD APP SDK 2.4, Firepro v8800]
Unhandled exception at 0x0f5d18bb in XXX.exe: 0xC000008E: Floating-point division by zero.
0F5D18B1 fdivr dword ptr [esi+4]
0F5D18B4 lea eax,[esp+118h]
0F5D18BB fstp dword ptr [esi+4]
Register ST0 has the value 0 so I suspect the FDIVR instruction is the cause.
I explicitly trap floating-point division by zero in our code:
unsigned int flags = _controlfp(0, 0); // get current control word
flags &= ~(_EM_OVERFLOW | _EM_ZERODIVIDE); // enable required exceptions
_controlfp(flags, _MCW_EM); // set control word
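For anyone who wants to check for the same condition without unmasking the exception, the standard <cfenv> status flags can be polled instead. This is a different technique from the _controlfp trap above, shown only as a sketch. The function name raises_divide_by_zero is made up for illustration, and it assumes the compiler does not reorder the division across the flag calls (the volatile locals are there to discourage that).

```cpp
#include <cfenv>

// Hypothetical helper: returns true if numerator / denominator sets the
// IEEE 754 divide-by-zero status flag. Nothing is trapped; the flag is
// simply inspected after the operation.
bool raises_divide_by_zero(float numerator, float denominator) {
    volatile float n = numerator;
    volatile float d = denominator;
    std::feclearexcept(FE_ALL_EXCEPT);   // start from a clean flag state
    volatile float q = n / d;            // division happens at run time
    (void)q;                             // result itself is not needed
    return std::fetestexcept(FE_DIVBYZERO) != 0;
}
```

Note that 0/0 sets the invalid-operation flag rather than the divide-by-zero flag, so this helper reports false for it. Only a finite non-zero value divided by zero sets FE_DIVBYZERO.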
Like you I've tried to narrow down my kernel but without any great success.
I eventually narrowed it down to the point where adding this line to my kernel
distances[point_index] = distance;
would cause the floating point division by zero exception to be generated.
[ where: __global float* distances, size_t point_index, float distance]
Also like you, it only fails when the device is CL_DEVICE_TYPE_CPU, and if I try to compile my kernel in a simple test program it compiles correctly, even for CL_DEVICE_TYPE_CPU.
It is difficult to find why the problem might be happening.
Please post a test case and system information: CPU, GPU, SDK, driver, OS.
System information is easy:
Intel i7 930
Windows 7 x64
AMD APP SDK 2.4
[nvidia GPU and driver and Intel OpenCL CPU drivers also installed]
Since the problem occurs when compiling the kernel source for a CPU device, though, I suspect the GPU and driver are not significant.
As I alluded to in my earlier post, I'm unable to make a simple test case yet - I only get the error when I compile the kernel in our full application. Even then, seemingly insignificant changes to the kernel source can make the problem go away. When I compile the exact same kernel in a simple test program it compiles correctly.
If I don't hear back from CaptainN, or it is not already logged, I'll open a ticket. Do you know the best way to open a ticket?