I have a program that does the following:
- A for loop runs N times
- During each iteration, an array A of size M is initialized to zeros
- Buffer for A is created and written using clCreateBuffer() and clEnqueueWriteBuffer()
- Then some computation is done with this array and the results are stored for each iteration
The problem is that for one specific set of inputs, the program crashes during clEnqueueWriteBuffer() after about 30 iterations with the error "Unhandled signal in divisionErrorHandler()".
I can reproduce the crash with the same set of inputs, and it always happens during the same iteration. I don't see it with other, similar inputs, and I'm not sure what the problem is.
I am using OpenCL 2.0 on AMD FirePro W9100.
Please assist me in resolving this issue. Any help would be greatly appreciated.
I'm not at all sure why clEnqueueWriteBuffer() would trigger any sort of division. Since OpenCL runs asynchronously, however, it seems likely that it's reporting an error actually triggered by your kernel - and not necessarily in the immediately preceding invocation.
I'm also not sure why division of any two floating-point numbers would trigger an error in OpenCL, as AFAIK it should run with exception traps disabled. IEEE-754 specifies the behaviour of the basic operations (add, subtract, multiply, divide, square root) completely in this mode, including for weird cases like dividing infinity by zero (or vice versa). Dividing zero by zero results in a NaN, which still should not trigger an error condition.
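To make the IEEE-754 point concrete, here is a minimal host-side sketch in plain C (not OpenCL kernel code). It assumes the default floating-point environment with traps disabled, and the helper names `fp_div` and `check_ieee_division` are made up for illustration. It only shows that these divisions return special values instead of raising a signal; it says nothing about what your kernel or the AMD runtime actually does.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical helper: divide two floats at run time.
 * The volatile locals stop the compiler from folding the division away. */
static float fp_div(float num, float den) {
    volatile float n = num;
    volatile float d = den;
    return n / d;
}

/* Hypothetical self-check: each case yields a well-defined IEEE-754 result
 * and, with traps disabled (the default), no signal is raised. */
static void check_ieee_division(void) {
    float one_by_zero  = fp_div(1.0f, 0.0f);      /* finite / zero     */
    float zero_by_zero = fp_div(0.0f, 0.0f);      /* zero / zero       */
    float inf_by_zero  = fp_div(INFINITY, 0.0f);  /* infinity / zero   */
    float zero_by_inf  = fp_div(0.0f, INFINITY);  /* zero / infinity   */

    assert(isinf(one_by_zero) && one_by_zero > 0.0f);  /* +infinity */
    assert(isnan(zero_by_zero));                       /* NaN       */
    assert(isinf(inf_by_zero) && inf_by_zero > 0.0f);  /* +infinity */
    assert(zero_by_inf == 0.0f);                       /* zero      */
}
```

Calling `check_ieee_division()` from `main` should return normally on a conforming platform. If the same kind of check fails or traps in your environment, that would suggest floating-point exception traps have been enabled somewhere, which would be worth mentioning in your report.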
It might be easier to troubleshoot your problem if you can provide the means to reproduce it. That means a small piece of code and data which triggers the problem at your end, and which we can also run. Stick it on a pastebin and link it.