I have an application that uses the OpenCL C++ bindings with exceptions enabled for error reporting. The application is a simple NBody simulation with 2 kernels: one for the interaction and one for the forward Euler step. The latter runs fine, but the interaction kernel is silently skipped when I target the CPU. On GPU it (more or less) runs fine, but on CPU it is effectively never launched: no printf inside the kernel has any effect, the kernel seems to finish instantly, and the associated event holds garbage info.
My entire initialization sequence is one giant try block, and nothing throws. I also tried using regular error codes (just to make sure), and all initialization succeeds there too. I initially used the cl::make_kernel helper, but dropped it so that kernel creation could throw inside the try block. (It is a huge setback that cl::make_kernel has no default constructor. As a result, the functors cannot be initialized in try blocks (because they are destroyed on scope exit), and if I initialize them through pointers, I lose the nice syntax of operator()(...), the very reason for using them in the first place.) Bottom line: things work on GPU, and they don't on CPU.
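One workaround I have not tried yet for the missing default constructor is to hold the functor in a std::optional, declare it outside the try block, and emplace it inside. This is only a sketch under assumptions: it needs C++17, and `KernelFunctor` and `make_functor` are hypothetical stand-ins I made up for this post so the snippet compiles without OpenCL headers. The real cl::make_kernel would take a cl::Program and a kernel name instead.

```cpp
#include <optional>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a functor type that, like cl::make_kernel,
// has no default constructor and may throw while being constructed.
struct KernelFunctor {
    explicit KernelFunctor(const std::string& name) : name_(name) {
        if (name.empty()) throw std::runtime_error("kernel name is empty");
    }
    // Dummy call operator; a real kernel functor would enqueue the kernel.
    int operator()(int x) const { return x + 1; }
    std::string name_;
};

// Construct the functor inside a try block while keeping it alive afterwards.
// Returns an empty optional if construction threw.
std::optional<KernelFunctor> make_functor(const std::string& name) {
    std::optional<KernelFunctor> functor;  // empty: no constructor runs here
    try {
        functor.emplace(name);             // constructed in place, may throw
    } catch (const std::exception&) {
        functor.reset();                   // make the empty state explicit
    }
    return functor;
}
```

With this, `auto f = make_functor("interaction");` leaves `f` usable after the try block, and the call syntax only changes to `(*f)(args...)`.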
I read the CL_EVENT_COMMAND_EXECUTION_STATUS of the interaction kernel after waiting on the command_queue on which I enqueued it. The value is UINT_MAX (4294967295).
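For what it's worth, the execution status is a cl_int, so 4294967295 is most likely just -1 printed as an unsigned value. As far as I understand the spec, a negative execution status means the command terminated abnormally, and the value is the error code. A minimal sketch of how I decode it, without OpenCL headers (the helper names `as_signed_status` and `describe_status` are made up for this post; the non-negative values are the ones cl.h defines for CL_COMPLETE, CL_RUNNING, CL_SUBMITTED and CL_QUEUED):

```cpp
#include <cstdint>
#include <cstring>
#include <string>

// Reinterpret a status that was read or printed as unsigned 32-bit
// as the signed 32-bit integer that cl_int actually is.
std::int32_t as_signed_status(std::uint32_t raw) {
    std::int32_t status;
    std::memcpy(&status, &raw, sizeof status);  // bit-for-bit copy
    return status;
}

// Map an execution status to a readable description.
std::string describe_status(std::int32_t status) {
    if (status < 0) {
        return "abnormal termination, error code " + std::to_string(status);
    }
    switch (status) {
        case 0: return "CL_COMPLETE";
        case 1: return "CL_RUNNING";
        case 2: return "CL_SUBMITTED";
        case 3: return "CL_QUEUED";
        default: return "unknown status " + std::to_string(status);
    }
}
```

So `describe_status(as_signed_status(4294967295u))` reports an abnormal termination with error code -1, which suggests the event is not garbage but is reporting a failure that never surfaced as an exception.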
The hpp, cpp and cl files are as simple as they can get. If anybody wants to test, just skip the "read_particle_file()" call and use a vector of default-initialized structs instead. The point is to get the kernel running. I tried both Catalyst 14.12 and 15.4-beta.
Either I am doing something noobish and I'm in for a facepalm, or there is a major issue that the runtime fails to report. Please give me some ideas, because I have run out of them.