I found CodeXL failed to debug my kernel but was able to debug simple kernels from the SDK. Today I decided to try to narrow it down. I think I finally figured it out. CodeXL seems to be unable to handle kernels with more than 3 arguments!
I modified the BinarySearch SDK sample to use the following kernel:
__kernel void debugKernel(
    int numItems,                       /* Total number of items in the batch. */
    __const __global float4* input,     /* input: float3, float, float3, float. */
    __global int4* results              /* output: int triangleID, float hitT, int2 padding. */
    , __const __global float4* nodesA   /* not used. */
    )
{
    int tid; // thread index.
    float hitT = 0.f;
    tid = get_global_id(0);
    if (tid >= numItems)
        return;
    STORE_RESULT( tid, tid, hitT );
}
If I comment out 'nodesA' and the corresponding clSetKernelArg in runDebugKernel(), CodeXL will stop at the breakpoint inside 'debugKernel'. With the unused 4th argument 'nodesA' in place, I get the following error:
I can't believe I am the only one trying to pass more than 3 arguments to a kernel.
Not being able to debug my OCL kernel makes life difficult.
I have attached a complete project to help with the repro. Thanks.
Your deduction is close, but not exact - since the issue is the specific buffer and not the number of parameters. If you commented out a different buffer, the same issue would appear, while if you duplicate one of the others (e.g. have input1 and input2), it would still work.
The issue is that the debugger does not currently support cl_mem parameters with a value (i.e. pointer address) of 0 / NULL.
If you take the code you attached and replace
cl_mem nodesA = 0;
with
cl_mem nodesA = clCreateBuffer(
debugging works perfectly.
Since the runtime allows NULL mem handles, the debugger should support it, but for now - as a workaround, you can create a dummy buffer (as you see from my sample code, it doesn't have to be big) and pass it instead - since the buffer was originally NULL, that means it couldn't have been accessed anyway...
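The workaround above could be wrapped in a small host-side helper. This is only a rough sketch: mem_handle stands in for cl_mem so the code compiles without the OpenCL headers, and handle_or_dummy, create_dummy_fn and stub_create_dummy are hypothetical names that are not part of the attached project or of OpenCL. In real host code the callback would wrap a clCreateBuffer call.

```c
#include <stddef.h>

/* Stand-in for cl_mem, which OpenCL declares as an opaque pointer type.
   Used here only so the sketch compiles without the OpenCL headers. */
typedef void *mem_handle;

/* Hypothetical callback that allocates a small placeholder buffer.
   In real host code it would wrap a call such as
   clCreateBuffer(context, CL_MEM_READ_ONLY, bytes, NULL, &err). */
typedef mem_handle (*create_dummy_fn)(void *user_data, size_t bytes);

/* Return `handle` unchanged when it is non-NULL; otherwise allocate a
   placeholder of `dummy_bytes` bytes so the kernel argument is never NULL. */
mem_handle handle_or_dummy(mem_handle handle, create_dummy_fn create_dummy,
                           void *user_data, size_t dummy_bytes)
{
    if (handle != NULL)
        return handle;
    return create_dummy(user_data, dummy_bytes);
}

/* Stub allocator for demonstration only: hands out a static block and
   counts how often it was called through `user_data`. */
static mem_handle stub_create_dummy(void *user_data, size_t bytes)
{
    static char storage[16];
    int *calls = (int *)user_data;
    (void)bytes;   /* the stub ignores the requested size */
    ++*calls;
    return storage;
}
```

The handle returned by the helper would then be passed to clSetKernelArg in place of the NULL one. A placeholder created this way is a real buffer, so it should be released with clReleaseMemObject once the kernel is no longer needed.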