Hi all,
I've been having trouble running two-dimensional kernels.
I tried to use the matrix multiplication kernel (mmmKernel_local2) from the ATI OpenCL samples in my own code, but the output was all zeros.
I traced the problem to get_global_id(1), which returns some strange numbers. So I made a simple kernel just to check:
__kernel void foo(__global float4 *out) {
    out[get_global_id(0)] = (float4)get_global_id(1);
}
The output for a 4x4 (global range) kernel looks like this:
3.43597e+009, 3.43597e+009, 3.43597e+009, 3.43597e+009, 3.43597e+009, 3.43597e+009, 3.43597e+009, 3.43597e+009,
3.43597e+009, 3.43597e+009, 3.43597e+009, 3.43597e+009, 3.43597e+009, 3.43597e+009, 3.43597e+009, 3.43597e+009,
0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0,
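One thing I noticed about that value: 3.43597e+009 is what 0xCCCCCCCC (3435973836) looks like once it is converted to float, and 0xCC is the byte MSVC debug builds use to fill uninitialized stack memory. I don't know whether that is related, but here is a minimal C sketch of the check. The names MSVC_DEBUG_FILL, debug_fill_as_float and format_debug_fill are made up for this example, and the three-digit exponent in my output is, as far as I know, just how the older MSVC runtime prints %g.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical name: the 32-bit pattern made of four 0xCC bytes. */
#define MSVC_DEBUG_FILL 0xCCCCCCCCu

/* Convert the fill pattern to float, as a size_t -> float cast would. */
static float debug_fill_as_float(void) {
    return (float)(uint32_t)MSVC_DEBUG_FILL;
}

/* Write the converted value into buf using the default %g formatting. */
static int format_debug_fill(char *buf, size_t len) {
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "%g", (double)debug_fill_as_float());
}
```

Calling format_debug_fill on a small char buffer and printing it gives a number starting with 3.43597, which matches the first sixteen values above.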
If I replace out[get_global_id(0)] = (float4)get_global_id(1); with out[get_global_id(0)] = (float4)get_global_id(0); everything works fine:
0, 0, 0, 0, 1, 1, 1, 1,
2, 2, 2, 2, 3, 3, 3, 3,
0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0,
When I collect an application trace, it confirms that the kernel runs with the dimensions I specified. I really have no idea what the problem could be, especially since the sample I took the kernel from works fine and multiplies matrices just as it should. Could it be a VS project configuration issue, or something else I'm overlooking? Please help.
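For reference, this is what I would expect from the test kernel on a 4x4 global range, written as a plain C emulation rather than real OpenCL host code. The names float4_t, foo_item, emulate_foo, GX and GY are made up for this sketch. Every work-item with the same get_global_id(0) writes the same element, so only out[0] to out[3] are touched and each should end up holding some value from 0 to 3. The serial loop below simply lets the last row win, which a real device does not guarantee.

```c
#include <assert.h>
#include <stddef.h>

#define GX 4  /* assumed global size in dimension 0 */
#define GY 4  /* assumed global size in dimension 1 */

/* Hypothetical stand-in for the OpenCL float4 type. */
typedef struct { float x, y, z, w; } float4_t;

/* One emulated work-item: out[get_global_id(0)] = (float4)get_global_id(1); */
static void foo_item(float4_t *out, size_t gid0, size_t gid1) {
    float v = (float)gid1;
    out[gid0] = (float4_t){ v, v, v, v };
}

/* Serial emulation of a GX x GY launch; out must hold at least GX elements. */
static void emulate_foo(float4_t *out, size_t n) {
    assert(out != NULL && n >= GX);
    for (size_t gid1 = 0; gid1 < GY; ++gid1)
        for (size_t gid0 = 0; gid0 < GX; ++gid0)
            foo_item(out, gid0, gid1);
}
```

Running emulate_foo on a zeroed 16-element buffer leaves 3 in the first four elements and 0 everywhere else, so I would expect small integers in the real output as well, not 3.43597e+009.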
Thanks.