Problem after change from 2.2 to 2.3

Discussion created by n.treutner on Jan 11, 2011
Latest reply on Jan 30, 2011 by n.treutner
Kernel produces wrong results

Hi there,

I've only recently switched from 2.2 to 2.3, and now parts of my code that worked fine in 2.2 don't work anymore. I narrowed the problem (or at least one part of it) down to this kernel (attached) by checking the buffers with the GDebugger.

What the kernel does: it compares two images (inputImage1 and inputImage2) pixel by pixel. I launch as many work-items as there are pixels (x, y) and compare each pixel over a window (X and Y), shifting the window in the second image to the left by ww. The result of each comparison (one per value of ww) is stored in a 3D array that is actually laid out as a flat 1D array (dsi).

It appears that the results that are saved via

dsi[(x + y * width) * w_range + w_dsi] = sum0.w;

are the same within each work-item (so work-item (x, y) writes 231678 to every field of dsi that it is supposed to write to), while they should differ, since I change ww in the for loop.
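For reference, the flat layout behind that index expression can be sketched in plain C on the host. The helper names (`slots_per_pixel`, `loop_iterations`, `dsi_index`) and the parameter values in the note below are made up for illustration and are not part of the attached kernel. The sketch also shows one thing that may be worth checking: the kernel reserves (w - w_min) / stepping slots per pixel using integer division, while the loop runs once for every ww below w, so the two counts differ whenever the range is not a multiple of stepping. Whether this plays any role in the difference between 2.2 and 2.3 is not known.

```c
#include <assert.h>
#include <stddef.h>

/* Slots reserved per pixel, computed the same way as w_range in the kernel
   (integer division rounds down). */
int slots_per_pixel(int w_min, int w, int stepping) {
    assert(stepping > 0);
    return (w - w_min) / stepping;
}

/* Number of iterations of: for (ww = w_min; ww < w; ww += stepping) */
int loop_iterations(int w_min, int w, int stepping) {
    int n = 0;
    assert(stepping > 0);
    for (int ww = w_min; ww < w; ww += stepping) {
        n++;
    }
    return n;
}

/* Flat index of slot w_dsi of pixel (x, y), same expression as in the kernel. */
size_t dsi_index(int x, int y, int width, int w_range, int w_dsi) {
    return ((size_t)x + (size_t)y * (size_t)width) * (size_t)w_range
           + (size_t)w_dsi;
}
```

With the hypothetical values w_min = 0, w = 10, stepping = 3, slots_per_pixel gives 3 while the loop runs 4 times. In an image 4 pixels wide, the fourth write of pixel (1, 1) then lands on index 18, which is also the first slot of pixel (2, 1).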

So I tried to find any changes in the handling of images and/or loops, but I couldn't find anything relevant.

I wonder if you guys have any idea what is wrong.

__kernel void ssdkernel(__read_only image2d_t inputImage1,
                        __read_only image2d_t inputImage2,
                        __global float * dsi,
                        __constant int * params,
                        sampler_t imageSampler)
{
    int x = get_global_id(0);
    int y = get_global_id(1);
    int width = get_global_size(0);
    int height = get_global_size(1);
    int k = (params[0] - 1)/2;
    int w = params[1];
    int w_min = params[2];
    int stepping = params[3];
    int w_range = (w - w_min) / stepping;
    float4 diff0, sum0;

    //iterate over all disparities
    for (int ww = w_min, w_dsi = 0; ww < w; ww += stepping, w_dsi++)
    {
        diff0 = 0.0f;
        sum0 = 0.0f;

        //match the windows
        for(int Y = -k; Y <= k; Y=Y+1)
        {
            for(int X = -k; X <= k; X=X+1)
            {
                diff0 = read_imagef(inputImage1, imageSampler, (int2)(x + X, y + Y))
                      - read_imagef(inputImage2, imageSampler, (int2)(x + X - ww, y + Y));
                sum0 += diff0 * diff0;
            }
        }

        sum0.w = sqrt(sum0.x * sum0.x + sum0.y * sum0.y + sum0.z * sum0.z); //compute the euclidian distance
        dsi[(x + y * width) * w_range + w_dsi] = sum0.w;
    }
}