Archives Discussions

rick_weber · ‎07-27-2010

Are the following code snippets not equivalent, assuming ldc is a multiple of 2 and c is a double2*?

    //compute leading dimension in terms of double2s...
    ldc /= 2;

    c[row + ldc * (col + 3)] = c0_3;
    c[row + ldc * (col + 2)] = c0_2;
    c[row + ldc * (col + 1)] = c0_1;
    c[row + ldc * (col + 0)] = c0_0;

    c[row + 1 + ldc * (col + 3)] = c1_3;
    c[row + 1 + ldc * (col + 2)] = c1_2;
    c[row + 1 + ldc * (col + 1)] = c1_1;
    c[row + 1 + ldc * (col + 0)] = c1_0;

    /*
    __global double* c2 = (__global double*)c;

    c2[row2 + 0 + ldc * (col + 3)] = c0_3.x;
    c2[row2 + 0 + ldc * (col + 2)] = c0_2.x;
    c2[row2 + 0 + ldc * (col + 1)] = c0_1.x;
    c2[row2 + 0 + ldc * (col + 0)] = c0_0.x;

    c2[row2 + 1 + ldc * (col + 3)] = c0_3.y;
    c2[row2 + 1 + ldc * (col + 2)] = c0_2.y;
    c2[row2 + 1 + ldc * (col + 1)] = c0_1.y;
    c2[row2 + 1 + ldc * (col + 0)] = c0_0.y;

    c2[row2 + 2 + ldc * (col + 3)] = c1_3.x;
    c2[row2 + 2 + ldc * (col + 2)] = c1_2.x;
    c2[row2 + 2 + ldc * (col + 1)] = c1_1.x;
    c2[row2 + 2 + ldc * (col + 0)] = c1_0.x;

    c2[row2 + 3 + ldc * (col + 3)] = c1_3.y;
    c2[row2 + 3 + ldc * (col + 2)] = c1_2.y;
    c2[row2 + 3 + ldc * (col + 1)] = c1_1.y;
    c2[row2 + 3 + ldc * (col + 0)] = c1_0.y;*/

The former gives me the correct result, but the latter does not. If they are indeed equivalent, then this is a bug in the OpenCL compiler.
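The index arithmetic behind the question can be checked on the host in plain C. The following is only a sketch under stated assumptions, not the kernel itself: `double2_t`, `store_vec`, `store_scalar`, and `layouts_match` are hypothetical names, the matrix is assumed column-major, `ldc` is passed in doubles and is even, and `row2` is assumed to equal `2 * row` (the post does not show how `row2` is computed).

```c
#include <assert.h>
#include <string.h>

/* Hypothetical host-side stand-in for OpenCL's double2. */
typedef struct { double x, y; } double2_t;

/* Vector-style store, as in the first snippet: the buffer is viewed as
   double2 elements and the leading dimension is halved. Element i of a
   double2 array starts at double offset 2*i. */
static void store_vec(double *c, int ldc, int row, int col, double2_t v)
{
    assert(ldc % 2 == 0);
    assert(sizeof(double2_t) == 2 * sizeof(double));
    size_t i = (size_t)row + (size_t)(ldc / 2) * (size_t)col; /* double2 index */
    memcpy(&c[2 * i], &v, sizeof v);
}

/* Scalar-style store, as in the commented-out snippet.
   ASSUMPTIONS: row2 == 2 * row, and ldc is still in doubles (not halved). */
static void store_scalar(double *c, int ldc, int row2, int col, double2_t v)
{
    c[row2 + 0 + ldc * col] = v.x;
    c[row2 + 1 + ldc * col] = v.y;
}

/* Returns 1 if both store styles fill an ldc x ncols column-major
   matrix identically, 0 otherwise (or if the inputs do not fit). */
static int layouts_match(int ldc, int ncols)
{
    enum { MAX = 1024 };
    static double a[MAX], b[MAX];
    if (ldc % 2 != 0 || ldc * ncols > MAX) return 0;
    memset(a, 0, sizeof a);
    memset(b, 0, sizeof b);
    for (int col = 0; col < ncols; ++col) {
        for (int row = 0; row < ldc / 2; ++row) {
            double2_t v = { 100.0 * col + 2 * row, 100.0 * col + 2 * row + 1 };
            store_vec(a, ldc, row, col, v);
            store_scalar(b, ldc, 2 * row, col, v);
        }
    }
    return memcmp(a, b, sizeof a) == 0;
}
```

Under those assumptions the two styles address the same memory, since 2 * (row + (ldc / 2) * col) equals 2 * row + ldc * col when ldc is even. If either assumption does not hold in the real kernel, the two snippets are not equivalent and the compiler is not necessarily at fault.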

jeff_golds · ‎07-28-2010

Are you dividing ldc by 2 in both cases? If so, that's your bug.
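To illustrate why that would be a bug, here is a small host-side sketch in C. The helper names `scalar_offset` and `halved_ldc_error` are hypothetical and a column-major layout is assumed. With the buffer viewed as plain doubles, the column stride must stay in doubles, so a halved ldc makes each column start too early.

```c
#include <stddef.h>

/* Offset, in doubles, of element (row2, col) in a column-major matrix
   whose leading dimension is ldc doubles. */
static size_t scalar_offset(int row2, int ldc, int col)
{
    return (size_t)row2 + (size_t)ldc * (size_t)col;
}

/* Hypothetical helper: how many doubles short the store lands when an
   already-halved ldc is used with scalar (double) indexing. */
static size_t halved_ldc_error(int row2, int ldc, int col)
{
    return scalar_offset(row2, ldc, col) - scalar_offset(row2, ldc / 2, col);
}
```

For example, with ldc = 8 doubles, element (0, 1) belongs at offset 8, but a halved ldc of 4 puts it at offset 4, which is still inside column 0.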

-Jeff

rick_weber · ‎07-28-2010

No, only in the first case is ldc divided by 2. The results seem to indicate that double values from the double2s c0_0 ... c1_3 are not being properly extracted.

jeff_golds · ‎07-28-2010

Originally posted by: rick.weber
"No, only in the first case is ldc divided by 2. The results seem to indicate that double values from the double2s c0_0 ... c1_3 are not being properly extracted."

Hard to say what's wrong without looking at the whole kernel. If you could post the two kernels somewhere, that would be great.

Jeff


Possible bug when treating double2 array as double array