rick_weber

Possible bug when treating double2 array as double array

Are the following code snippets not equivalent, assuming ldc is a multiple of 2 and c is a double2*?

    //compute leading dimension in terms of double2s...
    ldc /= 2;

    c[row + ldc * (col + 3)] = c0_3;
    c[row + ldc * (col + 2)] = c0_2;
    c[row + ldc * (col + 1)] = c0_1;
    c[row + ldc * (col + 0)] = c0_0;

    c[row + 1 + ldc * (col + 3)] = c1_3;
    c[row + 1 + ldc * (col + 2)] = c1_2;
    c[row + 1 + ldc * (col + 1)] = c1_1;
    c[row + 1 + ldc * (col + 0)] = c1_0;

    /*
    __global double* c2 = (__global double*)c;

    c2[row2 + 0 + ldc * (col + 3)] = c0_3.x;
    c2[row2 + 0 + ldc * (col + 2)] = c0_2.x;
    c2[row2 + 0 + ldc * (col + 1)] = c0_1.x;
    c2[row2 + 0 + ldc * (col + 0)] = c0_0.x;

    c2[row2 + 1 + ldc * (col + 3)] = c0_3.y;
    c2[row2 + 1 + ldc * (col + 2)] = c0_2.y;
    c2[row2 + 1 + ldc * (col + 1)] = c0_1.y;
    c2[row2 + 1 + ldc * (col + 0)] = c0_0.y;

    c2[row2 + 2 + ldc * (col + 3)] = c1_3.x;
    c2[row2 + 2 + ldc * (col + 2)] = c1_2.x;
    c2[row2 + 2 + ldc * (col + 1)] = c1_1.x;
    c2[row2 + 2 + ldc * (col + 0)] = c1_0.x;

    c2[row2 + 3 + ldc * (col + 3)] = c1_3.y;
    c2[row2 + 3 + ldc * (col + 2)] = c1_2.y;
    c2[row2 + 3 + ldc * (col + 1)] = c1_1.y;
    c2[row2 + 3 + ldc * (col + 0)] = c1_0.y;*/

The former gives me the correct result, but the latter does not. If they are indeed equivalent, then this is a bug in the OpenCL compiler.
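
The index arithmetic behind the two snippets can be checked on the host, independently of any OpenCL compiler. What follows is a minimal sketch in plain C, not the actual kernel. It rests on several assumptions that the snippets do not state: double2_t is a hypothetical host-side stand-in for OpenCL's double2, row2 is taken to be 2 * row, storage is column-major, and ldc is an even leading dimension measured in doubles. The helper names idx_double2, idx_double and views_agree are made up for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical host-side stand-in for OpenCL's double2: two packed doubles. */
typedef struct { double x, y; } double2_t;

/* Index into the double2 view. ldc is given in doubles and must be even,
   so the leading dimension in double2 elements is ldc / 2. */
static size_t idx_double2(size_t row, size_t col, size_t ldc) {
    return row + (ldc / 2) * col;
}

/* Index into the double view of the same buffer.
   lane 0 selects .x, lane 1 selects .y.
   Assumption: row2 = 2 * row (row2 is not defined in the posted snippet). */
static size_t idx_double(size_t row, size_t col, size_t ldc, size_t lane) {
    size_t row2 = 2 * row;
    return row2 + lane + ldc * col;
}

/* Fills one buffer through the double2 view and another through the double
   view, then compares them byte for byte. Returns 1 when they match. */
static int views_agree(size_t ldc, size_t ncols) {
    size_t n = ldc * ncols;                  /* total number of doubles */
    int same = 0;

    /* The comparison below relies on double2_t having no padding. */
    assert(sizeof(double2_t) == 2 * sizeof(double));

    double2_t *a = calloc(n / 2, sizeof *a); /* written as double2 */
    double *b = calloc(n, sizeof *b);        /* written as double  */
    if (a != NULL && b != NULL) {
        for (size_t col = 0; col < ncols; ++col) {
            for (size_t row = 0; row < ldc / 2; ++row) {
                /* Distinct values so a misplaced store is detectable. */
                double base = (double)(2 * row + ldc * col);
                double2_t v = { base + 1.0, base + 2.0 };

                a[idx_double2(row, col, ldc)] = v;
                b[idx_double(row, col, ldc, 0)] = v.x;
                b[idx_double(row, col, ldc, 1)] = v.y;
            }
        }
        same = memcmp(a, b, n * sizeof(double)) == 0;
    }
    free(a);
    free(b);
    return same;
}
```

Under those assumptions the two views address the same memory only when the double view uses the full ldc. If the halved ldc is passed to idx_double as well, the indices no longer line up, so it is worth confirming that the ldc /= 2 line is skipped when the second snippet is enabled.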

0 Likes
3 Replies

Are you dividing ldc by 2 in both cases?  If so, that's your bug.

-Jeff

0 Likes

No, only in the first case is ldc divided by 2. The results seem to indicate that double values from the double2s c0_0 ... c1_3 are not being properly extracted.

0 Likes

Originally posted by: rick.weber
No, only in the first case is ldc divided by 2. The results seem to indicate that double values from the double2s c0_0 ... c1_3 are not being properly extracted.

Hard to say what's wrong without looking at the whole kernel.  If you could post the two kernels somewhere, that would be great.

Jeff

0 Likes