Are the following two code snippets not equivalent, assuming ldc is a multiple of 2 and c is a double2*?
//compute leading dimension in terms of double2s...
ldc /= 2;

c[row + ldc * (col + 3)] = c0_3;
c[row + ldc * (col + 2)] = c0_2;
c[row + ldc * (col + 1)] = c0_1;
c[row + ldc * (col + 0)] = c0_0;

c[row + 1 + ldc * (col + 3)] = c1_3;
c[row + 1 + ldc * (col + 2)] = c1_2;
c[row + 1 + ldc * (col + 1)] = c1_1;
c[row + 1 + ldc * (col + 0)] = c1_0;

/*
__global double* c2 = (__global double*)c;

c2[row2 + 0 + ldc * (col + 3)] = c0_3.x;
c2[row2 + 0 + ldc * (col + 2)] = c0_2.x;
c2[row2 + 0 + ldc * (col + 1)] = c0_1.x;
c2[row2 + 0 + ldc * (col + 0)] = c0_0.x;

c2[row2 + 1 + ldc * (col + 3)] = c0_3.y;
c2[row2 + 1 + ldc * (col + 2)] = c0_2.y;
c2[row2 + 1 + ldc * (col + 1)] = c0_1.y;
c2[row2 + 1 + ldc * (col + 0)] = c0_0.y;

c2[row2 + 2 + ldc * (col + 3)] = c1_3.x;
c2[row2 + 2 + ldc * (col + 2)] = c1_2.x;
c2[row2 + 2 + ldc * (col + 1)] = c1_1.x;
c2[row2 + 2 + ldc * (col + 0)] = c1_0.x;

c2[row2 + 3 + ldc * (col + 3)] = c1_3.y;
c2[row2 + 3 + ldc * (col + 2)] = c1_2.y;
c2[row2 + 3 + ldc * (col + 1)] = c1_1.y;
c2[row2 + 3 + ldc * (col + 0)] = c1_0.y;*/
The former gives me the correct result, but the latter does not. If they are indeed equivalent, then this would be a bug in the OpenCL compiler.
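
One way to test the equivalence without involving the OpenCL compiler at all is to compare the flat offsets (counted in doubles) that each form writes to. The following is a minimal sketch in plain C, not the kernel itself. It assumes that row2 == 2 * row (row2 is not defined in the snippet above, so this is a guess) and that .x and .y of a double2 sit at consecutive double offsets. The helper names vec_offset, scalar_offset and same_addresses are made up for illustration.

```c
#include <stddef.h>

/* Offset, in doubles, of component comp (0 = .x, 1 = .y) of
   c[row + ldc2 * col], where c is a double2* and ldc2 is the
   leading dimension measured in double2 elements. */
static size_t vec_offset(size_t row, size_t col, size_t ldc2, size_t comp)
{
    return 2 * (row + ldc2 * col) + comp;
}

/* Offset, in doubles, of c2[row2 + k + ld * col], where c2 is a double*
   and ld is whatever leading dimension the scalar form uses. */
static size_t scalar_offset(size_t row2, size_t k, size_t col, size_t ld)
{
    return row2 + k + ld * col;
}

/* Returns 1 if all 16 scalar stores hit the same doubles as the
   8 double2 stores, 0 otherwise. */
static int same_addresses(size_t row, size_t col, size_t ldc2, size_t ld_scalar)
{
    size_t row2 = 2 * row; /* ASSUMPTION: not stated in the original snippet */

    for (size_t j = 0; j < 4; ++j)            /* columns col+0 .. col+3 */
        for (size_t r = 0; r < 2; ++r)        /* double2 rows row, row+1 */
            for (size_t comp = 0; comp < 2; ++comp) /* .x then .y */
                if (vec_offset(row + r, col + j, ldc2, comp) !=
                    scalar_offset(row2, 2 * r + comp, col + j, ld_scalar))
                    return 0;
    return 1;
}
```

Under those assumptions, same_addresses(row, col, ldc, ldc) compares the two snippets exactly as written (both using the already-halved ldc), while same_addresses(row, col, ldc, 2 * ldc) compares the double2 form against a scalar form that uses the leading dimension measured in doubles. If only the second call returns 1, the difference would be in the indexing rather than in the compiler.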