10 Replies Latest reply on Aug 4, 2014 2:27 AM by pinform

    Bug in Catalyst drivers 12.10 and later on Radeon HD 7000 series cards

    aaa

      An intricate bug appears in the latest Catalyst drivers when an OpenCL kernel accesses __local memory with a constant index.

      Conditions: Within each work-group, the bug appears only for work-items where get_local_id(0) == 0 and get_local_id(1) is 0 or 14, with a (16, 16, 1) work-group size.
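      A host-side check along these lines could help confirm that pattern. This is only a sketch: the function names are hypothetical, it assumes the output element at (x, y) is written by the work-item with that global ID (zero global offset, 16 x 16 work-groups), and it reads the condition as get_local_id(0) == 0 together with get_local_id(1) being 0 or 14. The reference buffer is assumed to come from a trusted CPU computation.

```c
#include <stddef.h>

/* Returns 1 if a local ID matches the failure pattern reported in the post. */
static int matches_reported_pattern(size_t lx, size_t ly)
{
    return lx == 0 && (ly == 0 || ly == 14);
}

/*
 * Compares a GPU result buffer against a reference buffer of the same
 * width x height. Returns the total number of mismatching elements and
 * stores in *outside how many of them do NOT fit the reported pattern.
 */
static size_t count_mismatches(const float *gpu, const float *ref,
                               size_t width, size_t height,
                               float tol, size_t *outside)
{
    size_t total = 0;
    *outside = 0;
    for (size_t y = 0; y < height; ++y) {
        for (size_t x = 0; x < width; ++x) {
            float diff = gpu[y * width + x] - ref[y * width + x];
            if (diff < 0.0f)
                diff = -diff;
            if (diff <= tol)
                continue;
            ++total;
            /* Assumed mapping: local ID = global ID modulo the group size 16. */
            if (!matches_reported_pattern(x % 16, y % 16))
                ++*outside;
        }
    }
    return total;
}
```

      If the total is nonzero and *outside stays at 0, every wrong element sits on one of the work-items described above.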

      Example:

      __local float s_srcPatch[10][10];
      __local float s_dstPatch[20][16];
      float sum;

      sum =       (evenFlag * 0.0625f) * s_srcPatch[0][1 + ((tidx - 2) / 2)];
      sum = sum + ( oddFlag * 0.25f  ) * s_srcPatch[0][1 + ((tidx - 1) / 2)];
      sum = sum + (evenFlag * 0.375f ) * s_srcPatch[0][1 + ((tidx    ) / 2)];
      sum = sum + ( oddFlag * 0.25f  ) * s_srcPatch[0][1 + ((tidx + 1) / 2)];
      sum = sum + (evenFlag * 0.0625f) * s_srcPatch[0][1 + ((tidx + 2) / 2)]; // Operation 5

      s_dstPatch = sum;

       

       

      In the case above, operation 5 is ignored, so the final sum is incorrect.
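      To see what a dropped operation 5 does to the result, a CPU reference such as the sketch below can be compared against the kernel output. The function name and the taps parameter are hypothetical, tidx is assumed to be a signed int in the range 0..15, and evenFlag / oddFlag are assumed to be 1.0f for even / odd tidx and 0.0f otherwise, since the post does not show how they are defined.

```c
/*
 * Hypothetical CPU reference for one output element.
 * row  : one row of s_srcPatch (10 floats)
 * tidx : column index within the work-group, assumed 0..15
 * taps : how many of the five operations to apply (5 = full sum,
 *        4 = operation 5 dropped, as in the reported bug)
 */
static float ref_sum(const float row[10], int tidx, int taps)
{
    /* Assumption: the flags select even or odd tidx. */
    const float evenFlag = (tidx % 2 == 0) ? 1.0f : 0.0f;
    const float oddFlag  = 1.0f - evenFlag;
    const float weight[5] = {
        evenFlag * 0.0625f,
        oddFlag  * 0.25f,
        evenFlag * 0.375f,
        oddFlag  * 0.25f,
        evenFlag * 0.0625f
    };
    float sum = 0.0f;
    for (int k = 0; k < taps; ++k) {
        /* Same index arithmetic as the kernel: offsets -2 .. +2. */
        sum += weight[k] * row[1 + ((tidx + (k - 2)) / 2)];
    }
    return sum;
}
```

      Under these assumptions operation 5 carries a zero weight for odd tidx, so dropping it can only change the result for even tidx, which includes the get_local_id(0) == 0 case named in the conditions (if tidx is get_local_id(0)).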

       

       

      __local float s_srcPatch[10][10];
      __local float s_dstPatch[20][16];
      float sum;

      // get_local_size(1) == 16, so the row index below evaluates to 0 and this has the same effect as the code segment above
      sum =       (evenFlag * 0.0625f) * s_srcPatch[get_local_size(1) - 16][1 + ((tidx - 2) / 2)];
      sum = sum + ( oddFlag * 0.25f  ) * s_srcPatch[get_local_size(1) - 16][1 + ((tidx - 1) / 2)];
      sum = sum + (evenFlag * 0.375f ) * s_srcPatch[get_local_size(1) - 16][1 + ((tidx    ) / 2)];
      sum = sum + ( oddFlag * 0.25f  ) * s_srcPatch[get_local_size(1) - 16][1 + ((tidx + 1) / 2)];
      sum = sum + (evenFlag * 0.0625f) * s_srcPatch[get_local_size(1) - 16][1 + ((tidx + 2) / 2)];

      s_dstPatch = sum;

       

       

      If I replace the constant row index with the expression "get_local_size(1) - 16", everything works and I get the correct sum.