4 Replies Latest reply on May 25, 2013 7:16 PM by twintip31

    Bitwise OR operation failing when compiling for Tahiti GPU

    twintip31

      Hi,

       

      I am currently coding part of MD5 algorithm inside a kernel code and I encountered a strange issue and even if I found some workaround for that, I would like to understand the rootcause...

       

      I tried to isolate the problem in a simpler kernel code, but the issue "obviously" is not occuring with less code before ...

      I tried also to compare assembler code (using CodeXL Kernel analyser code generator) but as I have not the Tahiti assembler instructions documented, I cannot really make any conclusion ...

       

      So here is the full kernel code:

      __kernel void digest_F_compose(__constant uint16 *X, __global uint4 *r1234)

      {

          private uint a,b,c,d;

          private uint FF;

          private uint16 local_X;

          private uint4 local_r1234 = r1234[0];

          local_r1234 = (uint4)(0x67452301,0xefcdab89,0x98badcfe,0x10325476);

          a=local_r1234.s0;

          b=local_r1234.s1;

          c=local_r1234.s2;

          d=local_r1234.s3;

          local_X=X[0];

          FF = (b & c) | (~b & d);

          a+=FF+(local_X.s0)+(0xd76aa478);

          a=rotate(a,(uint)7);

          a+=b;

          FF = (a & b) | (~a & c);

          d+=FF+(local_X.s1)+(0xe8c7b756);

          d=rotate(d,(uint)12);

          d+=a;

          FF = (d & a) | (~d & b);

          c+=FF+(local_X.s2)+(0x242070db);

          c=rotate(c,(uint)17);

          c+=d;

          FF = (c & d) | (~c & a);

          r1234[0].s0=a;

          r1234[0].s1=FF;

          r1234[0].s2=c;

          r1234[0].s3=d;

      }

       

      After last RED line (FF last computing), here are the values I get for a,FF,c,d private variables:

      a  =0x5F50992F

      FF=0x7F739B7F

      c  =0x26739273

      d  =0x13021920

      (note that my initial FF value expectation is 0x5B02192C ....)

       

      At the end when the error occurs, the last FF result is NOT equal to expected result from formula (c & d) | (~c & a)

      I suspect the compiler is not working fine and ORing 2 incorrect registers as curiously (a | c) is = to my wrong value of FF here.....!!!!

       

      I tried as a workaround to replace the last FF computation by the equivalent formula : FF = ~(~((c) & (d)) & ~((~c) & (a))); and surprisingly it gave me the correct expected result of initial FF formula !!!

       

      So it tells me I could have discovered an OpenCL compiler issue because at the end GPU is able itself to compute the final values when I change expression of FF ......

       

      For information, my GPU is Tahiti (Radeon HD7950) and my SDK is AMD APP 2.8 release. Catalyst driver version is 13.4

       

      Please advice ....