cancel
Showing results for 
Search instead for 
Did you mean: 

Archives Discussions

twintip31
Adept I

Bitwise OR operation failing when compiling for Tahiti GPU

Hi,

I am currently coding part of MD5 algorithm inside a kernel code and I encountered a strange issue and even if I found some workaround for that, I would like to understand the rootcause...

I tried to isolate the problem in a simpler kernel code, but the issue "obviously" is not occuring with less code before ...

I tried also to compare assembler code (using CodeXL Kernel analyser code generator) but as I have not the Tahiti assembler instructions documented, I cannot really make any conclusion ...

So here is the full kernel code:

__kernel void digest_F_compose(__constant uint16 *X, __global uint4 *r1234)

{

    private uint a,b,c,d;

    private uint FF;

    private uint16 local_X;

    private uint4 local_r1234 = r1234[0];

    local_r1234 = (uint4)(0x67452301,0xefcdab89,0x98badcfe,0x10325476);

    a=local_r1234.s0;

    b=local_r1234.s1;

    c=local_r1234.s2;

    d=local_r1234.s3;

    local_X=X[0];

    FF = (b & c) | (~b & d);

    a+=FF+(local_X.s0)+(0xd76aa478);

    a=rotate(a,(uint)7);

    a+=b;

    FF = (a & b) | (~a & c);

    d+=FF+(local_X.s1)+(0xe8c7b756);

    d=rotate(d,(uint)12);

    d+=a;

    FF = (d & a) | (~d & b);

    c+=FF+(local_X.s2)+(0x242070db);

    c=rotate(c,(uint)17);

    c+=d;

    FF = (c & d) | (~c & a);

    r1234[0].s0=a;

    r1234[0].s1=FF;

    r1234[0].s2=c;

    r1234[0].s3=d;

}

After last RED line (FF last computing), here are the values I get for a,FF,c,d private variables:

a  =0x5F50992F

FF=0x7F739B7F

c  =0x26739273

d  =0x13021920

(note that my initial FF value expectation is 0x5B02192C ....)

At the end when the error occurs, the last FF result is NOT equal to expected result from formula (c & d) | (~c & a)

I suspect the compiler is not working fine and ORing 2 incorrect registers as curiously (a | c) is = to my wrong value of FF here.....!!!!

I tried as a workaround to replace the last FF computation by the equivalent formula : FF = ~(~((c) & (d)) & ~((~c) & (a))); and surprisingly it gave me the correct expected result of initial FF formula !!!

So it tells me I could have discovered an OpenCL compiler issue because at the end GPU is able itself to compute the final values when I change expression of FF ......

For information, my GPU is Tahiti (Radeon HD7950) and my SDK is AMD APP 2.8 release. Catalyst driver version is 13.4

Please advice ....

0 Likes
3 Replies
nou
Exemplar

this is known bug. replace it with bitselect function http://www.khronos.org/registry/cl/sdk/1.0/docs/man/xhtml/bitselect.html

twintip31
Adept I

Thanks a lot nou !

It also double confirm a test I did today using a RV710 GPU (HD4xxx)+open CL 1.0 compiler + Catalyst 12.10 => my original code worked on this GPU... so indeed new 13.4 is buggy with HD79xx ......

I had occasion to test the bitselect and it worked really fine !