
From a CUDA User

Discussion created by adminc on Jul 12, 2009
Latest reply on Jul 13, 2009 by adminc
ATI needs to fix this

I have recently been trying to port one of my CUDA programs to Brook+. I must say the process has been extremely painful, because the Brook+ compiler is extremely picky. Even completely valid C code that uses preprocessor macros refuses to compile inside my kernels, and the compiler gives no helpful error messages.

 

For example, in my program I am trying to write a simple hash bruteforcer.  The first bit of my code looks like this:

#define F1(x, y, z) (z ^ (x & (y ^ z)))
#define F2(x, y, z) F1(z, x, y)
#define F3(x, y, z) (x ^ y ^ z)
#define F4(x, y, z) (y ^ (x | ~z))

#define MD5STEP(f, w, x, y, z, data, s) \
    ( w += f(x, y, z) + data, w &= 0xffffffff, w = w<<s | w>>(32-s), w += x )

#define rol(value, bits) (((value) << (bits)) | ((value) >> (32 - (bits))))

#define W(i) W##i


uint64 plainToHash(uint32 x0, uint32 x1, uint32 x2, uint32 size)
{
    uint32 a, b, c, d;

    uint32 W0, W1, W2, W3, W4, W5, W6, W7, W8, W9, W10, W11, W12, W13, W14, W15;

    a = 0x67452301; b = 0xEFCDAB89; c = 0x98BADCFE;  d = 0x10325476;

    W0  = x0;
    W1  = x1;
    W2  = x2;
    W3  = 0;
    W4  = 0;
    W5  = 0;
    W6  = 0;
    W7  = 0;
    W8  = 0;
    W9  = 0;
    W10 = 0;
    W11 = 0;
    W12 = 0;
    W13 = 0;
    W14 = size<<3;
    W15 = 0;


    MD5STEP(F1, a, b, c, d, W( 0)+0xd76aa478,  7);
    MD5STEP(F1, d, a, b, c, W( 1)+0xe8c7b756, 12);
    MD5STEP(F1, c, d, a, b, W( 2)+0x242070db, 17);
    MD5STEP(F1, b, c, d, a, W( 3)+0xc1bdceee, 22);
    MD5STEP(F1, a, b, c, d, W( 4)+0xf57c0faf,  7);
    MD5STEP(F1, d, a, b, c, W( 5)+0x4787c62a, 12);
    MD5STEP(F1, c, d, a, b, W( 6)+0xa8304613, 17);
    MD5STEP(F1, b, c, d, a, W( 7)+0xfd469501, 22);
    MD5STEP(F1, a, b, c, d, W( 8)+0x698098d8,  7);
    MD5STEP(F1, d, a, b, c, W( 9)+0x8b44f7af, 12);
    MD5STEP(F1, c, d, a, b, W(10)+0xffff5bb1, 17);
    MD5STEP(F1, b, c, d, a, W(11)+0x895cd7be, 22);
    MD5STEP(F1, a, b, c, d, W(12)+0x6b901122,  7);
    MD5STEP(F1, d, a, b, c, W(13)+0xfd987193, 12);
    MD5STEP(F1, c, d, a, b, W(14)+0xa679438e, 17);
    MD5STEP(F1, b, c, d, a, W(15)+0x49b40821, 22);

 

 

All of this is valid C and works in my CUDA program when I compile it. Brook+ throws a fit and refuses to compile any of it.

 

I even tried preprocessing the code with a standard C compiler and using the output as the Brook+ source, but that didn't work either.
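For anyone following along, here is a rough sketch in plain C of what one F1 step looks like once the macros are written out by hand, checked against the macro version from above. The names md5_step_f1 and check_first_step and the sample input word are made up for illustration and are not part of my program. Whether Brook+ accepts code in this form is untested here.

```c
#include <assert.h>
#include <stdint.h>

/* Macros copied from the post. */
#define F1(x, y, z) (z ^ (x & (y ^ z)))
#define MD5STEP(f, w, x, y, z, data, s) \
    ( w += f(x, y, z) + data, w &= 0xffffffff, w = w<<s | w>>(32-s), w += x )

/* Hypothetical helper (not in the original program): one F1 round step
   written out by hand, with no macros. The "& 0xffffffff" mask is dropped
   because it is a no-op on a 32-bit unsigned value. */
static uint32_t md5_step_f1(uint32_t w, uint32_t x, uint32_t y, uint32_t z,
                            uint32_t data, uint32_t s)
{
    w += (z ^ (x & (y ^ z))) + data;   /* F1(x, y, z) + data */
    w = (w << s) | (w >> (32u - s));   /* rotate left by s bits, 0 < s < 32 */
    w += x;
    return w;
}

/* Check that the hand expansion agrees with the macro for the first step. */
static void check_first_step(void)
{
    uint32_t a = 0x67452301u, b = 0xEFCDAB89u, c = 0x98BADCFEu, d = 0x10325476u;
    uint32_t W0 = 0x61626364u;         /* arbitrary sample input word */
    uint32_t expected = md5_step_f1(a, b, c, d, W0 + 0xd76aa478u, 7);

    MD5STEP(F1, a, b, c, d, W0 + 0xd76aa478u, 7);
    assert(a == expected);
}
```

Calling check_first_step() from a test harness confirms that the two forms produce the same value, so the macros themselves are doing nothing unusual.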

 

I really don't want to start developing programs for ATI products if the compiler can't even preprocess correctly. I am disappointed in Brook+, since I own an ATI GPU and cannot even program it. Please consider fixing the little things before the big things.

 
