I have recently been trying to port one of my CUDA programs to Brook+. I must say that the process has been extremely painful, since the Brook+ compiler is extremely picky. Even completely valid C code with preprocessor macros refuses to compile inside my kernels, and the error messages are not helpful.
For example, in my program I am trying to write a simple hash brute-forcer. The first bit of my code looks like this:
#define F1(x, y, z) (z ^ (x & (y ^ z)))
#define F2(x, y, z) F1(z, x, y)
#define F3(x, y, z) (x ^ y ^ z)
#define F4(x, y, z) (y ^ (x | ~z))
#define MD5STEP(f, w, x, y, z, data, s) \
( w += f(x, y, z) + data, w &= 0xffffffff, w = w<<s | w>>(32-s), w += x )
#define rol(value, bits) (((value) << (bits)) | ((value) >> (32 - (bits))))
#define W(i) W##i
uint64 plainToHash(uint32 x0, uint32 x1, uint32 x2, uint32 size)
uint32 a, b, c, d;
uint32 W0, W1, W2, W3, W4, W5, W6, W7, W8, W9, W10, W11, W12, W13, W14, W15;
a = 0x67452301; b = 0xEFCDAB89; c = 0x98BADCFE; d = 0x10325476;
W0 = x0;
W1 = x1;
W2 = x2;
W3 = 0;
W4 = 0;
W5 = 0;
W6 = 0;
W7 = 0;
W8 = 0;
W9 = 0;
W10 = 0;
W11 = 0;
W12 = 0;
W13 = 0;
W14 = size<<3;
W15 = 0;
MD5STEP(F1, a, b, c, d, W( 0)+0xd76aa478, 7);
MD5STEP(F1, d, a, b, c, W( 1)+0xe8c7b756, 12);
MD5STEP(F1, c, d, a, b, W( 2)+0x242070db, 17);
MD5STEP(F1, b, c, d, a, W( 3)+0xc1bdceee, 22);
MD5STEP(F1, a, b, c, d, W( 4)+0xf57c0faf, 7);
MD5STEP(F1, d, a, b, c, W( 5)+0x4787c62a, 12);
MD5STEP(F1, c, d, a, b, W( 6)+0xa8304613, 17);
MD5STEP(F1, b, c, d, a, W( 7)+0xfd469501, 22);
MD5STEP(F1, a, b, c, d, W( 8)+0x698098d8, 7);
MD5STEP(F1, d, a, b, c, W( 9)+0x8b44f7af, 12);
MD5STEP(F1, c, d, a, b, W(10)+0xffff5bb1, 17);
MD5STEP(F1, b, c, d, a, W(11)+0x895cd7be, 22);
MD5STEP(F1, a, b, c, d, W(12)+0x6b901122, 7);
MD5STEP(F1, d, a, b, c, W(13)+0xfd987193, 12);
MD5STEP(F1, c, d, a, b, W(14)+0xa679438e, 17);
MD5STEP(F1, b, c, d, a, W(15)+0x49b40821, 22);
All of this is valid C and works in my CUDA program when I compile it. Brook+ throws a fit and refuses to compile any of it.
I even tried preprocessing the code with a standard C compiler and using the output as the source, but that didn't work either.
I really don't want to start developing programs for ATI products if they can't even preprocess correctly. I am disappointed in Brook+, since I own an ATI GPU and I cannot even program with it. Please consider fixing the little things before the big things.
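For anyone who wants to test whether the macros themselves are the problem, here is a minimal sketch in plain C of one F1 round step written as an ordinary function, checked against the MD5STEP macro from above. The names md5_step_f1 and md5_step_matches_macro are made up for illustration, and I have not confirmed that the Brook+ kernel compiler accepts this form; it only shows that the macro-free version computes the same value on the host.

```c
#include <assert.h>
#include <stdint.h>

/* Macros copied from the post, used here only as the reference. */
#define F1(x, y, z) (z ^ (x & (y ^ z)))
#define MD5STEP(f, w, x, y, z, data, s) \
    ( w += f(x, y, z) + data, w &= 0xffffffff, w = w<<s | w>>(32-s), w += x )

/* Hypothetical helper (not from the post): one F1 round step as a function. */
static uint32_t md5_step_f1(uint32_t w, uint32_t x, uint32_t y, uint32_t z,
                            uint32_t data, uint32_t s)
{
    w += (z ^ (x & (y ^ z))) + data;   /* F1(x, y, z) + data, modulo 2^32 */
    w = (w << s) | (w >> (32 - s));    /* rotate left by s bits, 0 < s < 32 */
    return w + x;
}

/* Returns 1 if the function agrees with the macro for the first MD5 step. */
static int md5_step_matches_macro(void)
{
    uint32_t a = 0x67452301, b = 0xEFCDAB89, c = 0x98BADCFE, d = 0x10325476;
    uint32_t W0 = 0x00008061;          /* arbitrary example input word */
    uint32_t expected = a;
    uint32_t got;

    MD5STEP(F1, expected, b, c, d, W0 + 0xd76aa478, 7);
    got = md5_step_f1(a, b, c, d, W0 + 0xd76aa478, 7);

    assert(got == expected);
    return got == expected;
}
```

The call site would then read a = md5_step_f1(a, b, c, d, W0 + 0xd76aa478, 7); instead of using the macro, at the cost of spelling out the assignment on every line.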
Originally posted by: genaganna
Use uint instead of uint32.
64-bit integers are not supported yet in Brook+.
Hmmm. I have:
typedef unsigned int uint32;
typedef unsigned long long int uint64;
Technically, CUDA doesn't have 64-bit datatypes either, but it works with them anyway, like a good C compiler should. Again, it's the little things that need to be fixed.
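If 64-bit integers really are unavailable, one possible workaround is to carry the result as two 32-bit words instead of a uint64. Here is a rough sketch in plain C; hash_pair and its helpers are hypothetical names, and I am assuming (without having checked) that a pair of 32-bit values is something the Brook+ kernel side can return. The 64-bit conversion at the end is meant for host code only.

```c
#include <stdint.h>

typedef unsigned int uint;   /* the type name suggested in the reply */

/* Hypothetical container: a 64-bit value split into two 32-bit halves. */
typedef struct {
    uint lo;   /* low 32 bits  */
    uint hi;   /* high 32 bits */
} hash_pair;

/* Build the pair from two 32-bit words (e.g. two of a, b, c, d). */
static hash_pair make_hash_pair(uint lo, uint hi)
{
    hash_pair p;
    p.lo = lo;
    p.hi = hi;
    return p;
}

/* Compare two pairs using only 32-bit operations. */
static int hash_pair_equal(hash_pair p, hash_pair q)
{
    return p.lo == q.lo && p.hi == q.hi;
}

/* Host-side only: recombine the halves where 64-bit integers exist. */
static uint64_t hash_pair_to_u64(hash_pair p)
{
    return ((uint64_t)p.hi << 32) | (uint64_t)p.lo;
}
```

Comparing against a target hash then needs only hash_pair_equal, so the 64-bit type never has to appear in the kernel.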
@Raistmer: Not being able to copy from a PDF? Again, little things like that. It's very convenient not to have to make an account to download the Brook SDK. CUDA makes everything completely public. ATI, you should take a hint.