Hello everyone!
While getting into Brook+ GPU programming, I ran into a weird error with array indexing. Here's my test kernel:
kernel void testkernel(uint2 ker_in<>, uint ker_key[2], uint ker_s87[256], out uint2 ker_out<>)
{
    uint ker_n1, ker_n2, t, z, mhm;
    uint2 test;
    ker_n1 = (uint)ker_in.x;
    ker_n2 = (uint)ker_in.y;
    t = (ker_n1+ker_key[0]) & (uint)0xFFFFFFFF;
    mhm=t>>(uint)24 &(uint)255;
    z = ker_s87[mhm];
    test.x = z;
    test.y = mhm;
    ker_out = (uint2)test;
}
Everything works okay except for the highlighted part, the lookup z = ker_s87[mhm];
Somehow "z" always comes back as zero.
"mhm" itself returns proper values, from 0 to 255, and if I hard-code the index in the "z" expression, for example:
z = ker_s87[158];
I get the proper result, with "z" returning the expected element.
I can't really understand what's wrong here and why it doesn't work properly. 😞
Thanks in advance for any ideas or suggestions!
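For checking which values the kernel should produce, a plain C reference of the same per-element computation can be run on the host and compared with the GPU output. This is only a sketch: the names testkernel_ref and uint2_ref are hypothetical stand-ins and not part of Brook+, and it assumes uint is a 32-bit unsigned integer. It does not reproduce the GPU problem, it only gives the expected numbers.

```c
#include <stdint.h>

/* Hypothetical host-side stand-in for the Brook+ uint2 type. */
typedef struct { uint32_t x, y; } uint2_ref;

/* CPU reference for one element of testkernel (assumes 32-bit uint). */
uint2_ref testkernel_ref(uint2_ref in, const uint32_t key[2],
                         const uint32_t s87[256])
{
    uint2_ref out;
    /* Unsigned addition wraps modulo 2^32, so the & 0xFFFFFFFF mask
       in the kernel does not change the value. */
    uint32_t t = in.x + key[0];
    /* In C, >> binds tighter than &, so this is (t >> 24) & 255:
       the top byte of t, always in the range 0..255. */
    uint32_t mhm = (t >> 24) & 255u;
    out.x = s87[mhm];   /* table lookup with a computed index */
    out.y = mhm;
    return out;
}
```

If the GPU returns the same mhm as this reference but a zero in test.x, the computed index is fine and the problem is in the indexed read of ker_s87 itself.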
Originally posted by: gaurav.garg
unsigned int dimc[] = {height,width};
When using the C++ style constructor, it has to be:
unsigned int dimc[] = {width, height};
This is actually quite confusing (and not really documented?), because the stream is indexed and accessed (for instance as a gather input) like a C array, which is declared the other way around.
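To make the two orderings concrete, here is a small plain C sketch. It does not call any Brook+ API; make_stream_dims and row_major_offset are hypothetical helper names, and the {width, height} order is taken only from the statement in this thread.

```c
#include <stddef.h>

/* Hypothetical helper: starting from the shape of a C array declared as
   a[height][width], fill a dimension array in the {width, height} order
   that this thread says the C++ style constructor expects. */
void make_stream_dims(unsigned int height, unsigned int width,
                      unsigned int dims[2])
{
    dims[0] = width;   /* column extent (fastest-varying) first */
    dims[1] = height;  /* row extent second */
}

/* Row-major offset of a[row][col] in a C array declared a[height][width]. */
size_t row_major_offset(unsigned int row, unsigned int col,
                        unsigned int width)
{
    return (size_t)row * width + col;
}
```

So an array declared as a[4][8] in C would get dims {8, 4}, while an element is still addressed row first, as in a[2][3].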
Btw., my code now also runs on the GPU. It is really fast: it reaches about 120 GFlop/s (double precision) on an HD4870. That is quite close to the theoretical maximum for the instruction mix (there is not much MAD_64 in there, and Brook does not generate any, presumably for precision reasons?). But it would have been far easier if some standard functions were also supported for doubles. I had to build my own exp and sqrt versions for doubles in IL. The rest of it was written in Brook.
Sorry, I didn't mention that instance() returns a short vector index, similar to indexof().
So instance().x is the column index, instance().y is the row index, and so on.
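As a plain C illustration of that x = column, y = row convention, the sketch below maps a linear row-major position to such a pair. The index2 type and index_from_linear function are hypothetical and only mimic the convention described here; they are not the Brook+ implementation of instance() or indexof().

```c
#include <assert.h>

/* Hypothetical 2-component index, mirroring the .x / .y convention above. */
typedef struct { unsigned int x, y; } index2;

/* Map a linear row-major position to (column, row) for a given width. */
index2 index_from_linear(unsigned int linear, unsigned int width)
{
    index2 idx;
    assert(width > 0);        /* guard against division by zero */
    idx.x = linear % width;   /* column index */
    idx.y = linear / width;   /* row index */
    return idx;
}
```

With a width of 8, linear position 19 lands in column 3 of row 2.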