yarr
Journeyman III

Could use some help with arrays

weird index behavior

Hello everyone!

While getting into Brook+ GPU programming, I ran into strange behavior with array indexes. Here's my test kernel:

kernel void testkernel(uint2 ker_in<>, uint ker_key[2], uint ker_s87[256], out uint2 ker_out<>)
{
 uint ker_n1, ker_n2, t, z, mhm;
 uint2 test;
 
 ker_n1 = (uint)ker_in.x;
 ker_n2 = (uint)ker_in.y;

 t = (ker_n1+ker_key[0]) & (uint)0xFFFFFFFF;
 mhm=t>>(uint)24 &(uint)255;
 z = ker_s87[mhm];

 test.x = z;
 test.y = mhm;
 ker_out = (uint2)test;
}

Everything works okay, except for the highlighted part (the line z = ker_s87[mhm];).
Somehow "z" just returns zero.

"mhm" itself returns proper values, from 0 to 255. Also, if I manually substitute a constant index in the "z" expression, for example:

z = ker_s87[158];

I get the proper result, with z returning the expected element.

I can't really understand what's wrong here or why it doesn't work properly. 😞
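
For comparing against the GPU output, the kernel's arithmetic can be mirrored on the host. The sketch below is plain C++, not Brook+. The names `UInt2` and `reference_testkernel` are hypothetical, and it assumes 32-bit unsigned arithmetic, as in the kernel above. It only shows what z and mhm should be for one input element; it does not explain why the GPU lookup returns zero.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for Brook+'s uint2 on the host side.
struct UInt2 { std::uint32_t x, y; };

// Host-side mirror of testkernel for a single input element.
UInt2 reference_testkernel(UInt2 in,
                           const std::array<std::uint32_t, 2>& key,
                           const std::array<std::uint32_t, 256>& s87)
{
    // Unsigned 32-bit addition wraps around, so the 0xFFFFFFFF mask
    // in the kernel does not change the value.
    std::uint32_t t = in.x + key[0];

    // >> binds tighter than &, so this matches "t>>24 & 255" in the kernel.
    std::uint32_t mhm = (t >> 24) & 255u;
    assert(mhm < s87.size());

    // test.x = z (the table lookup), test.y = mhm (the index used).
    return UInt2{ s87[mhm], mhm };
}
```

If the host result for a given element has a nonzero x while the kernel output has zero, the difference is in the indexed lookup and not in how mhm is computed.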

Thanks in advance for any ideas or suggestions!

43 Replies

Gaurav,

So, I finally got this working (I think, lol).

All I had to do was switch the instance() calls, for example:

FROM:
x = instance().x;
y = instance().y;

TO:
x = instance().y;
y = instance().x;

Everything else (all the other code, the gather array accesses, etc.) was left the same way, like F. I didn't need to change them to F.

I honestly hope this saves some people time in the future.

Thanks for all the help gaurav, I really appreciate it!

I also understand that this can probably be worked out by combing through the SDK examples, but it would be nice to have it documented somewhere too (if it isn't already). Thanks.

Originally posted by: gaurav.garg
unsigned int dimc[] = {height,width};

When using the C++-style constructor, it has to be:

unsigned int dimc[] = {width, height};





Which is actually quite confusing (and not really documented?) as it is indexed and accessed (for instance as a gather input) like a C array which is declared just the other way around.
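
To make the mismatch concrete, here is a minimal host-side sketch in plain C++. It is not the Brook+ API: `Grid2D` is a hypothetical type that takes its dimensions in the {width, height} order quoted above, while element access follows the [row][column] order of a C array declared as a[height][width]. Row-major storage is an assumption of the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 2D container, used only to illustrate the two orderings.
struct Grid2D {
    unsigned width;
    unsigned height;
    std::vector<float> data;

    // Dimensions arrive as {width, height}, as in the quoted dimc[] line.
    explicit Grid2D(const unsigned (&dims)[2])
        : width(dims[0]), height(dims[1]),
          data(static_cast<std::size_t>(dims[0]) * dims[1], 0.0f) {}

    // Access is [row][column], like a C array declared a[height][width].
    float& at(unsigned row, unsigned col) {
        assert(row < height && col < width);
        // Row-major layout: a full row of `width` elements per step in row.
        return data[static_cast<std::size_t>(row) * width + col];
    }
};
```

With dims = {4, 3}, the valid accesses run up to at(2, 3): the first index is bounded by the height, which was passed second.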

By the way, I now have my code running on the GPU as well. It is really fast: about 120 GFLOP/s (double precision) on an HD4870, which is quite close to the theoretical maximum for the instruction mix (there is not much MAD_64 in there, and Brook does not generate any, presumably for precision reasons?). But it would have been far easier if some standard functions were also supported for doubles. I had to build my own exp and sqrt versions for doubles in IL. The rest of it was written in Brook.


Sorry, I didn't mention that instance() returns a short vector index, similar to indexof().

So, instance().x is column index, instance().y is row index and so on.
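
As a rough illustration of that convention, the following plain C++ sketch emulates a 2D kernel domain on the CPU. It does not use Brook+ itself: `Index2`, `gather_offset` and `scale_image` are hypothetical names, and the image is assumed to be stored row-major in a flat buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the index one kernel instance would see.
struct Index2 { int x; int y; };

// Flat offset of img[idx.y][idx.x] in a row-major image of the given width:
// y selects the row, x selects the column within that row.
std::size_t gather_offset(Index2 idx, int width)
{
    return static_cast<std::size_t>(idx.y) * width + idx.x;
}

// Scales every element, visiting the image the way a 2D domain would:
// x runs over columns (0..width-1), y runs over rows (0..height-1).
std::vector<float> scale_image(const std::vector<float>& img,
                               int width, int height, float factor)
{
    assert(img.size() == static_cast<std::size_t>(width) * height);
    std::vector<float> out(img.size());
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            std::size_t off = gather_offset(Index2{x, y}, width);
            out[off] = img[off] * factor;
        }
    }
    return out;
}
```

On a non-square image, swapping x and y in gather_offset yields a different offset, which is why the ordering matters as soon as width and height differ.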

 

dukeleto
Adept I

I can confirm that in the examples you can find lines like the following (this one extracted from ImageFilter.br):


int jj = instance().x;
int ii = instance().y;

// These are the offsets so no looping is needed

o_img = img[ii][jj] * mask[0][0];



Personally I find this unnerving! It works, but seeing reference code with something like i=instance().y; j=instance().x is really disconcerting.

Regards