Gipsel

Weird behaviour when kernels using constant buffers and normal kernels are declared in the same .br file

Discussion created by Gipsel on Feb 2, 2009
Latest reply on Aug 27, 2009 by Gipsel
Brook mixes up the access to constant buffers

I have been seeing strange behaviour in my kernels since I changed from a long list of input arguments to constant buffers (I have several quite similar kernels that differ only in the number of input parameters). I've looked at the Brook-generated .cpp file and also at the generated IL code, and it appears to me that Brook mixes up the access to the constant buffers.

As an example I have the kernels (quite long definitions already, so you may understand why I want to use the constant arrays):

kernel void test2(double a[2][3], double b[2][3], double c[2], double d1, double d2, double d3, double d4, int n, double2 g1[], double g2[], double2 g3[], double2 g4[][], out double2 out1<>, out double out2<>);

kernel void test3(double a[3][3], double b[3][3], double c[3], double d1, double d2, double d3, double d4, int n, double2 g1[], double g2[], double2 g3[], double2 g4[][], out double2 out1<>, out double2 out2<>);

In the Brook-generated .cpp file I see that the arguments are pushed in the same order they are declared. That means constant_0 is created with PushConstantBuffer from a[2][3] or a[3][3], respectively, constant_1 is the b array, and constant_2 is the c array. The other arguments are pushed with PushConstant, also in the order they are declared.

Looking at the generated IL code, the constant buffers are used as if array a were cb0[], array b were cb1[], and array c were cb2[]. So far so good.

But unfortunately the constant buffers in IL are declared either as

dcl_cb cb0[5] // should be cb3
dcl_cb cb1[6] // should be cb0 or okay
dcl_cb cb2[6] // should be cb1 or cb0
dcl_cb cb3[2] // should be cb2

or as

dcl_cb cb0[9] // okay
dcl_cb cb1[5] // should be cb3
dcl_cb cb2[3] // okay
dcl_cb cb3[9] // should be cb1

The 5 normal arguments are used as if they sat in cb0[] or cb1[] (the buffer with 5 elements, indexed in the order of declaration). I don't see a pattern in it; it appears to be random and of course gives wrong results. Is this a known problem, or am I doing something wrong?
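To make the mismatch explicit, here is a minimal sketch of a check that compares the dcl_cb lines quoted above with the sizes one would expect from the push order. It is not part of Brook+; the helper names (declared_sizes, mismatches) are made up for illustration. It assumes that each double or int occupies one constant-buffer element, that the five scalar arguments end up together in a fourth buffer (cb3), and that the first listing belongs to test2 and the second to test3, which the element counts suggest but the generated code does not state.

```python
import re

# Hypothetical helper, not part of Brook+: compares the dcl_cb lines in the
# generated IL with the sizes expected from the order the arguments are pushed.
DCL_CB = re.compile(r"dcl_cb\s+cb(\d+)\[(\d+)\]")


def declared_sizes(il_text):
    """Map constant-buffer index -> declared element count."""
    return {int(index): int(size) for index, size in DCL_CB.findall(il_text)}


def mismatches(il_text, expected):
    """Return (index, expected_size, declared_size) for every buffer that differs."""
    declared = declared_sizes(il_text)
    return [(index, want, declared.get(index))
            for index, want in enumerate(expected)
            if declared.get(index) != want]


# Expected sizes in push order (assumption: one element per double/int):
# array a, array b, array c, then the five scalars d1..d4 and n.
EXPECTED_TEST2 = [6, 6, 2, 5]  # a[2][3], b[2][3], c[2], scalars
EXPECTED_TEST3 = [9, 9, 3, 5]  # a[3][3], b[3][3], c[3], scalars

# The two declaration variants quoted in the post.
IL_FIRST = """
dcl_cb cb0[5]
dcl_cb cb1[6]
dcl_cb cb2[6]
dcl_cb cb3[2]
"""

IL_SECOND = """
dcl_cb cb0[9]
dcl_cb cb1[5]
dcl_cb cb2[3]
dcl_cb cb3[9]
"""

if __name__ == "__main__":
    print(mismatches(IL_FIRST, EXPECTED_TEST2))
    print(mismatches(IL_SECOND, EXPECTED_TEST3))
```

Under these assumptions the first variant has three wrongly sized declarations (cb0, cb2, cb3), while in the second only cb1 and cb3 are swapped.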

Should I try to change the generated IL so it fits the order the arguments are pushed?
