10 Replies Latest reply on Dec 23, 2008 1:22 PM by gaurav.garg

    Constant buffers naming convention in Brook kernel generated IL

    jean-claude
      using brook generated IL in CAL

      Hi Gaurav,

      If my understanding is correct there are 15 constant buffers,
      which can contain up to 1024*4 elements.

      The naming goes from cb0 to cb14.

      So consider the following kernel

      kernel void k_trial(out float A<>, float B0<>, float B1<>, float C0, float C1) {
          A = B0*C0 + B1*C1;
      }

      To use the generated IL in CAL, I would assume the following naming for binding:

      output           A  <=>  o0

      inputs         B0 <=>  i0        B1 <=>  i1

      constants     C0 <=>  cb0      C1 <=>  cb1

      That sounds fine for inputs & outputs, but for constants it seems that they
      are both folded into cb0 through:

      dcl_cb cb0[2]



      Questions:

      (1) so what are the variable names for C0 and C1 ?

          CALname C0_name = 0;
          calModuleGetName(&C0_name,  ctx, module_k_trial, "???");
          calCtxSetMem(ctx, C0_name, C0_Mem);

          CALname C1_name = 0;
          calModuleGetName(&C1_name,  ctx, module_k_trial, "???");
          ...


      (2) is this to say that for n constants the generated IL would be dcl_cb cb0[n]?
          but then when are cb1, ..., cb14 used?

       

      Thanks for some hints.

      Jean-Claude

        • Constant buffers naming convention in Brook kernel generated IL
          gaurav.garg

          Hi Jean-Claude,

          All the constants declared are combined into a single constant buffer, so you need to bind this data to a single constant buffer. When you allocate the data, each constant has to be 128-bit aligned (always allocate a *_4 CAL format), because CAL has 128-bit alignment requirements for constants.

          calResAlloc*(, , 2, CAL_FORMAT_FLOAT_4, ); // Resource of Width 2 (for 2 constants)

          calResMap(&ptr, constRes, );
          ptr[0] = firstConstant;
          ptr[4] = secondConstant; // Write after the 128 bits assigned to the first constant

          You should bind this resource to "cb0". Hope it helps.

            • Constant buffers naming convention in Brook kernel generated IL
              jean-claude

              Hey thanks,

              So just to see if my understanding is correct:


              kernel void k_trial(out float A<>, float B0<>, float B1<>, float C0, float C1, float C2 ) {
                 A = B0*C0 + B1*C1 - C2;
              }


              // Allocate 4 float constants
              // Note: Resource width is set to 1 (equivalent of 4 float constants) (is this safe?)
              // ----------------------------------------------------------------------------------
              CALresource constants_Res;
              calResAllocLocal1D(&constants_Res, device, 1, CAL_FORMAT_FLOAT_4, 0);

              // Set constant values
              // -------------------
              calResMap(&ptr, constants_Res);
              ptr[0] = Constant_0;
              ptr[1] = Constant_1;
              ptr[2] = Constant_2;
              ptr[3] = Constant_3; // won't be used
              calResUnmap(constants_Res);


              // Binding to ctx
              // --------------
              CALmem constants_Mem = 0;
              calCtxGetMem(&constants_Mem, ctx, constants_Res);


              // Binding to kernel constant pin
              // --------------------------------------
              CALname constants_name = 0;
              calModuleGetName(&constants_name,  ctx, module_k_trial, "cb0");
              calCtxSetMem(ctx, constants_name, constants_Mem);

              ...

              then execute kernel

               

              Right?

                • Constant buffers naming convention in Brook kernel generated IL
                  gaurav.garg

                  The resource width should be the same as the number of constants. Also, notice the indices used when writing through the mapped pointer.

                  CALresource constants_Res;
                  calResAllocLocal1D(&constants_Res, device, 3, CAL_FORMAT_FLOAT_4, 0);

                  // Set constant values
                  // -------------------
                  calResMap(&ptr, constants_Res);
                  ptr[0] = Constant_0;
                  ptr[4] = Constant_1;
                  ptr[8] = Constant_2;
                  calResUnmap(constants_Res);
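
The indices 0, 4, 8 above follow from each constant occupying one 128-bit slot, i.e. four floats, when the mapped pointer is a float pointer. A minimal sketch in plain C, with an ordinary array standing in for the mapped CAL resource; the helper name pack_scalar_constants is hypothetical and not part of CAL, and zero-filling the unused lanes is an assumption:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Each constant occupies one 128-bit slot, i.e. four 32-bit floats. */
#define FLOATS_PER_SLOT 4

/*
 * Hypothetical helper (not part of CAL): writes n scalar constants into
 * a buffer laid out as n float4 slots. `mapped` stands in for the float
 * pointer obtained from calResMap and must hold n * 4 floats.
 */
static void pack_scalar_constants(float *mapped, const float *constants, size_t n)
{
    assert(mapped != NULL && constants != NULL);

    /* Assumption: zero the padding so the unused lanes are well defined. */
    memset(mapped, 0, n * FLOATS_PER_SLOT * sizeof(float));

    for (size_t i = 0; i < n; ++i) {
        /* Constant i lands in the first lane of slot i: indices 0, 4, 8, ... */
        mapped[i * FLOATS_PER_SLOT] = constants[i];
    }
}
```

For three constants the destination needs 12 floats, and the values end up at indices 0, 4 and 8.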

                    • Constant buffers naming convention in Brook kernel generated IL
                      jean-claude

                      Got it!

                      So apparently there is no way to "trick" the allocator by assigning a FLOAT_4 resource, and then use FLOAT_1 slices in it?

                       

                      But then when you use your second approach, you get FLOAT_1 constant items, don't you? i.e.

                      kernel void test(float a[1024], float b[16][16], out float c<>) {...}

                      for constant vector a (i.e. cb0), I would then assume that the upfront CAL resource allocation would be

                      calResAllocLocal1D(&constant_a, device, 1024, CAL_FORMAT_FLOAT_1, 0);

                      or the equivalent to ensure alignment

                      calResAllocLocal1D(&constant_a, device, 1024/4, CAL_FORMAT_FLOAT_4, 0);

                       

                       

                       

                       

                        • Constant buffers naming convention in Brook kernel generated IL
                          gaurav.garg

                          Unfortunately, CAL requires 128-bit allocation for constant buffers irrespective of type of constant array.

                          So, you need to allocate the resource like this:

                          calResAllocLocal1D(&constant_a, device, 1024, CAL_FORMAT_FLOAT_4, 0);

                          if you are using a constant array of type float, float2 or float4 with 1024 elements.
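
To make the sizing rule concrete: the resource width equals the number of array elements, and each element takes 16 bytes whatever its declared type. A rough sketch in plain C; the function names are hypothetical and only the arithmetic is taken from the reply above:

```c
#include <assert.h>
#include <stddef.h>

/* One constant-buffer element is always 128 bits = 16 bytes. */
#define CB_ELEMENT_BYTES 16
#define FLOAT_BYTES 4

/* Hypothetical helper: width to request with CAL_FORMAT_FLOAT_4 for an
 * array of `elements` values of type float, float2 or float4. */
static size_t cb_width(size_t elements)
{
    return elements; /* one float4 slot per element, no packing */
}

/* Hypothetical helper: total bytes occupied by the buffer. */
static size_t cb_bytes(size_t elements)
{
    return cb_width(elements) * CB_ELEMENT_BYTES;
}

/* Bytes left unused; components is 1 (float), 2 (float2) or 4 (float4). */
static size_t cb_padding_bytes(size_t elements, size_t components)
{
    assert(components == 1 || components == 2 || components == 4);
    return elements * (4 - components) * FLOAT_BYTES;
}
```

For float a[1024] this gives a width of 1024 and 16384 bytes, of which 12288 are padding; for float4 a[1024] the padding is zero.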

                            • Constant buffers naming convention in Brook kernel generated IL
                              jean-claude

                              Ok, sorry, but I'm still somewhat confused.

                              Assume I have 64 float constants from A0 to A63

                              Assume my kernel should add constant A5 to the input stream.

                              The resource allocation is done through:
                              calResAllocLocal1D(&constant_A, device, 64, CAL_FORMAT_FLOAT_4, 0);

                              which actually contains 64*4 float values


                              The correct kernel is then (with A being assigned to cb0)

                              kernel void test ( out float C<>, float A[64], float B<> ) {  C = B + A[5]; }

                              in other words, the index into A is a 128-bit index...


                              So my question is what would  the following kernels mean and do?

                              (1)   kernel void test ( out float C<>, float4 A[64], float B<> ) { C = B + A[5].x; }

                               

                              // and, moreover, the kernel that writes the A constants from the first 64 elements of a D stream,

                              the CAL output domain being {0,0,64,1}

                              (2)   kernel void set_A ( out float A<>,  float D[] ) {

                                     int pos = instance().x;

                                     A = D[pos];

                              }

                              or should it be

                              (3)   kernel void set_A ( out float4 A<>,  float D[] ) {

                                     int pos = instance().x;

                                     A.x = D[pos];

                              }

                                • Constant buffers naming convention in Brook kernel generated IL
                                  jean-claude

                                  By the way, it also seems that the Brook compiler has difficulty with this:

                                  (1) kernel void test1 ( float a[128], float b<>, out float c<> ) { c = a[33] + b;}

                                  compiles properly, no problem

                                   

                                  but if the order of the parameters is changed...


                                  (2) kernel void test2 ( out float c<>, float b<>, float a[128]) { c = a[33] + b;}

                                  NOTICE: Parse error
                                  While processing <buffer>:88
                                  In compiler at zzerror()[parser.y:112]
                                    message = parse error

                                  ERROR: Parse error. Expected declaration.
                                  While processing <buffer>:88

                                    • Constant buffers naming convention in Brook kernel generated IL
                                      jean-claude

                                      And this compiles properly too ...

                                      kernel void test2 ( out float c<>, float b<>, float a[128], int d) {
                                         c = a[33] + b + d;
                                      }

                                      Sounds like the Brook compiler doesn't like a fixed-size constant array being the last parameter...

                                        • Constant buffers naming convention in Brook kernel generated IL
                                          gaurav.garg

                                          Thanks for pointing out the compilation issues. I have filed a bug for it. Regarding your previous question:

                                          Both kernels work the same way.

                                          kernel void test ( out float C<>, float A[64], float B<> ) {  C = B + A[5]; }

                                          kernel void test ( out float C<>, float4 A[64], float B<> ) { C = B + A[5].x; }

                                          When you call a kernel with a constant buffer, a pointer to the constant array is passed, not a stream. You need to pass a float pointer and a float4 pointer, respectively, to these kernels. The runtime always allocates a float4 CAL buffer internally and copies the data so that each element occupies its own 128-bit slot. Hope it helps in understanding.
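
The copy described above can be pictured as follows. This is only a sketch of the resulting layout, assuming unused lanes are zero-filled; it is plain C, the function name is hypothetical, and it is not the actual Brook+ runtime code:

```c
#include <assert.h>
#include <stddef.h>

#define LANES 4 /* floats per 128-bit slot */

/*
 * Copies `elements` tightly packed vectors of `components` floats each
 * (1 = float, 2 = float2, 4 = float4) into a float4-strided destination.
 * dst must hold elements * 4 floats.
 */
static void copy_to_float4_slots(float *dst, const float *src,
                                 size_t elements, size_t components)
{
    assert(dst != NULL && src != NULL);
    assert(components >= 1 && components <= LANES);

    for (size_t e = 0; e < elements; ++e) {
        for (size_t lane = 0; lane < LANES; ++lane) {
            /* Real data goes into the leading lanes, the rest is padding. */
            dst[e * LANES + lane] =
                (lane < components) ? src[e * components + lane] : 0.0f;
        }
    }
}
```

With a float source, element 5 ends up in the first lane of slot 5, which is why A[5] in the float kernel and A[5].x in the float4 kernel read the same location in the constant buffer.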

                          • Constant buffers naming convention in Brook kernel generated IL
                            gaurav.garg

                             

                            Originally posted by: jean-claude  (2) is this to say that for n constants the generated IL would be dcl_cb cb0[n]?     but then when are cb1, ..., cb14 used?

                             

                             

                             

                            You can use multiple constant buffers through Brook+ if you declare constants with their size in square brackets.

                            kernel void test(float a[1024], float b[16][16], out float c<>) {...} // It will allocate two constant buffers of size 1024 and 256 (= 16x16)
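
The sizes quoted in the comment are simply the product of the declared dimensions. A small sketch in plain C; the function name is hypothetical and the code only reproduces that arithmetic:

```c
#include <assert.h>
#include <stddef.h>

/* Element count of a constant array parameter = product of its dimensions. */
static size_t cb_element_count(const size_t *dims, size_t ndims)
{
    assert(dims != NULL && ndims > 0);

    size_t count = 1;
    for (size_t i = 0; i < ndims; ++i) {
        count *= dims[i];
    }
    return count;
}
```

For a[1024] the dimension list is {1024}, and for b[16][16] it is {16, 16}, giving 1024 and 256 elements respectively.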