11 Replies Latest reply on Apr 27, 2009 1:06 PM by gaurav.garg

    Calculating sum of first and last column, reduce kernel

    Steven_Makoviac

      Hi,

      I have the following matrix:

      4 5 8 9

      4 5 6 1

      9 6 8 7

      4 8 9 8

      Now I want to calculate the sum of the first and the last column: 4+4+9+4 + 9+1+7+8 = 46.

      I wrote a reduce kernel, but I'm not sure how to specify the domain of execution. In the docs I found the two brook::Stream member functions domainSize and domainOffset. I'm able to calculate the sum of the first column or of the last column, but how can I calculate the sum of the first and the last column together?

        • Calculating sum of first and last column, reduce kernel
          gaurav.garg

          domainSize and domainOffset are used to specify domain of execution, but it must be a rectangular domain. You can't specify multiple rectangles in a single domain of execution.
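Because a domain has to be a single rectangle, one way to get the sum of the first and last columns is to treat each column as its own rectangle, reduce each one separately, and add the two partial sums on the host. The sketch below only illustrates that arithmetic in plain C++ on the CPU. It makes no Brook+ calls, and `sumRect` and `sumFirstAndLastColumn` are hypothetical helpers, not part of the Brook+ API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not a Brook+ call): sums one rectangular region of a
// row-major matrix. (x0, y0) is the top-left corner; w and h are the region's
// width and height in elements.
float sumRect(const std::vector<float>& m, std::size_t cols,
              std::size_t x0, std::size_t y0, std::size_t w, std::size_t h)
{
    assert(cols > 0 && m.size() % cols == 0);
    assert(x0 + w <= cols && y0 + h <= m.size() / cols);
    float sum = 0.0f;
    for (std::size_t y = y0; y < y0 + h; ++y)
        for (std::size_t x = x0; x < x0 + w; ++x)
            sum += m[y * cols + x];
    return sum;
}

// Two rectangular passes, one per column, combined afterwards.
float sumFirstAndLastColumn(const std::vector<float>& m, std::size_t cols)
{
    assert(cols > 0);
    const std::size_t rows = m.size() / cols;
    const float first = sumRect(m, cols, 0, 0, 1, rows);
    if (cols == 1)
        return first; // first and last column are the same; do not count twice
    const float last = sumRect(m, cols, cols - 1, 0, 1, rows);
    return first + last;
}
```

For the 4x4 matrix in the question, the two passes give 21 and 25, so `sumFirstAndLastColumn` returns 46.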

            • Calculating sum of first and last column, reduce kernel
              Steven_Makoviac

              O.K., I see. Is it possible that domainSize doesn't work with a reduce kernel? The following program always calculates the sum of the whole array, not only the first column.

              reduce kernel void SumRed(float a<>, reduce float b<>)

              {

              b+=a;

              }

              const int MAX = 4;

              float *zahlen = new float[MAX*MAX];

              for(int i=0;i<MAX;i++)

              {

              for(int j=0;j<MAX;j++)

              {

              zahlen[i*MAX+j] = i*MAX+j;

              cout << zahlen[i*MAX+j] << " ";

              }

              cout << "\n";

              }

              const int rank = 2;

              unsigned int dims[] = { MAX, MAX };

              ::brook::Stream<float>s1(rank,dims);

              s1.read(zahlen);

              unsigned int dimsR[] = { 1, 1 };

              ::brook::Stream<float>s2(rank,dimsR);

              SumRed.domainSize(uint4(1,MAX,0,0));

              SumRed(s1,s2);

              float summe = 0.0f;

              s2.write(&summe);

              cout << "Sum: " << summe << "\n";

               

              delete[] zahlen;

                • Calculating sum of first and last column, reduce kernel
                  gaurav.garg

                  domainSize works with reduction kernels as well, but you must use domainOffset and domainSize together.

                  SumRed.domainOffset(uint4(0,0,0,0));

                  SumRed.domainSize(uint4(1,MAX,1,1));

                    • Calculating sum of first and last column, reduce kernel
                      Steven_Makoviac

                      I still get the wrong result. Now I use this:

                      SumRed.domainOffset(uint4(0,0,0,0));

                      SumRed.domainSize(uint4(1,MAX,1,1));

                      SumRed(s1,s2);

                      But it still calculates the whole sum.

                        • Calculating sum of first and last column, reduce kernel
                          gaurav.garg

                          Sorry, I didn't read your complete sample. The domain of execution is defined with respect to the output stream. In your case the output stream is 1x1, so it won't work.

                          Let's say your output is 10x10 and your input is 100x100. If you specify a domain of execution from (0,0) to (5,5), then input elements from (0,0) to (50,50) will be reduced into a 5x5 matrix.
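The mapping described above can be sketched on the CPU. The code below is a plain C++ illustration, not Brook+ code, and it rests on an assumption: each output element of the reduction sums one contiguous block of the input, with the block size given by the ratio of input to output dimensions (10x10 input elements per output element in the 100x100 to 10x10 case). `reduceDomain` and its parameters are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CPU model of a reduction whose domain is given in OUTPUT
// coordinates. Input is inRows x inCols, output is outRows x outCols, both
// row-major. Assumption: output element (oy, ox) sums the input block of size
// (inRows / outRows) x (inCols / outCols) starting at (oy * blockH, ox * blockW).
// Only output elements inside [offY, offY + domH) x [offX, offX + domW) are
// written; the rest of `out` is left as it was.
void reduceDomain(const std::vector<float>& in, std::size_t inRows, std::size_t inCols,
                  std::vector<float>& out, std::size_t outRows, std::size_t outCols,
                  std::size_t offX, std::size_t offY, std::size_t domW, std::size_t domH)
{
    assert(outRows > 0 && outCols > 0);
    assert(inRows % outRows == 0 && inCols % outCols == 0);
    assert(in.size() == inRows * inCols && out.size() == outRows * outCols);
    assert(offX + domW <= outCols && offY + domH <= outRows);

    const std::size_t blockH = inRows / outRows;
    const std::size_t blockW = inCols / outCols;

    for (std::size_t oy = offY; oy < offY + domH; ++oy) {
        for (std::size_t ox = offX; ox < offX + domW; ++ox) {
            float sum = 0.0f;
            for (std::size_t y = oy * blockH; y < (oy + 1) * blockH; ++y)
                for (std::size_t x = ox * blockW; x < (ox + 1) * blockW; ++x)
                    sum += in[y * inCols + x];
            out[oy * outCols + ox] = sum;
        }
    }
}
```

Under this model a 1x1 output has only one possible domain, the single output element, and its block is the entire input. That would explain why the program above always produces the sum of the whole array.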

                            • Calculating sum of first and last column, reduce kernel
                              Steven_Makoviac

                              Hi,

                              I have another problem. I am trying to run a kernel only on the first column; after that I need all the data for another kernel call. When I run the following program, the values of the first column are set to 1.0, which is what I want, but the other values are also changed after the kernel call. I thought that with domainOffset and domainSize the whole data set is copied to the GPU, the kernel is called only for the domain, and the other data is left untouched.

                              This is how the data looks at the beginning:

                              0 0 0

                              0 0 0

                              0 0 0

                              This is how it should look after the kernel call:

                              1 0 0

                              1 0 0

                              1 0 0

                              This is how it actually looks:

                              1 0.5e 0.6e

                              1 45 42e 

                              1 4e 4e

                              kernel void SumRed(float a<>, out float b<>)

                              {

                              b=1.0f;

                              }

                              const int MAX = 3;

                              float *zahlen = new float[MAX*MAX];

                              for(int i=0;i<MAX;i++)

                              {

                              for(int j=0;j<MAX;j++)

                              {

                              zahlen[i*MAX+j] =0.0f;

                              cout << zahlen[i*MAX+j] << " ";

                              }

                              cout << "\n";

                              }

                              cout << "\n";

                              const int rank = 2;

                              unsigned int dims[] = { MAX, MAX };

                              ::brook::Stream<float>s1(rank,dims);

                              ::brook::Stream<float>s2(rank,dims);

                              SumRed.domainOffset(uint4(0,0,0,0));

                              SumRed.domainSize(uint4(1,MAX,0,0));

                              s1.read(zahlen);

                              SumRed(s1,s2);

                              s2.write(zahlen);

                               

                              for(int i=0;i<MAX;i++)

                              {

                              for(int j=0;j<MAX;j++)

                              {

                              cout << zahlen[i*MAX+j] << " ";

                              }

                              cout << "\n";

                              }

                                • Calculating sum of first and last column, reduce kernel
                                  gaurav.garg

                                   

                                   

                                  SumRed.domainOffset(uint4(0,0,0,0));

                                  SumRed.domainSize(uint4(1,MAX,0,0));

                                  s1.read(zahlen);

                                  SumRed(s1,s2);

                                  s2.write(zahlen);

                                   



                                  As I see it, your output stream s2 is not initialized with zeros, so it is expected that you will see uninitialized data.

                                    • Calculating sum of first and last column, reduce kernel
                                      Steven_Makoviac

                                      Yes, but the input stream is initialized with zeros. I thought all elements are transferred to the GPU, only the ones inside the specified domain are changed, and then they are all transferred back.

                                        • Calculating sum of first and last column, reduce kernel
                                          gaurav.garg

                                          A stream is like a variable: it has to be initialized, either by streamRead or by a kernel execution. The domain of execution specifies the number of instances for which the kernel is run (the default is output width x height).

                                            • Calculating sum of first and last column, reduce kernel
                                              Steven_Makoviac

                                              What I don't understand is why the other values are changed. At the beginning of the program the whole array is initialized with zeros, and then the kernel is called for some elements but not for the rest. So why do those values change? If I specify a domainSize, does it mean that only these elements are sent to the GPU, or does it mean that all elements are sent to the GPU and the kernel is called only for the elements for which the domainSize is specified?

                                              s1.read(zahlen);

                                              SumRed(s1,s2);

                                              s2.write(zahlen);

                                                • Calculating sum of first and last column, reduce kernel
                                                  gaurav.garg

                                                  Streams are buffers allocated on the GPU. The data is sent to these buffers as soon as you call streamRead.

                                                  ::brook::Stream<float>s1(rank,dims); // Buffer0 created on GPU

                                                  ::brook::Stream<float>s2(rank,dims); // Buffer1 created on GPU

                                                  s1.read(zahlen); // Data transfer from zahlen to Buffer0 => Buffer0 initialized with zahlen data

                                                  SumRed(s1,s2); // Kernel called for a domain => Buffer1's first column is initialized with 1.0f. All other elements are still uninitialized

                                                  s2.write(zahlen); // Data transfer from Buffer1 to zahlen. Both initialized and uninitialized elements are copied to zahlen

                                                  The domain of execution is associated only with kernel execution, not with data transfer. If you want to operate on a part of a stream (either during data transfer or kernel execution), you can use the domain operator of the stream. However, the performance of the domain operator might not be good.
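The sequence above can be mimicked on the CPU to show why the untouched elements come back as garbage, and how initializing the output buffer first avoids it. This is a plain C++ model, not Brook+ code: `runKernelOnDomain` and `firstColumnToOne` are hypothetical stand-ins, and the `garbage` value is only a visible placeholder for whatever an uninitialized GPU buffer happens to contain.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the kernel call: writes 1.0f only to the elements
// of `out` inside the domain (offset and size, in elements) and leaves every
// other element of `out` untouched.
void runKernelOnDomain(std::vector<float>& out, std::size_t cols,
                       std::size_t offX, std::size_t offY,
                       std::size_t domW, std::size_t domH)
{
    assert(cols > 0 && out.size() % cols == 0);
    assert(offX + domW <= cols && offY + domH <= out.size() / cols);
    for (std::size_t y = offY; y < offY + domH; ++y)
        for (std::size_t x = offX; x < offX + domW; ++x)
            out[y * cols + x] = 1.0f;
}

// Models the program flow: create the output buffer, optionally initialize it
// from the host data (the analogue of reading zahlen into s2 before the
// kernel call), run the kernel on the first column, then copy the WHOLE
// buffer back to the host.
std::vector<float> firstColumnToOne(const std::vector<float>& host, std::size_t cols,
                                    bool initOutputFirst, float garbage)
{
    assert(cols > 0 && host.size() % cols == 0);
    const std::size_t rows = host.size() / cols;
    std::vector<float> buffer1(host.size(), garbage); // models an uninitialized buffer
    if (initOutputFirst)
        buffer1 = host;                               // output now holds the host zeros
    runKernelOnDomain(buffer1, cols, 0, 0, 1, rows);  // domain: first column only
    return buffer1;                                   // full copy back, domain ignored
}
```

With `initOutputFirst` set to false, only the first column of the result is meaningful. With it set to true, the result is the expected 1 0 0 pattern in every row. In the program above the analogous change would be to call s2.read(zahlen) before SumRed(s1,s2), assuming the runtime leaves elements outside the domain unchanged.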