5 Replies Latest reply on Jun 24, 2010 2:38 PM by niravshah00

    Stream compaction

    niravshah00

      I am running my kernels in tiles, with each kernel of size 8192x90x90.

      So my output stream is now of this size, but only a few of its elements hold results; those elements have the value 0 (that is how I figure out which ones have results).

      Is there any way to reduce the stream so that I do not have to filter the whole array? (I read the output stream into an array in host code.)
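For reference, the host-side filtering that the post wants to avoid can be sketched in C roughly as follows. This is only an illustration: `Index3`, `collect_hits`, and the x-fastest memory layout are assumptions and are not stated in the thread. A full 8192x90x90 tile is 66,355,200 elements, and this loop visits every one of them for each tile.

```c
#include <stddef.h>

/* Hypothetical 3-D position of one element inside a tile. */
typedef struct { int x, y, z; } Index3;

/* Scan a flat host-side copy of the output stream and record the 3-D
 * position of every element equal to `marker` (0 in the post above).
 * Assumes x varies fastest, then y, then z; adjust to the real layout.
 * Returns the number of hits found; at most `capacity` are stored. */
size_t collect_hits(const int *data, int nx, int ny, int nz,
                    int marker, Index3 *hits, size_t capacity)
{
    size_t found = 0;
    size_t total = (size_t)nx * (size_t)ny * (size_t)nz;
    for (size_t i = 0; i < total; ++i) {
        if (data[i] != marker)
            continue;
        if (found < capacity) {
            /* Decode the flat index back into (x, y, z). */
            hits[found].x = (int)(i % (size_t)nx);
            hits[found].y = (int)((i / (size_t)nx) % (size_t)ny);
            hits[found].z = (int)(i / ((size_t)nx * (size_t)ny));
        }
        ++found;
    }
    return found;
}
```

The cost is linear in the tile size no matter how few hits there are, which is why a compacted output would help.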

        • Stream compaction
          niravshah00

          Is there no way of doing stream compaction?

            • Stream compaction
              geekmaster

              You can write a kernel that takes your stream as input and reduces its dimensions (Stream User Guide, page A-9):

              // f() is some function of the input element
              reduce void do_reduction(float input<>, reduce float output<>)
              {
                  if (input > 10.0) { output += f(input); }
              }

                • Stream compaction
                  niravshah00

                   

                  Originally posted by: geekmaster You can write a kernel that takes as input your stream and reduces its dimensions (stream user guide page A-9)

                  Yes, I saw that, but it won't work: I am not allowed to use instance() in a reduce kernel, and I need the location in the input stream because that tells me what the solution is.
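A reduction folds everything into one value, so the positions are lost. A technique that does keep each hit's position is compaction by prefix sum (flag, scan, scatter). It is not proposed in this thread, and the sketch below is plain sequential C rather than Brook+; the names `exclusive_scan` and `compact_indices` are made up for illustration. On a GPU each pass would be its own kernel, and the last pass needs scatter-style output, which this sketch does not verify for Brook+.

```c
#include <stddef.h>

/* Exclusive prefix sum of flags: offsets[i] = number of set flags before i.
 * Returns the total number of set flags. */
size_t exclusive_scan(const unsigned char *flags, size_t n, size_t *offsets)
{
    size_t running = 0;
    for (size_t i = 0; i < n; ++i) {
        offsets[i] = running;
        running += flags[i] ? 1u : 0u;
    }
    return running;
}

/* Writes the flat indices of all elements equal to `marker` into `out`,
 * packed at the front and in input order.
 * `flags` and `offsets` are caller-provided scratch arrays of length n;
 * `out` must have room for every hit (n entries in the worst case). */
size_t compact_indices(const int *data, size_t n, int marker,
                       unsigned char *flags, size_t *offsets, size_t *out)
{
    /* Pass 1: flag the elements that hold a result. */
    for (size_t i = 0; i < n; ++i)
        flags[i] = (unsigned char)(data[i] == marker);

    /* Pass 2: the scan gives every hit its destination slot. */
    size_t count = exclusive_scan(flags, n, offsets);

    /* Pass 3: scatter each hit's index to its slot. */
    for (size_t i = 0; i < n; ++i)
        if (flags[i])
            out[offsets[i]] = i;

    return count;
}
```

No shared counter is involved: every element computes its own output slot from the scan, so the passes need no atomic operations.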

                    • Stream compaction
                      geekmaster

                      Hey, I have an idea:

                      counter = 0;

                      kernel void search_forsolution(int input<>, int counter, out int3 solution1, out int3 solution2, ........, out int3 solution8)
                      {
                          int3 point = indexof(input).xyz;
                          if (any(point))
                          {
                              if (counter == 1)
                              {
                                  solution1 = point;
                              }
                              else if (counter == 2)
                              {
                                  ...................
                              }
                              counter++;
                          }
                      }

                      Hope it helps.

                        • Stream compaction
                          niravshah00

                           

                          Originally posted by: geekmaster hey i have an idea

                          counter = 0;

                          kernel void search_forsolution(int input<>, int counter, out int3 solution1, out int3 solution2, ........, out int3 solution8)
                          {
                              int3 point = indexof(input).xyz;
                              if (any(point))
                              {
                                  if (counter == 1)
                                  {
                                      solution1 = point;
                                  }
                                  else if (counter == 2)
                                  {
                                      ...................
                                  }
                                  counter++;
                              }
                          }

                          Hope it helps

                          But there might be more than 8 solutions. Secondly, you cannot change the function arguments inside a kernel; that is not allowed in Brook+. And if I make a local copy of counter, each thread will have its own copy. I need a counter that lives in global memory, is accessible by all threads, and supports atomic operations for synchronization. Sadly, Brook+ does not support atomics; OpenCL does.
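The shared-counter pattern described above can be sketched on the CPU with C11 atomics. This is an analogy under stated assumptions, not Brook+ or OpenCL code: `HitList`, `hitlist_init`, `record_if_hit`, and `hitlist_count` are hypothetical names, and `atomic_fetch_add` stands in for whatever atomic increment the GPU API provides.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Shared output buffer with one global slot counter. */
typedef struct {
    atomic_size_t next;   /* next free slot, shared by all threads */
    size_t *slots;        /* flat input indices of the hits */
    size_t capacity;      /* number of entries available in slots */
} HitList;

void hitlist_init(HitList *list, size_t *storage, size_t capacity)
{
    atomic_init(&list->next, 0);
    list->slots = storage;
    list->capacity = capacity;
}

/* What one thread would do for its element i: if it is a hit, atomically
 * claim a unique slot and store the element's index there. */
void record_if_hit(HitList *list, const int *data, size_t i, int marker)
{
    if (data[i] != marker)
        return;
    size_t slot = atomic_fetch_add(&list->next, 1);
    if (slot < list->capacity)
        list->slots[slot] = i;
}

/* Number of valid entries in slots (hits beyond capacity are dropped). */
size_t hitlist_count(HitList *list)
{
    size_t n = atomic_load(&list->next);
    return n < list->capacity ? n : list->capacity;
}
```

Because the counter hands out a distinct slot per hit, the number of solutions is not fixed at 8; it is bounded only by the buffer capacity. When several threads run concurrently, the order of the entries in `slots` is not deterministic.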