
niravshah00
Journeyman III

Stream compaction

I am running my kernels in tiles, with each tile of size 8192x90x90.

So my output stream is now of this size, and only a few of its elements will hold results. Those elements have the value 0 (that is how I figure out which ones have results).

Is there any way I can reduce the stream so that I do not have to filter the whole array? (I read the output stream into an array in the host code.)

0 Likes
5 Replies
niravshah00
Journeyman III

Is there no way of doing stream compaction?

0 Likes

You can write a kernel that takes your stream as input and reduces its dimensions (Stream User Guide, page A-9).

Here f() is some function:

reduce void do_reduction(float input<>, reduce float output<>)
{
    if (input > 10.0)
    {
        output += f(input);
    }
}
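To make the effect of that reduction concrete, here is a minimal host-side sketch in plain C (not Brook+) of what the kernel computes. It assumes the argument of f is meant to be the input element; `do_reduction_ref` and the placeholder `twice` are hypothetical names used only for illustration, since the post does not say what f() is.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder for f(); the thread does not specify the real function. */
float twice(float v) { return 2.0f * v; }

/* Sequential reference for the reduce kernel: sums f(x) over every
 * input element x that is greater than 10. */
float do_reduction_ref(const float *input, size_t n, float (*f)(float))
{
    assert(input != NULL && f != NULL);
    float output = 0.0f;
    for (size_t i = 0; i < n; ++i) {
        if (input[i] > 10.0f) {
            output += f(input[i]);
        }
    }
    return output;
}
```

Note that the result is a single value: the positions of the contributing elements are lost, which matters if the location itself is the answer.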

0 Likes

Originally posted by: geekmaster You can write a kernel that takes your stream as input and reduces its dimensions (Stream User Guide, page A-9)

Yeah, I saw that, but it won't work: I am not allowed to use instance() in a reduce kernel, and I need the location in the input stream because that is what tells me what the solution is.
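For comparison, a common way to compact a stream while keeping each hit's location is the flag / exclusive prefix sum / scatter pattern. This is a swapped-in technique, not something proposed in the thread. The sketch below is a sequential reference in plain C (not Brook+). The names `compact_indices` and `decode_index`, the `marker` parameter, and the x-fastest layout of the 8192x90x90 tile are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Writes the linear index of every element equal to `marker` into out_idx,
 * densely packed and in input order. `offsets` is scratch space of length n.
 * Returns the number of hits. */
size_t compact_indices(const int *in, size_t n, int marker,
                       size_t *offsets, size_t *out_idx)
{
    size_t running = 0;

    /* Pass 1: exclusive prefix sum over the hit flags. offsets[i] is the
     * number of hits strictly before position i. */
    for (size_t i = 0; i < n; ++i) {
        offsets[i] = running;
        running += (in[i] == marker) ? 1 : 0;
    }

    /* Pass 2: scatter. Each hit knows its output slot from the scan, so no
     * shared counter is needed. */
    for (size_t i = 0; i < n; ++i) {
        if (in[i] == marker) {
            out_idx[offsets[i]] = i;
        }
    }
    return running;
}

/* Recovers (x, y, z) from a linear index, assuming x varies fastest. */
void decode_index(size_t linear, size_t nx, size_t ny,
                  size_t *x, size_t *y, size_t *z)
{
    assert(nx > 0 && ny > 0);
    *x = linear % nx;
    *y = (linear / nx) % ny;
    *z = linear / (nx * ny);
}
```

On a GPU both passes can be expressed as data-parallel steps (a parallel scan followed by a scatter), so the host would only need to read back the packed indices rather than the full tile.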

0 Likes

Hey, I have an idea:

counter = 0;

kernel void search_forsolution(int input<>, int counter, out int3 solution1, out int3 solution2, ........, out int3 solution8)
{
    int3 point = indexof(input).xyz;
    if (any(point))
    {
        if (counter == 1)
        {
            solution1 = point;
        }
        else if (counter == 2)
        {
            ...................
        }
        counter++;
    }
}

Hope it helps

0 Likes

Originally posted by: geekmaster hey i have an idea

 


 

But there might be more than 8 solutions. Secondly, you cannot change the function arguments; that is not allowed in Brook+. And if I make a local copy of counter, each thread will have its own copy. I need the counter to live in global space, accessible by all threads, with atomic operations on it for synchronization. Sadly, Brook+ does not support atomics; OpenCL does.
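The shared-counter idea described above can be sketched as follows. This is plain C11 using `<stdatomic.h>` on the host, not Brook+ or OpenCL, and the `hit_list` type and `append_hit` function are hypothetical names chosen for illustration. It only shows the pattern: every thread that finds a solution atomically reserves the next free slot and writes its index there.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Output list shared by all threads. */
typedef struct {
    atomic_size_t count;   /* global counter, incremented atomically */
    size_t capacity;       /* number of slots available in `indices` */
    size_t *indices;       /* packed linear indices of the hits */
} hit_list;

/* Reserves one slot with an atomic fetch-and-add and stores the index.
 * Returns 1 if the hit was stored, 0 if the list was already full. */
int append_hit(hit_list *list, size_t linear_index)
{
    assert(list != NULL && list->indices != NULL);
    size_t slot = atomic_fetch_add(&list->count, 1);
    if (slot >= list->capacity) {
        return 0;   /* count keeps growing, but nothing is written */
    }
    list->indices[slot] = linear_index;
    return 1;
}
```

Two caveats with this design: the order of the stored indices depends on thread scheduling, and `count` can end up larger than `capacity`, so the number of valid entries is the smaller of the two.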

0 Likes