kernel void compaction(int prefixsum<>,int input<>,out int3 output)
if(input == 1)
prefixsum is a 3D stream,
input is also a 3D stream,
and output, as can be seen, is a 1D array.
That isn't a reduction kernel: it performs a single assignment, not a reduction. Reduction kernels must be declared with the reduction keyword, contain only a simple expression, and can use neither instance() nor indexof().
If you are sure that is what you want, note that the domain of execution (number of threads) is controlled by the output variable, not prefixsum and input. If you want to change that you should use the domain operator to force the number of threads.
Another observation: 3D streams rely on automatic address translation code, which may not work with some Catalyst driver versions.
Well, if I cannot use instance(), how can I get the index? I am doing a stream compaction, so I need the index, and I also want the domain of execution to be controlled by input/prefixsum.
The idea is that the input stream contains 1s and 0s, and I compute a prefix sum on it.
Then the reduction kernel produces the compacted stream.
I just need the indices of the input stream elements that are 1.
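To make the intent concrete, here is a minimal CPU-side sketch of the compaction described above, in plain Python rather than Brook+. It is only a reference for the expected result, not a kernel. The function names, the exclusive form of the prefix sum, and the x-fastest flattening of the 3D stream are all assumptions made for this illustration.

```python
def exclusive_prefix_sum(flags):
    """Return (prefix, total): prefix[i] is the number of 1s before position i."""
    prefix = []
    total = 0
    for f in flags:
        prefix.append(total)
        total += f
    return prefix, total


def compact_indices(flags):
    """Return the linear indices of all elements equal to 1, in order."""
    prefix, count = exclusive_prefix_sum(flags)
    output = [None] * count
    for i, f in enumerate(flags):
        if f == 1:
            # The prefix sum gives the slot in the compacted output (a scatter).
            output[prefix[i]] = i
    return output


def to_3d(linear, dims):
    """Convert a linear index to (x, y, z), assuming x varies fastest (assumption)."""
    width, height, _depth = dims
    return (linear % width, (linear // width) % height, linear // (width * height))


def compact_indices_3d(flags, dims):
    """Compact a flattened 3D stream of 0/1 flags into a list of (x, y, z) indices."""
    return [to_3d(i, dims) for i in compact_indices(flags)]
```

Note that the write in `compact_indices` is a scatter: each input element decides where in the output it lands. That is why the question of which stream controls the domain of execution matters here.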
I'm not sure I understand how your kernel works, but it looks like you should use the "domain" operator to create threads on all "output" elements. There is some information in the user guide, and I think there is also an example in the SDK samples directory.