The documentation states that a kernel can support up to 128 input streams and 8 output streams.
However, when I try to run the following program, the Brook+/CAL runtime automatically switches to CPU mode instead of running on the GPU:
kernel void median33_k(out float output<>, float c<>,
float r<>, float l<>, float u<>, float d<>,
float ur<>, float ul<>, float dr<>, float dl<>) {
....
}
So what is the actual limit for execution on the GPU?
The limits are 8 non-gather input streams (<>), 128 gather input streams ([]), and 8 output streams. The kernel above declares 9 non-gather inputs, which exceeds the first limit. You can have more than 8 output streams, but Brook+ will split the kernel into multiple passes behind the scenes so that each pass has no more than 8 outputs.
Does CAL support more input streams?
nberger, you get 8 sequential input + 128 gather input streams.
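To make the counting concrete, here is a minimal C sketch that checks a kernel signature against the limits quoted in this thread. The helper names (inputs_fit_gpu, min_output_passes) are hypothetical and not part of Brook+ or CAL. The limits are hard-coded from the replies above rather than queried from the runtime, and the pass count is only the smallest number that keeps each pass at 8 outputs or fewer. How Brook+ actually splits a kernel may differ.

```c
/* Limits as quoted in this thread (assumed, not queried from the runtime). */
enum {
    MAX_STREAM_INPUTS    = 8,   /* non-gather inputs, declared with <> */
    MAX_GATHER_INPUTS    = 128, /* gather inputs, declared with []     */
    MAX_OUTPUTS_PER_PASS = 8    /* outputs a single pass can write     */
};

/* Hypothetical helper: returns 1 if both input counts are within the
 * quoted limits, 0 otherwise. */
static int inputs_fit_gpu(int stream_inputs, int gather_inputs)
{
    return stream_inputs <= MAX_STREAM_INPUTS &&
           gather_inputs <= MAX_GATHER_INPUTS;
}

/* Hypothetical helper: smallest number of passes such that no pass
 * writes more than MAX_OUTPUTS_PER_PASS outputs (ceiling division).
 * The split Brook+ really performs is not specified here. */
static int min_output_passes(int outputs)
{
    return (outputs + MAX_OUTPUTS_PER_PASS - 1) / MAX_OUTPUTS_PER_PASS;
}
```

For median33_k, which declares nine <> inputs (c, r, l, u, d, ur, ul, dr, dl) and one output, inputs_fit_gpu(9, 0) returns 0, while min_output_passes(1) returns 1. A kernel with nine outputs would need at least two passes under this assumption.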