If you create an input stream, you have to perform a read to fill it. The read copies the data from main (host) memory over PCIe to graphics memory.
However, suppose you create a scatter/gather array, as in float4 ar, and fill it in main memory. How do you create that array in graphics memory, and how do you copy it there?
I think you can just pass that array in the kernel call and Brook+ will do the copy under the hood.
What happens if you first use a gather/scatter array as the output of kernel1 and then use that same array as the input to kernel2? Is a copy to main memory performed?
Where and how do you declare that array? I don't need it in main memory.
Any help appreciated.