Archives Discussions

snef · ‎11-09-2009

If you create an input stream, you have to perform a read to fill it. The read copies the data from main (host) memory over PCIe to graphics memory.

However, suppose you create a scatter/gather array, as in float4 ar[1024], and fill it in main memory. How do you create that array in graphics memory, and how do you copy it there?

I think you can just use that array in the kernel call and Brook+ will do the copy under the hood.

What happens if you first use a gather/scatter array as output for kernel1 and then use that array as input for kernel2? Is a copy to main memory performed in between?

Where and how do you declare that array? I don't need it in main memory.

Any help appreciated.

Sven

youplaboom · ‎11-10-2009

In the kernel definition, if you specify the number of elements in the gather array, as in float4 ar[1024], then it is a constant buffer, not a gather stream, although its usage in your kernel is the same: read only.

From your C program, the array is passed as constants, the same way a single (scalar) constant would be passed to the kernel call. Thus you don't need to declare a stream and load the array into it.

Now if you don't specify the number of elements in the kernel definition, as in float4 ar[], then it's a gather stream, which needs to be allocated and read in your C program.

For scatter (output) arrays, you cannot specify the number of elements, since obviously it cannot be a constant array. So it's a regular stream which needs to be loaded.

In your example, you use float4 ar[] in both kernels 1 and 2, and declare a stream in your C program. If that array is used only as a passing mechanism between the two kernels and you don't need its values in your C program, you don't need to declare a corresponding C array or use the stream read/write.
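To make the data flow concrete, here is a minimal sketch in plain C++. It is not Brook+ code and does not use the Brook+ API: DeviceStream, pcie_transfers, kernel1, kernel2 and run_pipeline are hypothetical names that only model the behaviour described above, assuming that a stream read is the host-to-device copy and a stream write is the device-to-host copy.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Plain C++ model, NOT the Brook+ API: every name here is hypothetical.
static int pcie_transfers = 0;  // counts host<->device copies in this model

struct DeviceStream {
    std::vector<float> gpu;  // stands in for graphics memory
    explicit DeviceStream(std::size_t n) : gpu(n, 0.0f) {}
    // Host -> device copy (the "read" that fills a stream).
    void read(const std::vector<float>& host) { gpu = host; ++pcie_transfers; }
    // Device -> host copy (the "write" that returns results).
    void write(std::vector<float>& host) const { host = gpu; ++pcie_transfers; }
};

// Models kernel1 with a scatter output: element i lands in the mirrored slot.
void kernel1(const DeviceStream& in, DeviceStream& ar) {
    const std::size_t n = in.gpu.size();
    for (std::size_t i = 0; i < n; ++i) ar.gpu[n - 1 - i] = in.gpu[i] * 2.0f;
}

// Models kernel2 with a gather input: each output reads ar at a computed index.
void kernel2(const DeviceStream& ar, DeviceStream& out) {
    const std::size_t n = ar.gpu.size();
    for (std::size_t i = 0; i < n; ++i) out.gpu[i] = ar.gpu[(i + 1) % n] + 1.0f;
}

std::vector<float> run_pipeline(const std::vector<float>& host_in) {
    DeviceStream in(host_in.size()), ar(host_in.size()), out(host_in.size());
    in.read(host_in);     // only the real input crosses the bus
    kernel1(in, ar);      // ar is produced in device memory
    kernel2(ar, out);     // ar is consumed in device memory, no host copy
    std::vector<float> host_out;
    out.write(host_out);  // only the final result comes back
    return host_out;
}
```

In this model the intermediate array ar never has a host-side counterpart: the only transfers counted are the read of the input and the write of the final output.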

scatter/gather array: how to copy to/from graphical memor