1 Reply Latest reply on Nov 10, 2009 12:09 AM by youplaboom

    scatter/gather array: how to copy to/from graphical memory

    snef

      If you create a stream (for input), you have to perform a read to fill it up. The read copies from main (= host) memory over PCIe to the graphical memory.

      However, suppose you create a scatter/gather array, as in float4 ar[1024], and fill it up in main memory. How do you create that array in graphical memory, and how do you copy it?

      I think that you can just use that array in the kernel call and Brook+ will do the copy under the hood.

      What happens if you first use a gather/scatter array as output for kernel1 and then use that array as input for kernel2? Is a copy to main memory performed?

      Where/how do you declare that array? I don't need it in main memory.

       

      Any help appreciated.

       

      Sven

       

        • scatter/gather array: how to copy to/from graphical memory
          youplaboom

          In the kernel definition, if you specify the number of elements in the gather array, as in float4 ar[1024], then it is a constant buffer, not a gather stream, although its usage is the same in your kernel: read-only.

          From your C program, the array is passed as constants, the same way a single (scalar) constant would be passed to the kernel call. Thus you don't need to declare a stream and load the array into it.

          Now if you don't specify the number of elements in the kernel definition, as in float4 ar[], then it's a gather stream, which needs to be allocated and read in your C program.
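
          As a rough sketch of the difference, assuming the .br kernel syntax with streamRead/streamWrite: the kernel names, the index stream i, the host function and the size 1024 are all made up for illustration, and passing a plain C array for the sized parameter is an assumption based on the description above, not something checked against a particular SDK release.

          ```
          // Sized gather array: handled as a constant buffer (see above).
          kernel void lookup_const(int i<>, float4 ar[1024], out float4 r<>)
          {
              r = ar[i];
          }

          // Unsized gather array: a gather stream backed by a host-declared stream.
          kernel void lookup_gather(int i<>, float4 ar[], out float4 r<>)
          {
              r = ar[i];
          }

          // Hypothetical host code in the same .br file.
          void host_side(int *index, float4 *table, float4 *result)
          {
              int    iS<1024>;
              float4 arS<1024>;
              float4 rS<1024>;

              streamRead(iS, index);

              // Constant buffer: the C array is passed directly, like a scalar
              // constant (assumption: no stream declaration or read is needed).
              lookup_const(iS, table, rS);

              // Gather stream: allocate a stream and read the host data into it.
              streamRead(arS, table);
              lookup_gather(iS, arS, rS);

              streamWrite(rS, result);
          }
          ```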

          For scatter (output) arrays, you cannot specify the number of elements, since obviously it cannot be a constant array. So it's a regular stream which needs to be loaded.

          In your example, you use float4 ar[] in both kernels 1 and 2, and declare a stream in your C program. If that array is used only as a passing mechanism between the two kernels and you don't need its values in your C program, you don't need to declare a corresponding C array or use the stream read/write.
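
          A minimal sketch of that chaining, again assuming the .br syntax; kernel1, kernel2, the permutation stream and the size 1024 are hypothetical. The point is only that ar is declared as a stream on the host but never passed to streamRead or streamWrite.

          ```
          // kernel1 scatters its input into ar.
          kernel void kernel1(float4 src<>, int dst<>, out float4 ar[])
          {
              ar[dst] = src;
          }

          // kernel2 gathers from the same ar.
          kernel void kernel2(int i<>, float4 ar[], out float4 r<>)
          {
              r = ar[i];
          }

          void run(float4 *src, int *perm, float4 *result)
          {
              float4 srcS<1024>;
              int    permS<1024>;
              float4 ar<1024>;     // intermediate stream: no matching C array
              float4 rS<1024>;

              streamRead(srcS, src);
              streamRead(permS, perm);

              kernel1(srcS, permS, ar);   // ar written by kernel1
              kernel2(permS, ar, rS);     // ar consumed by kernel2

              streamWrite(rS, result);    // only the final result is copied back
          }
          ```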