3 Replies Latest reply on Jun 26, 2008 10:27 PM by rahulgarg

    Extended calMemCopy feature request

      I was wondering if it's possible to get a more generic calMemCopy operation:

      Consider a 2D resource A and a 2D resource B. It is currently
      not possible to copy pieces of A into B. It would be much better if an
      extended version of calMemCopy were provided that could copy a
      rectangular block from an arbitrary starting position inside A to an
      arbitrary position inside B. Moreover, it would be nice if it didn't
      actually look at the formats and just copied the data over as is, as
      long as the sizes provided are appropriate.
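
      To make the requested semantics concrete, here is a minimal CPU-side sketch in plain C. It is not a CAL API: the name `copy_rect_2d` and all of its parameters are hypothetical, and it only illustrates a format-agnostic rectangular copy between two pitched 2D buffers, with offsets and widths given in bytes.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, NOT part of CAL. Copies a rectangle of
 * width_bytes x rows from (src_x_bytes, src_y) in src to
 * (dst_x_bytes, dst_y) in dst. The element format is never inspected;
 * only byte sizes matter. The caller must ensure that both buffers
 * contain enough rows. */
static int copy_rect_2d(void *dst, size_t dst_pitch,
                        size_t dst_x_bytes, size_t dst_y,
                        const void *src, size_t src_pitch,
                        size_t src_x_bytes, size_t src_y,
                        size_t width_bytes, size_t rows)
{
    /* Reject rectangles that would run past the end of a row. */
    if (src_x_bytes + width_bytes > src_pitch ||
        dst_x_bytes + width_bytes > dst_pitch)
        return -1;

    const unsigned char *s =
        (const unsigned char *)src + src_y * src_pitch + src_x_bytes;
    unsigned char *d =
        (unsigned char *)dst + dst_y * dst_pitch + dst_x_bytes;

    /* Copy one row at a time, stepping each pointer by its own pitch. */
    for (size_t r = 0; r < rows; ++r) {
        memcpy(d, s, width_bytes);
        s += src_pitch;
        d += dst_pitch;
    }
    return 0;
}
```

      For example, `copy_rect_2d(b, 6, 3, 2, a, 4, 1, 1, 2, 2)` would copy a 2x2 byte block starting at column 1, row 1 of a 4-byte-wide buffer `a` to column 3, row 2 of a 6-byte-wide buffer `b`. The request is for the equivalent operation to run on the GPU side.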

      Currently, the only way to copy data (without involving the CPU) is to
      copy the entire resource A to B, which can only be done if the size and
      format of A and B match. The other option is to map, involve the CPU,
      then unmap, but that's undesirable on several levels.

      This is very problematic in many cases. For example, take the global buffer. If I want to use the global buffer to store 2 matrices, and I want to copy only 1 of those matrices to the CPU at some point in time, then that's not possible.

      As another motivation, CUDA does offer a very flexible memcpy.

      I think a more flexible calMemCopy is highly desirable.
        • Extended calMemCopy feature request
          Request filed! :-)

          • Extended calMemCopy feature request

            In the meantime, there is an easy workaround for this. You can use the CAL Domain Parameters extension to define a custom interpolant that covers the rectangular block of the input resource you want to copy. Then, set the domain to an equal-sized block on the output resource. Use an "identity" kernel (one that just copies its input to its output) to do the transfer. This should not involve the CPU at all.
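
            The idea can be modeled on the CPU as follows. This is only a sketch of the logic, not CAL code: `rect_t`, `identity_kernel`, and `run_over_domain` are hypothetical names, the loop stands in for the GPU running the kernel once per position in the output domain, and the computed input coordinates stand in for the custom interpolant.

```c
#include <stddef.h>

/* Hypothetical type standing in for the output domain rectangle. */
typedef struct { int x, y, width, height; } rect_t;

/* "Identity" kernel: returns its input unchanged. A converting
 * variant could perform a type conversion here instead. */
static float identity_kernel(float in) { return in; }

/* CPU model of the workaround. The kernel runs once per position in
 * out_domain; (in_x0, in_y0) is the top-left corner of the equal-sized
 * input block that the interpolant is assumed to cover. */
static void run_over_domain(float *out, int out_width, rect_t out_domain,
                            const float *in, int in_width,
                            int in_x0, int in_y0)
{
    for (int y = 0; y < out_domain.height; ++y) {
        for (int x = 0; x < out_domain.width; ++x) {
            /* Input coordinate supplied by the interpolant. */
            int ix = in_x0 + x;
            int iy = in_y0 + y;
            /* Output coordinate inside the domain. */
            int ox = out_domain.x + x;
            int oy = out_domain.y + y;
            out[oy * out_width + ox] =
                identity_kernel(in[iy * in_width + ix]);
        }
    }
}
```

            With a 2x2 domain at (1, 2) in the output and an input block starting at (2, 0), only those four output elements are written; everything outside the domain is left untouched.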

            One caveat is that the CAL Domain Parameters extension is not yet documented anywhere (as far as I can tell, anyway), so you will have to look at cal_ext.h and dig through the Brook+ source code to learn how to use it. This extension is how Brook+ implements its stream.domain() functionality. It's extremely useful, and it beats me why it is not documented anywhere.

            As far as working with different formats, you can extend the "identity" kernel to do the type conversion during the copy.

            Happy hacking,



            • Extended calMemCopy feature request
              Thanks for the info on domain parameters!
              Will look into it.