14 Replies Latest reply on Jul 10, 2012 12:35 PM by jeff_golds

    Problems understanding memory objects (programming guide, June 2012)

    cadorino

      Hi everybody.
      I'm reading the latest version of the AMD APP programming guide (June 2012).
      I'm having trouble fully understanding the OpenCL Memory Objects part (sec. 4.5).

      Unless otherwise specified, assume the kernel is executed on a discrete GPU.

       

      1) In sec. 4.5.1.2 the guide says:

      Currently, the runtime recognizes only data that is in pinned host memory for operation arguments that are memory objects it has allocated in pinned host memory. For example, the buffer argument of clEnqueueReadBuffer/clEnqueueWriteBuffer and image argument of clEnqueueReadImage/clEnqueueWriteImage. It does not detect that the ptr arguments of these operations addresses pinned host memory, even if they are the result of clEnqueueMapBuffer/clEnqueueMapImage on a memory object that is in pinned host memory.

       

      Now, suppose that I create a buffer using CL_MEM_ALLOC_HOST_PTR and get a pointer to it using clEnqueueMapBuffer, so that I can initialize the content of the buffer directly. Does the pinning happen when I create the buffer (pre-pinning) or when I map it?
      Is the pre-pinning mechanism the same regardless of the size of the memory area to be pinned?

       

      In addition, suppose that I use the mapped pointer as the source (src) of a write into a "normal" buffer (created with no flags). Since the source is not recognized as pinned, what happens? Is it first copied to another pinned memory area?

       

      Is the content of the buffer cached on the CPU when the CPU accesses it, regardless of the kernel access mode (READ_ONLY, READ_WRITE, WRITE_ONLY)?

       

      2) In sec. 4.5.2 the guide says:

      To avoid over-allocating device memory for memory objects that are never used on that device, space is not allocated until first used on a device-by-device basis.

       

      This is quite difficult to understand. Suppose I create a buffer and call clEnqueueWriteBuffer to initialize it. Since the guide says that allocation happens at first kernel access, where is the data stored before the kernel executes (or if I never execute a kernel at all)?

       

      3) Table 4.2 says that CL_MEM_USE_HOST_PTR causes a copy when mapped. Nevertheless, in sec. 4.5.4.1 the guide says CL_MEM_USE_HOST_PTR supports zero copy. Is this an error, or is there something I do not understand?

       

      Thank you very much!