5 Replies Latest reply on Feb 5, 2010 6:29 AM by gaurav.garg

    Creating Buffers

    jajce85

      Hi, I was wondering how ATI implements the CL_MEM_ALLOC_HOST_PTR flag in the clCreateBuffer() function (i.e. where does it allocate the buffer?). This is supposed to allocate the memory on the host, but I am guessing that at some stage it has to be transferred to the device when the kernel executes.

      Is this correct? If so, how fast or slow is this implementation, and when should it be used?

       

      Thanks

        • Creating Buffers
          genaganna

           


          At present, CL_MEM_ALLOC_HOST_PTR has only a basic implementation, and its performance is almost the same as that of CL_MEM_USE_HOST_PTR. You can expect an optimized implementation in upcoming releases.

            • Creating Buffers
              nou

              So when I use CL_MEM_ALLOC_HOST_PTR, will it create pinned memory?

              When I use a *_HOST_PTR flag and execute a kernel that writes to the buffer, at what point is the host data consistent? And since the data probably has to be transferred across the bus after each kernel execution, isn't it faster to create the mem object without a *_HOST_PTR flag?

                • Creating Buffers
                  genaganna

                   

                  Originally posted by: nou So when I use CL_MEM_ALLOC_HOST_PTR, will it create pinned memory?

                  At present, pinned memory is not used.

                  When I use a *_HOST_PTR flag and execute a kernel that writes to the buffer, at what point is the host data consistent? And since the data probably has to be transferred across the bus after each kernel execution, isn't it faster to create the mem object without a *_HOST_PTR flag?

                  It is not clear to me what you are asking.

                    • Creating Buffers
                      nou

                      1. Which flag do I need to use to get pinned memory? When will it be supported?

                      2. When I specify CL_MEM_USE_HOST_PTR or CL_MEM_ALLOC_HOST_PTR and use the buffer as kernel output, isn't there overhead from copying the data from GPU to host memory?

                      3. And when is it copied from GPU to host memory? The OpenCL spec says the contents can be cached. Is it when I call clWaitForEvents()?

                        • Creating Buffers
                          gaurav.garg

                           

                          1. Which flag do I need to use to get pinned memory? When will it be supported?

                          I think both CL_MEM_USE_HOST_PTR and CL_MEM_ALLOC_HOST_PTR allocate pinned memory on the host. The only advantage of CL_MEM_USE_HOST_PTR is that it lets you avoid an extra memcpy from the host application's pointer to the CL host buffer and vice versa.

                          2. When I specify CL_MEM_USE_HOST_PTR or CL_MEM_ALLOC_HOST_PTR and use the buffer as kernel output, isn't there overhead from copying the data from GPU to host memory?

                          3. And when is it copied from GPU to host memory? The OpenCL spec says the contents can be cached. Is it when I call clWaitForEvents()?

                          I am not sure how this is implemented currently, but kernels can also directly access part of host memory. In CAL this used to be called PCIe host memory.
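
For reference, the argument rules that the OpenCL specification attaches to the *_HOST_PTR flags discussed in this thread can be sketched in plain C. This is only a minimal sketch, not ATI's implementation: check_host_ptr_args and the hostptr_check enum are hypothetical names, and the flag constants repeat the bit values from the Khronos CL/cl.h header only so that the snippet compiles without the OpenCL SDK. It says nothing about where the runtime places the buffer or whether the memory is pinned.

```c
#include <stddef.h>

/* Bit values as defined in the Khronos CL/cl.h header; repeated here only so
 * this sketch compiles without the OpenCL SDK installed. */
#ifndef CL_MEM_USE_HOST_PTR
#define CL_MEM_USE_HOST_PTR   (1u << 3)
#define CL_MEM_ALLOC_HOST_PTR (1u << 4)
#define CL_MEM_COPY_HOST_PTR  (1u << 5)
#endif

/* Hypothetical result codes for this sketch (not part of OpenCL). */
typedef enum {
    HOSTPTR_OK = 0,
    HOSTPTR_BAD_FLAGS = 1,   /* clCreateBuffer would report CL_INVALID_VALUE    */
    HOSTPTR_BAD_POINTER = 2  /* clCreateBuffer would report CL_INVALID_HOST_PTR */
} hostptr_check;

/* Hypothetical helper: checks the flags/host_ptr pair the way the spec
 * describes for clCreateBuffer. */
hostptr_check check_host_ptr_args(unsigned long long flags, const void *host_ptr)
{
    int use   = (flags & CL_MEM_USE_HOST_PTR)   != 0;
    int alloc = (flags & CL_MEM_ALLOC_HOST_PTR) != 0;
    int copy  = (flags & CL_MEM_COPY_HOST_PTR)  != 0;

    /* USE_HOST_PTR is mutually exclusive with ALLOC_HOST_PTR and with
     * COPY_HOST_PTR; ALLOC_HOST_PTR | COPY_HOST_PTR is allowed. */
    if (use && (alloc || copy))
        return HOSTPTR_BAD_FLAGS;

    /* host_ptr must be non-NULL exactly when USE_HOST_PTR or COPY_HOST_PTR
     * is set, and NULL otherwise. */
    int needs_ptr = use || copy;
    if (needs_ptr != (host_ptr != NULL))
        return HOSTPTR_BAD_POINTER;

    return HOSTPTR_OK;
}
```

Under these rules CL_MEM_ALLOC_HOST_PTR is passed with a NULL host_ptr and the runtime chooses the allocation, while CL_MEM_USE_HOST_PTR requires the application's own pointer, which is where the saved memcpy mentioned above comes from.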