4 Replies Latest reply on Jun 8, 2011 7:47 AM by himanshu.gautam

    Copying Data over to GPU

    richeek.arya
      Conceptual Question

      Hi,

      I have a question about copying data over to the GPU. This is how I am doing it:

      It is a four-step process:

      1. clCreateBuffer

      2. Get a pointer using clEnqueueMapBuffer

      3. Copy the data to this pointer using memcpy (the input data is supplied by the user)

      4. Unmap the buffer

      At the end, I free the memory using this:

      status = clReleaseMemObject(s_re_d);

       

      Is it correct that I do not need to free s_re_p, since I never allocated memory for it? Could someone please let me know whether all these steps follow the OpenCL guidelines?

      With regards,

      Richeek

      cl_mem s_re_d;
      size = nT * nSym * sizeof(float);
      total_size += size;

      /* 1. Create the device buffer. */
      s_re_d = clCreateBuffer(context, CL_MEM_READ_WRITE, size, NULL, &status);
      if (status != CL_SUCCESS) {
          mexPrintf("Error: clCreateBuffer\n");
          return;
      }

      /* 2. Map the buffer. The map is blocking (CL_TRUE) so the pointer is
            valid as soon as the call returns; with CL_FALSE it must not be
            used until the map command has completed. */
      float *s_re_p;
      s_re_p = (float *)clEnqueueMapBuffer(commandQueue, s_re_d, CL_TRUE, CL_MAP_WRITE,
                                           0, size, 0, NULL, NULL, &status);
      if (status != CL_SUCCESS) {
          mexPrintf("Error: clEnqueueMapBuffer\n");
          return;
      }

      /* 3. Copy the user-supplied input into the mapped region. */
      memcpy(s_re_p, s_re, size);

      /* 4. Unmap so the data is available on the GPU again. */
      status = clEnqueueUnmapMemObject(commandQueue, s_re_d, (void *)s_re_p, 0, NULL, &ev);
      if (status != CL_SUCCESS) {
          mexPrintf("clEnqueueUnmapMemObject() failed\n");
          return;
      }
      status = clWaitForEvents(1, &ev);
      if (status != CL_SUCCESS) {
          mexPrintf("clWaitForEvents() failed for s_re_d\n");
          return;
      }
      clReleaseEvent(ev); /* events returned by enqueue calls are reference-counted */

        • Copying Data over to GPU
          himanshu.gautam

          AFAIK the method is correct for read_write buffers.

          You don't need to call free(s_re_p). The mapped pointer is released when the buffer is unmapped.

            • Copying Data over to GPU
              richeek.arya

              Thanks Himanshu for checking it.

              On the system with an NVIDIA GTX 260 and 64-bit Windows 7, I had to do this too:

              s_re_p = NULL;

              Otherwise, CPU physical memory usage shoots up as I run the program in a loop.

              However, as I mentioned earlier, with the ATI Radeon 5450 on 64-bit Windows 7 I did not set s_re_p = NULL; when I return, and memory usage still stays bounded.

              I cannot explain this behavior, since host-side memory should be allocated by the operating system, which is Windows 7 in both cases. Or am I wrong, and in an OpenCL program this is done by the SDK?

              If this is SDK-dependent, that may explain it, since NVIDIA and AMD have different SDKs.

              Anyway, setting the pointer to NULL when you leave is good practice, I suppose, so I will stick to it.
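What assigning NULL actually does can be checked in plain C, without any OpenCL. The following is a minimal sketch under stated assumptions: `region_t`, `region_create`, `region_release` and `null_does_not_free` are hypothetical names that stand in for a runtime-owned memory region and are not part of any SDK. It shows that setting a borrowed pointer to NULL changes only the local variable; the memory stays allocated until its owner releases it.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a runtime-owned region (not an OpenCL type). */
typedef struct {
    float *storage;   /* owned by the "runtime", never by the caller */
    size_t count;
} region_t;

region_t region_create(size_t count) {
    region_t r;
    r.storage = calloc(count, sizeof(float));
    r.count = r.storage ? count : 0;
    return r;
}

void region_release(region_t *r) {
    free(r->storage);   /* the owner is the only one that frees */
    r->storage = NULL;
    r->count = 0;
}

/* Writes through a borrowed pointer, then sets the local copy to NULL.
   Returns 1 if the region still holds the data afterwards. */
int null_does_not_free(void) {
    region_t r = region_create(4);
    if (r.storage == NULL) {
        return 0;
    }

    float *p = r.storage;   /* borrowed, in the way a mapped pointer is */
    p[0] = 1.5f;
    p = NULL;               /* only the local variable changes */

    int still_there = (p == NULL) && (r.storage != NULL) && (r.storage[0] == 1.5f);
    assert(r.count == 4);   /* the region itself is untouched */

    region_release(&r);     /* this is what actually returns the memory */
    return still_there;
}
```

Calling `null_does_not_free()` returns 1 in this toy model. Whether a particular OpenCL runtime keeps host memory alive for other reasons is a separate question that this sketch does not answer.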

              Thanks,

              Richeek

                • Copying Data over to GPU
                  richeek.arya

                   

                  Originally posted by: richeek.arya

                  On the system with an NVIDIA GTX 260 and 64-bit Windows 7, I had to do this too:

                  s_re_p = NULL;

                  Otherwise, CPU physical memory usage shoots up as I run the program in a loop.


                  Actually, I take the above statement back. Memory usage increased again after some iterations, and after I rebooted the system the program started failing at clCreateContext() with error code -6 (CL_OUT_OF_HOST_MEMORY). This happens only on NVIDIA's hardware. I then ported my code to CUDA and everything started working perfectly fine.
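When a call such as clCreateContext() fails with a bare number like -6, printing the symbolic name next to it makes logs easier to read. Below is a minimal sketch in plain C: `cl_status_name` and `format_cl_error` are hypothetical helpers, not part of OpenCL, and only a small subset of codes is listed. The numeric values are the ones defined in the Khronos cl.h header.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: maps a few OpenCL status codes to their names.
   Only a subset is covered; anything else falls through to the default. */
const char *cl_status_name(int status) {
    switch (status) {
    case 0:  return "CL_SUCCESS";
    case -1: return "CL_DEVICE_NOT_FOUND";
    case -2: return "CL_DEVICE_NOT_AVAILABLE";
    case -3: return "CL_COMPILER_NOT_AVAILABLE";
    case -4: return "CL_MEM_OBJECT_ALLOCATION_FAILURE";
    case -5: return "CL_OUT_OF_RESOURCES";
    case -6: return "CL_OUT_OF_HOST_MEMORY";
    default: return "unknown status (see cl.h)";
    }
}

/* Formats "<call> failed: <name> (<code>)" into buf and returns buf. */
char *format_cl_error(char *buf, size_t len, const char *call, int status) {
    snprintf(buf, len, "%s failed: %s (%d)", call, cl_status_name(status), status);
    return buf;
}
```

In the MEX code from the first post, the formatted string could be passed to mexPrintf in place of the fixed messages, so that each failure reports which status the call returned.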