FangQ

Pass value from device to host during kernel execution

Discussion created by FangQ on Mar 19, 2018
Latest reply on Mar 20, 2018 by dipak

I would like to print a progress bar for my OpenCL code during kernel execution. The CUDA version of this code achieves this using pinned memory. I am trying to implement the same thing in OpenCL using CL_MEM_ALLOC_HOST_PTR and clEnqueueMapBuffer, but the result is quite strange.

 

Here is a snippet of the relevant code:

 

void host_function(){
    cl_uint *progress=NULL;
    cl_mem *gprogress;

    gprogress=(cl_mem *)malloc(1*sizeof(cl_mem));

    // define a host_ptr buffer, alloc in the pinned memory
    OCL_ASSERT(((gprogress[0]=clCreateBuffer(mcxcontext,(CL_MEM_READ_ONLY | CL_MEM_ALLOC_HOST_PTR), sizeof(cl_uint),NULL,&status),status)));

    // initialize the pinned memory buffer
    progress = (cl_uint *)clEnqueueMapBuffer(mcxqueue[0], gprogress[0], CL_TRUE, CL_MAP_WRITE, 0, sizeof(cl_uint), 0, NULL, NULL, NULL);
    *progress=0;
    clEnqueueUnmapMemObject(mcxqueue[0], gprogress[0], progress, 0, NULL, NULL);

    OCL_ASSERT((clSetKernelArg(mcxkernel[i],10, sizeof(cl_mem), (void*)(gprogress))));

    // launch kernel
    OCL_ASSERT((clEnqueueNDRangeKernel(mcxqueue[devid],mcxkernel[devid],1,NULL,&gpu[devid].autothread,&gpu[devid].autoblock, 0, NULL, NULL)));

    if((param.debuglevel & MCX_DEBUG_PROGRESS)){
        // after launching the kernel, check progress by reading gprogress[0]
        progress = (cl_uint *)clEnqueueMapBuffer(mcxqueue[0], gprogress[0], CL_FALSE, CL_MAP_READ, 0, sizeof(cl_uint), 0, NULL, NULL, NULL);
        do{
            ndone = *progress;
            MCX_FPRINTF(cfg->flog,"progress=%d\n",ndone);
        }while (ndone < maxcount);
        clEnqueueUnmapMemObject(mcxqueue[0], gprogress[0], progress, 0, NULL, NULL);
    }

    OCL_ASSERT((clFinish(mcxqueue[devid])));
}

 

Inside the kernel, I increment gprogress[0]. I was hoping that the do/while loop could read the updated value through progress and print it during kernel execution.
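For illustration, the kernel-side increment could look roughly like the sketch below. This is not the actual kernel: the name `demo_progress`, the `nwork` parameter, and the stand-in block under `#ifndef __OPENCL_VERSION__` (including `fake_gid`) are hypothetical and exist only so the same source can also be compiled as plain C for a quick host-side check. It assumes the device supports `atomic_inc` on 32-bit counters in global memory (core since OpenCL 1.1) and that the buffer is writable by the kernel.

```c
/* Sketch only: names are hypothetical; in the real kernel the counter
   is passed as argument index 10. */
#ifndef __OPENCL_VERSION__
/* Stand-ins so this source also builds as plain C on the host. */
#define __kernel
#define __global
static unsigned int fake_gid = 0;   /* hypothetical: emulated work-item id */
static unsigned int get_global_id(unsigned int dim) { (void)dim; return fake_gid; }
static unsigned int atomic_inc(volatile unsigned int *p) { return (*p)++; }
#endif

__kernel void demo_progress(volatile __global unsigned int *progress,
                            unsigned int nwork)
{
    unsigned int gid = (unsigned int)get_global_id(0);

    if (gid < nwork) {
        /* ... the per-work-item computation would run here ... */

        /* One tick per finished work-item; atomic so that concurrent
           work-items do not lose updates to the shared counter. */
        atomic_inc(progress);
    }
}
```

With this pattern the counter ends at `nwork` once every work-item has finished, so `maxcount` on the host side would have to match that number.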

 

However, what I see is that it keeps printing progress=0 at the beginning. Sometimes, after about 10 seconds, it prints a big jump in the progress value, which then stays the same for another 10 seconds or more. Sometimes it just keeps printing without ever exiting the while loop (because the value never reaches the expected maxcount).
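Because the polling loop prints on every iteration, the log fills with repeated lines. A minimal sketch of a throttled print, which only emits a line when the integer percentage changes, is shown below. The helper names `progress_percent` and `report_progress` are hypothetical and not part of the code above; plain `unsigned int` stands in for `cl_uint`.

```c
#include <stdio.h>

/* Hypothetical helper: convert a raw counter into an integer percentage,
   clamped to the range 0..100. A zero maxcount is treated as "done". */
static unsigned int progress_percent(unsigned int ndone, unsigned int maxcount)
{
    if (maxcount == 0 || ndone >= maxcount)
        return 100;
    /* Widen before multiplying so ndone * 100 cannot overflow 32 bits. */
    return (unsigned int)(((unsigned long long)ndone * 100ULL) / maxcount);
}

/* Hypothetical helper: print only when the percentage differs from the
   one printed last; returns the current percentage for the next call. */
static unsigned int report_progress(FILE *out, unsigned int ndone,
                                    unsigned int maxcount, unsigned int last)
{
    unsigned int pct = progress_percent(ndone, maxcount);
    if (pct != last)
        fprintf(out, "progress=%u%%\n", pct);
    return pct;
}
```

Inside the do/while loop, the call would replace the unconditional print, for example `last = report_progress(stdout, ndone, maxcount, last);` with `last` initialized to a value above 100 so that the first reading is always printed.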

 

Can someone tell me if this is the correct way to implement a progress bar in OpenCL? How can I make it work?

 

Thanks
