1 Reply Latest reply on Feb 23, 2010 3:27 AM by FangQ

    help with event timing

    FangQ
      clGetEventProfilingInfo always gives 0 elapsed time

      I followed some examples that use cl_event to track the elapsed time of my kernel; however, the output is always 0. Can anyone point out what I am doing wrong?

      In my timing.c unit, I defined:

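      [code]#include <CL/cl.h>   /* standard OpenCL header; <OpenCL/opencl.h> on macOS */

      static cl_ulong timerStart, timerStop;
      cl_event kernelevent;

      /* Elapsed kernel time in whole milliseconds; returns 0 if a profiling query fails. */
      unsigned int GetTimeMillis () {
        float elapsedTime;
        /* Profiling timestamps are device counters in nanoseconds. */
        if (clGetEventProfilingInfo(kernelevent, CL_PROFILING_COMMAND_START,
                              sizeof(cl_ulong), &timerStart, NULL) != CL_SUCCESS ||
            clGetEventProfilingInfo(kernelevent, CL_PROFILING_COMMAND_END,
                              sizeof(cl_ulong), &timerStop, NULL) != CL_SUCCESS)
          return 0;
        elapsedTime=(timerStop - timerStart)*1e-6;  /* ns -> ms */
        return (unsigned int)(elapsedTime);
      }[/code]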
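      A side note on the conversion itself: the profiling counters are in nanoseconds, and casting the result to unsigned int truncates, so any kernel that runs for less than 1 ms is reported as 0 even when profiling works. Below is a minimal sketch of a conversion that keeps the fractional part. The helper name elapsed_ms is hypothetical (it is not part of OpenCL), and unsigned long long is used as a stand-in for cl_ulong so the snippet compiles without the OpenCL headers.

```c
/* Hypothetical helper, not part of OpenCL: converts a pair of profiling
   timestamps (in nanoseconds) to elapsed milliseconds.
   unsigned long long stands in for cl_ulong, a 64-bit unsigned integer. */
static double elapsed_ms(unsigned long long start_ns, unsigned long long end_ns)
{
    /* An inconsistent pair would wrap around in unsigned arithmetic,
       so report 0 instead of a huge bogus value. */
    if (end_ns < start_ns)
        return 0.0;
    return (double)(end_ns - start_ns) / 1.0e6;   /* ns -> ms */
}
```

      With this, the result could be printed with a format such as "%.3f ms" instead of "%d ms", so that sub-millisecond kernels still show a non-zero time.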

      In my host.cpp, I invoked the event as in the attached code below.

      Thank you for any hints or suggestions.

       

      extern cl_event kernelevent;
      ...
      mcx_assess(clEnqueueNDRangeKernel(commands,kernel,1,NULL,(size_t*)(&(cfg->nthread)),
                 (size_t*)(&(mcblock)), 0, NULL, &kernelevent));
      mcx_assess(clEnqueueReadBuffer(commands,gfield,CL_TRUE,0,sizeof(cl_float),
                 field, 0, NULL, NULL));
      fprintf(cfg->flog,"kernel complete: \t%d ms\n",GetTimeMillis());
      ...

        • help with event timing
          FangQ

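          Figured it out myself: the CL_QUEUE_PROFILING_ENABLE flag has to be passed as the properties argument when calling clCreateCommandQueue. Without it, the queue does not record profiling data and clGetEventProfilingInfo fails with CL_PROFILING_INFO_NOT_AVAILABLE, leaving the timestamps untouched. After adding that flag, everything works fine.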

          I also noticed that compiling the CL program for the GPU is quite slow on ATI cards: it took a few seconds, whereas compilation for the CPU backend is instant.