FangQ
Adept I

help with event timing

clGetEventProfilingInfo always gives 0 elapsed time

I followed some examples that use cl_event to measure the elapsed time of my kernel; however, the output is always 0. Can anyone point out what I am doing wrong?

In my timing.c unit, I defined:

#include <CL/cl.h>  /* header name missing from the post; the standard OpenCL header is assumed */
static cl_ulong timerStart, timerStop;
cl_event kernelevent;

unsigned int GetTimeMillis () {
  float elapsedTime;
  clGetEventProfilingInfo(kernelevent, CL_PROFILING_COMMAND_START,
                        sizeof(cl_ulong), &timerStart, NULL);
  clGetEventProfilingInfo(kernelevent, CL_PROFILING_COMMAND_END,
                        sizeof(cl_ulong), &timerStop, NULL);
  elapsedTime=(timerStop - timerStart)*1e-6;
  return (unsigned int)(elapsedTime);
}

In my host.cpp, I attach the event to the kernel launch and read the timer, as in the code below.

Thank you for any hints or suggestions.

 

extern cl_event kernelevent;
...
mcx_assess(clEnqueueNDRangeKernel(commands,kernel,1,NULL,(size_t*)(&(cfg->nthread)),
           (size_t*)(&(mcblock)), 0, NULL, &kernelevent));
mcx_assess(clEnqueueReadBuffer(commands,gfield,CL_TRUE,0,sizeof(cl_float),
           field, 0, NULL, NULL));
fprintf(cfg->flog,"kernel complete: \t%d ms\n",GetTimeMillis());
...

FangQ
Adept I

Figured it out myself: I needed to pass the CL_QUEUE_PROFILING_ENABLE flag in the properties argument when calling clCreateCommandQueue. After adding that flag, everything works fine.

I also noticed that compiling the cl program for the GPU is quite slow on ATI cards; it took a few seconds. On the CPU backend, the compilation is instant.
