About waiting time.

Discussion created by ztatsuch on Jun 3, 2011
Latest reply on Jun 4, 2011 by ztatsuch


Below is pseudo-code for what I tried.

-----------------------Pseudo code ---------------------------------

struct timeval tm1, tm2;
cl_ulong tc1, tc2;
cl_event event;

gettimeofday(&tm1, NULL);                 /* host-side start time */

clEnqueueNDRangeKernel(que, kernel, 1, NULL, &pe_size, &group_size, 0, NULL, &event);

clWaitForEvents(1, &event);               /* block until the kernel has finished */

/* device-side timestamps, in nanoseconds
   (the queue must be created with CL_QUEUE_PROFILING_ENABLE) */
clGetEventProfilingInfo(event, CL_PROFILING_COMMAND_START, sizeof(cl_ulong), &tc1, NULL);
clGetEventProfilingInfo(event, CL_PROFILING_COMMAND_END,   sizeof(cl_ulong), &tc2, NULL);

clReleaseEvent(event);

gettimeofday(&tm2, NULL);                 /* host-side end time */

---------------------End code--------------------------------------

Needless to say, tm2 - tm1 is the elapsed execution time measured on the host CPU, and tc2 - tc1 is the net kernel execution time measured on the GPU.

I got:

Elapsed execution time (tm2 - tm1): 3 (sec)

Net execution time (tc2 - tc1): 0.1 (sec)

In that case, 2.9 sec was spent waiting, but I don't know how that 2.9 sec was used, or what used it.

Could anyone explain the waiting time to me?