Hey,
I am just starting with OpenCL programming, and as a first project I picked a fairly simple algorithm: a Gaussian filter.
Now I wanted to try out the profiling feature, so:
1. I enabled profiling on the command queue:
commandQueue = clCreateCommandQueue(context, devices[0], CL_QUEUE_PROFILING_ENABLE, NULL);
2. I wrote a callback function that reads and prints the profiling values:
void CL_CALLBACK eventCallback(cl_event ev, cl_int ev_status, void *user_data)
{
    int evID = (long) user_data;
    cl_int errNum;
    cl_ulong ev_start_time = (cl_ulong) 0;
    cl_ulong ev_stop_time = (cl_ulong) 0;
    size_t return_bytes;
    double run_time;
    printf("PROFILING: Event callback %d %d ", (int) ev_status, evID);
    //read back the command event queued counter value
    errNum = clGetEventProfilingInfo(ev, CL_PROFILING_COMMAND_QUEUED, sizeof(cl_ulong), &ev_start_time, &return_bytes);
    if (errNum != CL_SUCCESS)
    {
        printf("ERROR: get CL_PROFILING_COMMAND_QUEUED did not succeed");
    }
    //read back the command event end counter value
    errNum = clGetEventProfilingInfo(ev, CL_PROFILING_COMMAND_END, sizeof(cl_ulong), &ev_stop_time, &return_bytes);
    if (errNum != CL_SUCCESS)
    {
        printf("ERROR: get CL_PROFILING_COMMAND_END did not succeed");
    }
    //calculate the run time from start and stop time
    run_time = (double) (ev_stop_time - ev_stop_time);
    //output the result
    printf("\n Kernel run time %f secs \n", run_time * 1.0e-9);
}
3. I attached an event to the kernel launch and registered the callback on it:
errNum = clEnqueueNDRangeKernel(commandQueue, kernel1, 2, NULL, globalWorkSize, localWorkSize, 0, NULL, &filter_order_event);
errNum = clSetEventCallback(filter_order_event, CL_COMPLETE, &eventCallback,(void *)ID);
Now I keep getting 0 as the result instead of actual values, even though I can see the GPU working on the kernel for some time. I also added clFinish(commandQueue), but the result is the same. Am I missing something?
Thanks for helping,
kind regards,
Tim
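By the way, I have found the issue. The line
run_time = (double) (ev_stop_time - ev_stop_time);
should become
run_time = (double) (ev_stop_time - ev_start_time);
A very stupid mistake 😉: subtracting the stop time from itself always gives 0, no matter how long the kernel ran.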
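For anyone who runs into the same thing, here is a minimal sketch of how the subtraction could be pulled into a small helper so the two timestamps are harder to mix up. The names `event_times`, `ns_to_seconds`, `total_seconds` and `exec_seconds` are hypothetical and not part of OpenCL. The sketch uses plain `uint64_t` in place of `cl_ulong` so it compiles without the OpenCL headers, and it assumes the struct has already been filled from `clGetEventProfilingInfo`, which reports its counters in nanoseconds.

```c
#include <stdint.h>

/* Hypothetical container for the four profiling counters of one event.
   All values are in nanoseconds, as reported by clGetEventProfilingInfo. */
typedef struct {
    uint64_t queued;  /* CL_PROFILING_COMMAND_QUEUED */
    uint64_t submit;  /* CL_PROFILING_COMMAND_SUBMIT */
    uint64_t start;   /* CL_PROFILING_COMMAND_START  */
    uint64_t end;     /* CL_PROFILING_COMMAND_END    */
} event_times;

/* Elapsed time between two nanosecond counters, in seconds.
   Returns 0.0 if the counters are out of order instead of letting
   the unsigned subtraction wrap around. */
static double ns_to_seconds(uint64_t from_ns, uint64_t to_ns)
{
    return to_ns >= from_ns ? (double)(to_ns - from_ns) * 1.0e-9 : 0.0;
}

/* Time from enqueueing the command until it finished
   (what the callback above measures once the bug is fixed). */
static double total_seconds(const event_times *t)
{
    return ns_to_seconds(t->queued, t->end);
}

/* Time the command actually spent executing on the device. */
static double exec_seconds(const event_times *t)
{
    return ns_to_seconds(t->start, t->end);
}
```

Note that QUEUED to END also includes the time the command spent waiting in the queue, so if only the kernel execution time is of interest, START to END (`exec_seconds` in this sketch) is probably the better pair to read.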