Originally posted by: RyFo18 I have been utilizing the clGetEventProfilingInfo function within my Host code to get timing information related to kernels. I'm experiencing some sort of bottleneck within my code and I'd like to investigate it further. Ideally, I would like to put something within my OpenCL code that would allow me to measure the time it takes (with micro or nanosecond precision) to execute various parts of my kernel.
I am currently testing a CPU implementation. I have utilized a timer.h file in the past for doing something similar within my C++ code, but this is a C++ based file and I don't believe I can use it within my OpenCL code. Does anyone have any suggestions as to how I can time various portions of my kernel code (not the whole kernel itself)?
There is no direct way to time individual portions of a kernel in OpenCL, but you can approximate it as follows:
1. Divide the kernel into parts guarded by preprocessor definitions such as STAGE_1 ... STAGE_N.
2. Compile and run the kernel once per configuration, enabling the stages one by one, and compare the timings of the runs.
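A minimal sketch of the two steps above, under some assumptions: the kernel body, the names kKernelSource, buildOptions and perStageNs, and the cumulative way of enabling stages are all made up for illustration. The options string would be passed as the options argument of clBuildProgram, and each build would be timed as a whole kernel with clGetEventProfilingInfo as before. Every stage feeds the value written to out, because otherwise the compiler may drop the stage as dead code. The differences are only estimates, since the compiler can optimize each variant differently.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical kernel: each stage is guarded by its own macro and feeds `out`.
static const char* kKernelSource = R"CLC(
__kernel void staged(__global const float* in, __global float* out)
{
    size_t i = get_global_id(0);
    float v = in[i];
#ifdef STAGE_1
    v = v * 2.0f;
#endif
#ifdef STAGE_2
    v = sqrt(v + 1.0f);
#endif
#ifdef STAGE_3
    v = v * v + in[i];
#endif
    out[i] = v;
}
)CLC";

// Build options that enable stages 1..stagesEnabled,
// e.g. "-D STAGE_1 -D STAGE_2" for stagesEnabled == 2.
std::string buildOptions(int stagesEnabled)
{
    assert(stagesEnabled >= 0);
    std::string options;
    for (int k = 1; k <= stagesEnabled; ++k) {
        if (!options.empty()) options += ' ';
        options += "-D STAGE_" + std::to_string(k);
    }
    return options;
}

// cumulativeNs[k] is the measured kernel time with stages 1..k enabled
// (index 0 is the baseline with no stage enabled).
// Returns the estimated cost of each stage as the difference between runs.
std::vector<long long> perStageNs(const std::vector<long long>& cumulativeNs)
{
    std::vector<long long> stageNs;
    for (std::size_t k = 1; k < cumulativeNs.size(); ++k) {
        stageNs.push_back(cumulativeNs[k] - cumulativeNs[k - 1]);
    }
    return stageNs;
}
```

With N stages the host would build and run the kernel N + 1 times, using buildOptions(0) through buildOptions(N), and hand the measured times to perStageNs.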
I hope this helps you.
You can use your timer code in the host code that calls the OpenCL runtime, provided you build it with a C++ compiler, but not in the kernel code.
Standard headers are not supported in kernel code.
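For the host side, a rough sketch of such a timer could look like the following. It uses std::chrono from the C++ standard library rather than the timer.h mentioned in the question, and elapsedNs is a made-up helper name. In real host code the callable would enqueue the kernel and then call clFinish, so that the measurement covers the whole execution and not just the enqueue.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Runs `work` once and returns the elapsed wall-clock time in nanoseconds.
// Host-side only: none of this can be used inside kernel code.
long long elapsedNs(const std::function<void()>& work)
{
    assert(work);  // an empty std::function cannot be called
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
}
```

This measures the kernel launch as seen from the host, including queueing overhead, so it will usually read somewhat higher than the event profiling values.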