I am measuring GPU performance on an A10-7850K APU. I am finding it hard to get the real throughput of an application on the GPU, because the CPU thread that measures the GPU execution time seems to be frequently scheduled out by the OS while it waits for the GPU to complete.
I measured the average latency of a simple application on the GPU, which gives a throughput of 7 MOPS (million operations per second). After I added a while(1) loop in the CPU thread, the thread seems to stay busy and the GPU throughput is measured as 12 MOPS. I have also tried polling clGetEventInfo in a while loop, but it doesn't help.
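For context, here is a minimal sketch of the kind of host-side measurement described above. It is not the actual benchmark code. The names `run_gpu_batch`, `now_ns`, `mops` and `measure_mops` are hypothetical, and `run_gpu_batch` is only a CPU stand-in for what would be a kernel enqueue followed by a blocking wait (for example `clEnqueueNDRangeKernel` plus `clFinish`) in a real OpenCL program.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L   /* expose clock_gettime on strict compilers */
#endif
#include <stdint.h>
#include <time.h>

/* Hypothetical stand-in for one batch of GPU work. In a real OpenCL host
 * program this would enqueue the kernel and then block until it finishes. */
static void run_gpu_batch(void)
{
    volatile uint64_t sink = 0;
    for (uint64_t i = 0; i < 100000; ++i) {
        sink += i;                /* placeholder work so the call takes time */
    }
}

/* Current monotonic wall-clock time in nanoseconds, as seen by the host. */
static uint64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Convert an operation count and an elapsed time in nanoseconds to MOPS
 * (million operations per second): ops / (ns * 1e-9) / 1e6 = ops * 1e3 / ns. */
static double mops(uint64_t ops, uint64_t elapsed_ns)
{
    if (elapsed_ns == 0) {
        return 0.0;               /* avoid dividing by zero on coarse clocks */
    }
    return (double)ops * 1e3 / (double)elapsed_ns;
}

/* Time `iterations` batches from the host side and report the throughput.
 * The elapsed time covers everything between the two clock reads, including
 * any period in which this thread was not running. */
static double measure_mops(void (*batch)(void), uint64_t ops_per_batch,
                           int iterations)
{
    uint64_t start = now_ns();
    for (int i = 0; i < iterations; ++i) {
        batch();
    }
    uint64_t elapsed = now_ns() - start;
    return mops(ops_per_batch * (uint64_t)iterations, elapsed);
}
```

Calling `measure_mops(run_gpu_batch, ops_per_batch, iterations)` returns the throughput as the host sees it. Because both timestamps are taken on the CPU thread, any delay in waking that thread after the GPU finishes is counted as GPU time, which is the effect I suspect here.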
Is there any way to measure the GPU execution time precisely?