Recently I ran an OpenCL workload that executes concurrent kernels, and noticed that the execution times reported by the GPU performance counters and by the Application Timeline Trace don't match.
Is there any difference in how the GPU performance counters and the Application Timeline Trace obtain running time information?
By "don't match" I mean that there is a huge difference in the reported running times.
The two show almost totally different results.
Which one should I believe?
I'm using CodeXL 1.9 on Ubuntu 12.04.
Screenshots are attached.