I am seeing strange behavior on my GPU. My setup:
- AMD Fusion Processor (E350 + integrated HD 6310)
- Ubuntu Linux
- OpenCL 1.2 AMD-APP (923.1)
When I run a kernel on the GPU, the kernel execution time (taken from clGetEventProfilingInfo) is always half of the execution time that I measure with gettimeofday on the host. It can't be setup time, because it is always half of the host-measured time rather than a constant offset. For the rest of the time the GPU seems to be idle.
I transfer the data before measuring (with a blocking clEnqueueWriteBuffer), so data transfer shouldn't be the reason. I tried several kernels, and all of them show the same strange behavior. When I run the kernel on the CPU, the profiling time is equal to the gettimeofday time (no problem there). I tried clFinish, clWaitForEvents, and waiting on the host while checking with clGetEventInfo, all with the same result.
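To make the comparison above concrete, here is a minimal sketch of how the host time and the event timestamps could be broken down side by side. It assumes the four values were already read with clGetEventProfilingInfo (CL_PROFILING_COMMAND_QUEUED, _SUBMIT, _START, _END) from a queue created with CL_QUEUE_PROFILING_ENABLE. The names event_times, event_breakdown, break_down_event and host_elapsed_ns are hypothetical helpers for illustration, not part of OpenCL, and the sketch itself contains no OpenCL calls.

```c
#include <stdint.h>
#include <sys/time.h>

/* Hypothetical container for the four timestamps (in nanoseconds) that
 * clGetEventProfilingInfo reports for a single command. */
typedef struct {
    uint64_t queued;  /* CL_PROFILING_COMMAND_QUEUED */
    uint64_t submit;  /* CL_PROFILING_COMMAND_SUBMIT */
    uint64_t start;   /* CL_PROFILING_COMMAND_START  */
    uint64_t end;     /* CL_PROFILING_COMMAND_END    */
} event_times;

/* Hypothetical result type: the phases between those timestamps. */
typedef struct {
    uint64_t queue_ns;   /* queued -> submit: waiting in the host-side queue */
    uint64_t launch_ns;  /* submit -> start:  submitted but not yet running  */
    uint64_t exec_ns;    /* start  -> end:    actual execution on the device */
    uint64_t total_ns;   /* queued -> end:    everything the event covers    */
} event_breakdown;

/* Splits the event into phases. Returns 0 on success, or -1 if the
 * timestamps are not in non-decreasing order. */
int break_down_event(const event_times *t, event_breakdown *out)
{
    if (t->queued > t->submit || t->submit > t->start || t->start > t->end)
        return -1;
    out->queue_ns  = t->submit - t->queued;
    out->launch_ns = t->start  - t->submit;
    out->exec_ns   = t->end    - t->start;
    out->total_ns  = t->end    - t->queued;
    return 0;
}

/* Host wall-clock difference in nanoseconds between two gettimeofday
 * samples, so it can be compared directly with total_ns and exec_ns. */
int64_t host_elapsed_ns(const struct timeval *begin, const struct timeval *end)
{
    return (int64_t)(end->tv_sec - begin->tv_sec) * 1000000000LL
         + (int64_t)(end->tv_usec - begin->tv_usec) * 1000LL;
}
```

If the host time is close to total_ns while exec_ns is only about half of it, the missing half would sit in queue_ns or launch_ns, which would show whether the delay happens before or after the command is submitted to the device.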
I'll attach a screenshot of my recent tests with CodeXL. The kernel is called "nest". Interestingly, the idle time always comes before the actual execution, and it always lasts exactly as long as the execution itself. That is really strange.
I hope someone knows a solution or a workaround.
I tried that, with some strange results.
Attached are two screenshots: one where I execute the same kernel twice before calling clWaitForEvents, and one where I call the kernel, call clWaitForEvents, call the kernel again, and wait again.
The first case is especially strange: the idle time before execution is exactly as long as the two kernel executions that follow it.
But I don't have two threads on the host. I call the two kernels one after another.
And I don't think it's a bug in CodeXL (at least not only there), because I see the same behavior in the times I get from clGetEventProfilingInfo.