While profiling my OpenCL application (linked against OpenCL.lib from APP SDK v2.9) on an R9 290X under Windows 7 x64, I found that the GPU application trace timeline in CodeXL 1.3.4590.0 looked very strange. Besides the 3 command queues in my single context, it showed many other contexts with weird numbers, some of them even with queues (also weirdly numbered). On the other hand, many of my OpenCL function calls were not marked in the timeline view at all, so the profile was of little use to me.
I've stripped my application down to the bare minimum that still makes CodeXL show this strange timeline. The minimum is a bit surreal: it still creates my single context with 3 queues, but then it only repeatedly makes one of the queues wait for another, selecting a different pair of queues in each iteration. No kernels are launched, no memory is allocated, and nothing is copied. Of course, my real application contains many kernels and memory transfers as well, but they are not needed to make CodeXL show the weird timeline. The minimal code and a screenshot of the timeline are attached.
Can anyone reproduce this behavior? If so, is it a problem with CodeXL, with the AMD OpenCL runtime, with my PC, or with my application? My application uses the clEnqueueMarker-clEnqueueWaitForEvents-clReleaseEvent idiom to make one queue wait for another without blocking the calling CPU thread. Is this correct usage?