I am seeing unexpected behavior when using clWaitForEvents. This may be related to using CodeXL to examine my code, but I wanted to seek some advice. As I understand it, clWaitForEvents should return once the events passed to it are complete. In my code I run a set of OpenCL kernels in a loop. I create an event for the final kernel in the loop and then queue a number of additional sets of kernels (in this case about 10 sets in total, with the event on the last kernel of the first set). However, as can be seen in the following screen grabs from CodeXL, clWaitForEvents behaves the same as clFinish and waits for all the queued kernels to complete.
Here I have highlighted the kernel on which the event is set. CodeXL shows it completing a long time before the clWaitForEvents call on the event attached to it.
Here we see the call to clWaitForEvents for the event shown in the previous image. The call seems to behave like clFinish and wait for all queued commands to complete.
Is this the expected behavior, an artifact of viewing the timeline in CodeXL, or something else?