I have a kernel that runs for about 10 seconds on the GPU and about 25 seconds on the CPU. When the kernel runs on the GPU, 'top' shows my process pegged at 100% CPU the whole time it is waiting in clWaitForEvents().
When I run the same kernel on the CPU, the usage pegs at 200% (both cores at 100%) while waiting at the clWaitForEvents() function.
Is this something that just needs to be optimized at some point, or is there a better way to wait for the kernel to complete without the CPU overhead when using clWaitForEvents()? Is it just spinning? Thanks!
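
For illustration, here is a minimal sketch of one possible alternative, assuming the CPU load comes from the OpenCL implementation busy-waiting inside clWaitForEvents(): poll for completion and sleep between polls instead of blocking. The helper wait_with_sleep(), the is_done_fn callback type, and countdown_done() are hypothetical names made up for this sketch, not part of OpenCL. The OpenCL hookup is shown only in a comment so the code compiles without the OpenCL headers, and nanosleep() assumes a POSIX system.

```c
#define _POSIX_C_SOURCE 199309L
#include <time.h>

/* Hypothetical callback type: returns nonzero once the work has finished. */
typedef int (*is_done_fn)(void *ctx);

/*
 * Poll is_done(ctx), sleeping interval_ms milliseconds between checks so the
 * waiting thread yields the CPU instead of spinning.
 * Returns the number of polls it took, or -1 if max_polls was reached first.
 */
static int wait_with_sleep(is_done_fn is_done, void *ctx,
                           long interval_ms, int max_polls)
{
    struct timespec delay;
    delay.tv_sec = interval_ms / 1000;
    delay.tv_nsec = (interval_ms % 1000) * 1000000L;

    for (int polls = 1; polls <= max_polls; ++polls) {
        if (is_done(ctx))
            return polls;
        nanosleep(&delay, NULL);
    }
    return -1;
}

/*
 * With OpenCL, is_done could wrap a status query on the kernel's event
 * (assumption: the queue was flushed with clFlush() after the enqueue):
 *
 *   cl_int status;
 *   clGetEventInfo(ev, CL_EVENT_COMMAND_EXECUTION_STATUS,
 *                  sizeof(status), &status, NULL);
 *   return status == CL_COMPLETE || status < 0;   // done or failed
 */

/* Stand-in for the status query: reports "done" after *ctx earlier polls. */
static int countdown_done(void *ctx)
{
    int *remaining = ctx;
    if (*remaining == 0)
        return 1;
    --*remaining;
    return 0;
}
```

Calling wait_with_sleep(countdown_done, &n, 1, 10) with n set to 3 returns on the fourth poll. The trade-off is latency: completion is noticed up to one sleep interval late, which is negligible for a 10-second kernel. This only helps the GPU case; when the kernel itself runs on the CPU, 200% usage on two cores is expected because the cores are doing the actual work.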