Since Catalyst 13.4 and the 13.5 beta, my OpenCL GPU program consumes ~80% of one CPU core while in clFinish, waiting for a string of GPU kernels and a final clEnqueueReadBuffer. My main thread looks like this
and is using 0.01% CPU.
However, there is another thread:
that is using ~19% of total CPU (76% of one core). The upper part of its stack changes between samples, so it is not stuck in clIcdGetPlatformIDsKHR.
When I run a CPU-hungry program that consumes almost all CPU and starves my program, this thread's CPU load drops back to almost nothing, but the GPU is no longer fed well: GPU load jumps between 70% and 98%, whereas it would normally be pegged at 100%.
After rolling back to Catalyst 13.3, the program's total CPU load is ~0.1-0.3%, and running a CPU hog has almost no effect on my program.
Is there anything special to be done on the newer drivers to make them leave the CPU alone? Is there any setting that restores the CPU behavior of the previous drivers?
My environment: HD 5770 + Phenom II X4 955, Win7-64. I have received reports that the same happens on an APU with the integrated 6550D (also Win7-64).
Note: Making the final clEnqueueReadBuffer a blocking read, instead of following it with a final clFinish, does not change the CPU load.
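One workaround that might be worth trying, and that I would rather avoid, is to not block in the driver at all: call clFlush, then poll the completion event of the final clEnqueueReadBuffer and sleep between polls. Below is a rough sketch of that idea only, not what my program currently does. The names `wait_with_sleep`, `is_done_fn` and `fake_done_after` are made up for illustration, and the OpenCL query is hidden behind a callback so the snippet compiles without the SDK; whether this actually avoids the extra driver thread is an open question.

```c
#ifndef _WIN32
#define _POSIX_C_SOURCE 199309L  /* for nanosleep on POSIX systems */
#endif

#include <assert.h>
#include <stddef.h>

#ifdef _WIN32
#include <windows.h>
static void sleep_ms(unsigned ms) { Sleep(ms); }
#else
#include <time.h>
static void sleep_ms(unsigned ms)
{
    struct timespec ts;
    ts.tv_sec = ms / 1000;
    ts.tv_nsec = (long)(ms % 1000) * 1000000L;
    nanosleep(&ts, NULL);
}
#endif

/* Hypothetical callback: returns nonzero once the queued work is complete.
 * In an OpenCL program it would wrap clGetEventInfo() with
 * CL_EVENT_COMMAND_EXECUTION_STATUS on the event returned by the final
 * (non-blocking) clEnqueueReadBuffer and compare the status to CL_COMPLETE.
 * clFlush() should be called once before polling so the commands are
 * actually submitted to the device. */
typedef int (*is_done_fn)(void *ctx);

/* Poll is_done() up to max_polls times, sleeping sleep_interval_ms between
 * polls. Returns the number of polls needed, or -1 if the limit was hit. */
static int wait_with_sleep(is_done_fn is_done, void *ctx,
                           unsigned sleep_interval_ms, int max_polls)
{
    assert(is_done != NULL);
    for (int polls = 1; polls <= max_polls; ++polls) {
        if (is_done(ctx))
            return polls;
        sleep_ms(sleep_interval_ms);
    }
    return -1;
}

/* Stand-in for the GPU, for trying the helper without OpenCL:
 * ctx points to an int counter; completion is reported once it reaches 0. */
static int fake_done_after(void *ctx)
{
    int *remaining = (int *)ctx;
    if (*remaining > 0)
        --*remaining;
    return *remaining == 0;
}
```

The obvious downside is the trade-off in the sleep interval: a long sleep adds latency after the GPU finishes and may leave it idle, a short one burns CPU again, so this is a stopgap rather than a replacement for a clFinish that behaves like it did on 13.3.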
... and could someone please give me a hint on how to get proper code formatting in this forum? Thanks a lot!