

clEnqueueCopyBuffer performance bug - 4bytes in 8ms


While investigating performance variance for one of my clients, I found that the root cause appears to be a very large variance and slowdown in clEnqueueCopyBuffer.

Attached is a screenshot where copying 4 bytes on the GPU takes 8 ms. That's obviously a performance bug.

And it's not happening at the beginning of the processing, so it's not related to any kind of warm-up.

AMD Bug.jpg


Tomer Gal, CTO at OpTeamizer

3 Replies

Is this just the first access to these 2 buffers? What do the rest of the CopyBuffer calls look like?

CreateBuffer seems to allocate lazily, so there could be some buffer initialization going on at first use.

Finally, a question I have had throughout these performance threads: could any other processes be running at the same time? If this is a display card, could display rendering be responsible for some of these performance variations?


Hi Nibal,

When we create the buffers we also enqueue a write to them to make sure they are actually created before we start using them, so this is not the case of lazy initialization.

As for other processes running, that's not the case. It's an 8-core machine, and the only thing running is the process with the OpenCL host code; no other time-consuming process is running.

As for a display card, that's also not the issue. The display is using the Intel iGPU while the AMD GPU is used solely for OpenCL compute.


Tomer Gal


Hi Tomer,

Thanks for the clarifications. Nicely controlled environment.

Have you verified the profiling from your host side? It doesn't have CodeXL's resolution, but the time needed to complete CopyBuffer should equal the sum of the queueing and execution times in CodeXL's profiler.
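A minimal sketch of that cross-check, assuming the command queue was created with CL_QUEUE_PROFILING_ENABLE and that you read the four nanosecond timestamps from the copy's event with clGetEventProfilingInfo (CL_PROFILING_COMMAND_QUEUED, _SUBMIT, _START, _END). The struct and function names are made up for illustration; only the arithmetic is shown, so it builds without the OpenCL headers.

```c
#include <stdint.h>

/* Raw event timestamps in nanoseconds, one per CL_PROFILING_COMMAND_* query.
   Hypothetical struct: fill it from clGetEventProfilingInfo in your host code. */
typedef struct {
    uint64_t queued;  /* CL_PROFILING_COMMAND_QUEUED */
    uint64_t submit;  /* CL_PROFILING_COMMAND_SUBMIT */
    uint64_t start;   /* CL_PROFILING_COMMAND_START  */
    uint64_t end;     /* CL_PROFILING_COMMAND_END    */
} copy_timestamps;

/* The same command split into phases, in milliseconds. */
typedef struct {
    double queue_ms;   /* QUEUED -> SUBMIT: waiting in the host-side queue */
    double submit_ms;  /* SUBMIT -> START: waiting on the device           */
    double exec_ms;    /* START  -> END: the copy itself                   */
    double total_ms;   /* QUEUED -> END                                    */
} copy_breakdown;

/* Signed difference in ms; a negative value means the timestamps are out of
   order, which points at the profiling data rather than at the copy. */
static double delta_ms(uint64_t earlier, uint64_t later) {
    if (later >= earlier) {
        return (double)(later - earlier) / 1e6;
    }
    return -((double)(earlier - later) / 1e6);
}

static copy_breakdown break_down_copy(copy_timestamps t) {
    copy_breakdown b;
    b.queue_ms  = delta_ms(t.queued, t.submit);
    b.submit_ms = delta_ms(t.submit, t.start);
    b.exec_ms   = delta_ms(t.start, t.end);
    b.total_ms  = delta_ms(t.queued, t.end);
    return b;
}
```

If the 8 ms shows up in queue_ms or submit_ms rather than exec_ms, the copy itself is not what is slow. You can also compare total_ms against a host wall-clock measurement taken around the enqueue plus clWaitForEvents.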

I have also seen weird execution times in my programs under the CodeXL profiler, even violating single-queue ordering.

I have even seen an event complete before its kernel had started, so I assumed that this is a profiler issue.

(CodeXL is *very* buggy. Have given up raising tickets about it :-()

One last question: Is that the only CopyBuffer that looks like that, or are there more?
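To answer that without eyeballing the timeline, something like the following could scan the per-copy durations you collect on the host. This is only a sketch: the function name and the threshold parameter are hypothetical, and the threshold is yours to choose.

```c
#include <stddef.h>

/* Counts copies whose duration exceeds threshold_ms.
   If worst_index is not NULL and n > 0, it receives the index of the
   slowest copy (the first one in case of a tie). */
static size_t count_slow_copies(const double *durations_ms, size_t n,
                                double threshold_ms, size_t *worst_index) {
    size_t slow = 0;
    size_t worst = 0;
    for (size_t i = 0; i < n; ++i) {
        if (durations_ms[i] > threshold_ms) {
            ++slow;
        }
        if (durations_ms[i] > durations_ms[worst]) {
            worst = i;
        }
    }
    if (worst_index != NULL && n > 0) {
        *worst_index = worst;
    }
    return slow;
}
```

A single slow copy would suggest a one-off stall, while many of them would suggest something systematic.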