I created a simple CL/GL sharing app, and it didn't seem to be matching the performance of the demo provided in the SDK.
After some testing, I realised that the cost of acquiring and releasing the GL buffer object depended heavily on when that buffer object was created.
If the GL buffer object was created before the CL context, performance was MUCH lower: I got 40-50 FPS, versus 140-150 FPS when the buffer object was created after the CL context.
// ------- slow ---------------
glGenBuffers(1, &buf);
glBindBuffer(...);
glBufferData(..., GL_DYNAMIC_DRAW);
context = clCreateContext(...); // from GL Context
clCreateFromGLBuffer(context, CL_MEM_WRITE_ONLY, ...);

// ------- fast ---------------
context = clCreateContext(...); // from GL Context
glGenBuffers(1, &buf);
glBindBuffer(...);
glBufferData(..., GL_DYNAMIC_DRAW);
clCreateFromGLBuffer(context, CL_MEM_WRITE_ONLY, ...);
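
For anyone wanting to reproduce this, one way to isolate the acquire/release cost from the rest of the frame might be to time that pair on its own rather than relying on FPS. The following is only a minimal sketch, assuming a POSIX system with `clock_gettime`. The names `op_fn`, `time_op_ms` and `count_calls` are hypothetical helpers made up for illustration and are not part of the SDK demo or of OpenCL. The actual CL/GL calls are left in a comment because they need a live GL context and command queue.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L  /* for clock_gettime; assumes a POSIX system */
#endif
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical callback type: one iteration of the operation being timed. */
typedef void (*op_fn)(void *user);

/* Returns the mean wall-clock time per call of op, in milliseconds. */
static double time_op_ms(op_fn op, void *user, int iters)
{
    struct timespec t0, t1;
    assert(op != NULL && iters > 0);

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < iters; ++i)
        op(user);
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double total_ms = (double)(t1.tv_sec - t0.tv_sec) * 1e3
                    + (double)(t1.tv_nsec - t0.tv_nsec) / 1e6;
    return total_ms / iters;
}

/* Stand-in operation that only counts how often it was called.
 * In the real app the callback would instead do something like:
 *   glFinish();
 *   clEnqueueAcquireGLObjects(queue, 1, &mem, 0, NULL, NULL);
 *   clEnqueueReleaseGLObjects(queue, 1, &mem, 0, NULL, NULL);
 *   clFinish(queue);   // wait, so the measured time includes the GPU work
 * where queue and mem are the app's own command queue and shared buffer. */
static void count_calls(void *user)
{
    ++*(int *)user;
}
```

Calling `time_op_ms` once with the "slow" creation order and once with the "fast" order, using the same iteration count, should show whether the gap really sits in the acquire/release pair. The `clFinish` in the callback matters here: without it the enqueue calls may return before the work is done, and the timing would mostly measure submission overhead.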