
CL-GL Interop fastest way to synchronize?

Question asked by george72 on Jun 18, 2018
Latest reply on Aug 23, 2018 by dipak

We are using OpenCL on Windows as part of a proprietary game engine, where we use the CL-GL interop functionality to communicate between the simulation and the rendering engine. Our core loop currently executes the following steps:


  1. Acquire GL objects
  2. Run simulation using OpenCL
  3. Release GL objects
  4. Ensure OpenCL operations are finished
  5. Render using OpenGL
  6. Swap Buffers
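
For reference, the ordering of the steps above can be sketched in C as follows. This is only a minimal sketch, not our engine code: `FrameHooks`, `run_frame`, `Trace` and the `demo_*` functions are hypothetical names made up for illustration. The comments name the OpenCL/OpenGL calls that each hook would most likely wrap in a real engine; the demo hooks here only record the order in which they are called.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical hook table. In a real engine each hook would wrap the
   API call named in its comment; here they are plain callbacks so the
   ordering can be shown without a GPU context. */
typedef struct FrameHooks {
    void (*acquire_gl_objects)(void *user); /* e.g. clEnqueueAcquireGLObjects */
    void (*run_simulation)(void *user);     /* e.g. clEnqueueNDRangeKernel */
    void (*release_gl_objects)(void *user); /* e.g. clEnqueueReleaseGLObjects */
    void (*wait_for_cl)(void *user);        /* e.g. clFinish: the CPU blocks here */
    void (*render)(void *user);             /* OpenGL draw calls */
    void (*swap_buffers)(void *user);       /* e.g. SwapBuffers on Windows */
} FrameHooks;

/* One iteration of the core loop, in the order listed above. */
void run_frame(const FrameHooks *hooks, void *user)
{
    assert(hooks != NULL);
    hooks->acquire_gl_objects(user);  /* step 1 */
    hooks->run_simulation(user);      /* step 2 */
    hooks->release_gl_objects(user);  /* step 3 */
    hooks->wait_for_cl(user);         /* step 4: explicit CPU-side sync */
    hooks->render(user);              /* step 5 */
    hooks->swap_buffers(user);        /* step 6 */
}

/* Demo hooks: each one appends a single letter to a trace buffer. */
typedef struct Trace {
    char   steps[8];
    size_t len;
} Trace;

static void trace_push(void *user, char c)
{
    Trace *t = (Trace *)user;
    if (t->len + 1 < sizeof t->steps) {
        t->steps[t->len++] = c;
        t->steps[t->len] = '\0';
    }
}

static void demo_acquire(void *u)  { trace_push(u, 'A'); }
static void demo_simulate(void *u) { trace_push(u, 'S'); }
static void demo_release(void *u)  { trace_push(u, 'R'); }
static void demo_wait(void *u)     { trace_push(u, 'W'); }
static void demo_render(void *u)   { trace_push(u, 'D'); }
static void demo_swap(void *u)     { trace_push(u, 'P'); }

const FrameHooks demo_hooks = {
    demo_acquire, demo_simulate, demo_release,
    demo_wait, demo_render, demo_swap
};
```

Calling `run_frame(&demo_hooks, &trace)` on a zero-initialized `Trace` records one letter per step. The point of the sketch is that `wait_for_cl` sits on the CPU between the last OpenCL command and the first OpenGL command, so no rendering work can be submitted until the simulation has finished.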


Currently, our main bottleneck is step 4, "Ensure OpenCL operations are finished". Unlike NVIDIA's drivers, AMD's do not support implicit OpenCL/OpenGL synchronization, so we have to synchronize explicitly: the CPU waits for the OpenCL kernels to finish before we start submitting our rendering commands. This becomes a severe performance bottleneck as the load on the GPU increases, with the wait taking an unacceptable 8 to 20 ms on a Radeon R9 290X.


The "official" advice is to use OpenGL's ARB_cl_event extension, but that extension is not supported by any driver we tested. Is there some other (possibly undocumented), faster way to achieve this synchronization on AMD cards?
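
For anyone who wants to repeat the extension check on their own driver, here is a minimal sketch of a whole-token lookup in a space-separated extension string. `has_extension` is a hypothetical helper, not part of any API. It assumes the list comes from something like `glGetString(GL_EXTENSIONS)` on a compatibility context; on a core profile the names are returned one at a time by `glGetStringi` and can be compared directly.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if `name` appears as a complete token in the
   space-separated list `ext_list`, 0 otherwise. A plain strstr() is
   not enough, because one extension name can be a prefix of another. */
int has_extension(const char *ext_list, const char *name)
{
    size_t name_len;
    const char *p;

    /* An empty or missing name is a caller bug, not a driver condition. */
    assert(name != NULL && name[0] != '\0');

    if (ext_list == NULL)
        return 0;

    name_len = strlen(name);
    p = ext_list;
    while ((p = strstr(p, name)) != NULL) {
        int starts_token = (p == ext_list) || (p[-1] == ' ');
        int ends_token   = (p[name_len] == ' ') || (p[name_len] == '\0');
        if (starts_token && ends_token)
            return 1;
        p += 1; /* keep searching after this partial match */
    }
    return 0;
}
```

With this helper, the check for the extension mentioned above would be `has_extension(list, "GL_ARB_cl_event")`, where `list` is whatever extension string the driver reports.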