6 Replies Latest reply on Jul 3, 2013 5:50 AM by himanshu.gautam

    OpenGL / OpenCL interop with shared contexts and multithreading



      I am working on a project that uses OpenCL / OpenGL interoperability and multithreading. Thread1 only renders a VBO, and Thread2 runs an OpenCL kernel that processes the geometry stored in that VBO. The kernel is called several times, and I want to visualize the processed mesh after each iteration. I therefore need two things: the OpenGL contexts of Thread1 and Thread2 must share the VBO, and the OpenCL context must share with an OpenGL context. The first can be achieved with wglShareLists(hlrc1, hlrc2). The second requires creating the OpenCL context from a sharing OpenGL context, and for this I have to use the context from Thread2, the processing thread.

      As far as I understand it, the order of the commands should be as follows:

      // create contexts

      hlrc1 = wglCreateContext(m_hdc);
      hlrc2 = wglCreateContext(m_hdc);

      // share resources while they are not set as current for each thread

      wglShareLists(hlrc1, hlrc2); 

      // make hlrc1 current in thread1 and hlrc2 in thread2

      wglMakeCurrent(m_hdc, hlrc1); // called in thread1
      wglMakeCurrent(m_hdc, hlrc2); // called in thread2

      // and now set shared context for openCL

      cl_context_properties properties[] = {
          CL_GL_CONTEXT_KHR,   (cl_context_properties)wglGetCurrentContext(), // WGL Context
          CL_WGL_HDC_KHR,      (cl_context_properties)wglGetCurrentDC(),      // WGL HDC
          CL_CONTEXT_PLATFORM, (cl_context_properties)cpPlatform,             // OpenCL platform
          0
      };

      cl_device_id devices[32];
      size_t sizedev;
      clGetGLContextInfoKHR_fn clGetGLContextInfo = (clGetGLContextInfoKHR_fn)clGetExtensionFunctionAddressForPlatform(cpPlatform, "clGetGLContextInfoKHR");
      clGetGLContextInfo(properties, CL_DEVICES_FOR_GL_CONTEXT_KHR, 32 * sizeof(cl_device_id), devices, &sizedev);

      cl_uint countdev = (cl_uint)(sizedev / sizeof(cl_device_id));
      context = clCreateContext(properties, countdev, devices, NULL, 0, 0);

      // and then the shared interop memory object is created and passed as kernel argument in openCL

      cl_mem vbo_cl = clCreateFromGLBuffer(context, CL_MEM_READ_WRITE, vboID, NULL); 

      And here the trouble starts. If wglShareLists(hlrc1, hlrc2) is called, the shared VBO contains only zeroes instead of vertex positions. If wglShareLists(hlrc1, hlrc2) is skipped, the VBO has valid values and the OpenGL / OpenCL interop works fine, but I can't render the process, because resources can't be shared between the OpenGL contexts in Thread1 and Thread2.

      Has anyone tried something like this? Is it possible, or am I doing something wrong?

        • OpenGL / OpenCL interop with shared contexts and multithreading

          I had a project where I used OpenGL/OpenCL interoperability, but I took a different approach. I created the main OpenGL context and created the OpenCL context from it. Then I created a shared OpenGL context, which was used for rendering. This worked fine, but without threads. So try switching the OpenGL contexts: use the second context for drawing and the first for OpenCL sharing.

            • Re: OpenGL / OpenCL interop with shared contexts and multithreading

              Hi mado,


              I created a simple test by modifying the SimpleGL sample in our SDK, which shows an example of OCL-GL interop.  I added a new thread just to create an extra GL context and to call wglShareLists as you described.  The main thread is unmodified and it was able to run the interop example correctly, so it doesn't seem that wglShareLists affected the interop.


              Do you mind providing a simple test case to show us the usage and to repro the problem?


                • Re: OpenGL / OpenCL interop with shared contexts and multithreading

                  Hi Siu, thanks for answering,

                  I have created the simplest possible test case for this.


                  I didn't know how to upload it directly to the forum, so I uploaded it here: http://www.megafileupload.com/en/file/392555/GLCLInterop-zip.html

                  I tried it, it works.


                  It is a Microsoft Visual Studio 2012 solution. The program creates a simple cube geometry in a VBO in the first thread and then starts a second thread that runs an OpenCL kernel. The kernel only prints the VBO's values.


                  There are two lines with      g_engine->createVBO();

                  If the first one is uncommented (in Form1.h), it doesn't work: the values are zeroes. In that case the VBO is created in a different thread from the one that calls OpenCL.

                  If the second one is uncommented (in threadHolder.cpp), it works correctly.


                  Thanks for any help with this.

                    • Re: OpenGL / OpenCL interop with shared contexts and multithreading

                      Hi mado,


                      I think your program is valid, and this looks like a problem in our driver.  I'll forward your test case and report this problem to our engineering team.  Meanwhile, as a workaround, you'll probably have to create your VBO in the compute thread instead of the rendering thread.  Thank you for reporting the problem and for creating the test case.

                        • Re: OpenGL / OpenCL interop with shared contexts and multithreading

                          Ok, thank you.

                          If there is any news, please update this thread to let me know.

                            • Re: Re: OpenGL / OpenCL interop with shared contexts and multithreading

                              The application is responsible for calling glFinish in GL thread #1 before CL thread #2 accesses the resource. From the spec:


                              "Synchronizing OpenCL and OpenGL Access to Shared Objects

                              In order to ensure data integrity, the application is responsible for synchronizing access to shared CL/GL objects by their respective APIs. Failure to provide such synchronization may result in race conditions and other undefined behavior including non-portability between implementations.

                              Prior to calling clEnqueueAcquireGLObjects, the application must ensure that any pending GL operations which access the objects specified in mem_objects have completed. This may be accomplished portably by issuing and waiting for completion of a glFinish command on all GL contexts with pending references to these objects. Implementations may offer more efficient synchronization methods; for example on some platforms calling glFlush may be sufficient, or synchronization may be implicit within a thread, or there may be vendor-specific extensions that enable placing a fence in the GL command stream and waiting for completion of that fence in the CL command queue. Note that no synchronization methods other than glFinish are portable between OpenGL implementations at this time."


                              So it's an app issue.
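
For illustration, the ordering that the quoted passage asks for can be sketched with a small ownership baton shared by the two threads. This is only a sketch under assumptions, not the code from the test case: InteropBaton, Owner, and runHandoffDemo are hypothetical names, and no GL or CL function is actually called. The logged strings mark where glFinish, clEnqueueAcquireGLObjects, the kernel launch, and clEnqueueReleaseGLObjects followed by clFinish would go in a real application.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical helper: which API may currently touch the shared VBO.
enum class Owner { GL, CL };

// Hypothetical baton that serialises GL and CL access to one shared buffer.
class InteropBaton {
public:
    // Blocks until `who` owns the buffer, runs `work` with the lock held,
    // then passes ownership to `next` and wakes the other thread.
    void runAs(Owner who, Owner next, const std::function<void()>& work) {
        std::unique_lock<std::mutex> lock(mutex_);
        changed_.wait(lock, [&] { return owner_ == who; });
        work();
        owner_ = next;
        changed_.notify_all();
    }

private:
    std::mutex mutex_;
    std::condition_variable changed_;
    Owner owner_ = Owner::GL;  // the render thread fills the VBO first
};

// Runs `iterations` render/compute round trips and returns the event order.
// The strings only mark where the real calls would go.
std::vector<std::string> runHandoffDemo(int iterations) {
    assert(iterations >= 0);
    InteropBaton baton;
    std::vector<std::string> log;  // only written while the baton lock is held

    std::thread render([&] {
        for (int i = 0; i < iterations; ++i) {
            // Real code: finish all pending GL work on the VBO with glFinish().
            baton.runAs(Owner::GL, Owner::CL, [&] { log.push_back("glFinish"); });
            // Real code: draw the VBO once CL has handed it back.
            baton.runAs(Owner::GL, Owner::GL, [&] { log.push_back("draw"); });
        }
    });

    std::thread compute([&] {
        for (int i = 0; i < iterations; ++i) {
            baton.runAs(Owner::CL, Owner::GL, [&] {
                log.push_back("acquire");  // clEnqueueAcquireGLObjects
                log.push_back("kernel");   // clEnqueueNDRangeKernel
                log.push_back("release");  // clEnqueueReleaseGLObjects, then clFinish
            });
        }
    });

    render.join();
    compute.join();
    return log;
}
```

In a real application each callback would contain the actual calls, with hlrc1 current in the render thread and hlrc2 current in the compute thread. The baton holds its lock while the callback runs, which keeps the sketch simple but also means the render thread cannot do unrelated work during the kernel; a finer-grained scheme is possible if that matters.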