11 Replies Latest reply on Sep 15, 2011 7:06 AM by notzed

    acquire framebuffer

    Meteorhead
      howto

      Hi!

      My question is rather simple, yet the answer might not be. I would like to know whether there is a way to acquire the actual GL buffer being drawn onto the screen. (In the case of double buffering this would be the previously completed image, currently being displayed on the screen, not the one that is being assembled.)

      Is there any way to pass this buffer over to OpenCL? The interop sample shows how to share buffers created from host data, but what are the means to acquire a changing buffer like a double-buffered framebuffer?

      If I simply create a GL buffer object with glBindBuffers(GL_FRAME_BUFFER, &frameBuffObj); and call clCreateFromGLBuffer(...); what will happen after glutSwapBuffers(); is called?

      Could anyone tell me what would be the way to approach this problem?

        • acquire framebuffer
          laobrasuca

          I don't know precisely how to do it, but you should have a look at functions like glGenFramebuffers, glBindFramebuffer, glGetFragDataIndex, glGetFramebufferAttachmentParameter. Generic buffer functions like glBindBuffer won't work, because GL_FRAMEBUFFER is not a valid target for them. With double buffering, you're looking for GL_FRONT or GL_BACK (the back one is where you're composing the scene; the front one is what is currently displayed on the screen). On the OpenCL side you should maybe use clCreateFromGLTexture2D instead of clCreateFromGLBuffer, since framebuffers are more like textures than generic buffers. That's all I can tell you, unfortunately, since I've never manipulated framebuffers myself.

            • acquire framebuffer
              debdatta.basu

              Dear Meteor,

              You would ideally do this using OpenGL framebuffer objects (FBOs). There is no clean way to access the window-system-provided front and back buffers, as far as I know. You could create a custom FBO, attach textures to it, and create OpenCL images from those textures. You render to the FBO using OpenGL, then use the OpenCL image to do compute work. You have to synchronize the CL/GL calls using either glFinish or, better, sync objects.
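              A minimal sketch of that approach might look like the following. This assumes a current OpenGL context, an OpenCL context created with GL sharing enabled, and FBO support (GL 3.0 or the EXT extension); names like create_shared_fbo and per_frame are illustrative, and error checking is omitted.

```c
/* Sketch only: assumes a current OpenGL context and an OpenCL context
 * created with GL sharing enabled (CL_GL_CONTEXT_KHR plus the
 * platform-specific WGL/GLX/CGL property). Error checks omitted. */
#include <GL/gl.h>
#include <CL/cl_gl.h>

static GLuint tex, fbo;
static cl_mem clImage;

void create_shared_fbo(cl_context context, int w, int h)
{
    /* Colour texture that will back the FBO */
    glGenTextures(1, &tex);
    glBindTexture(GL_TEXTURE_2D, tex);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA8, w, h, 0,
                 GL_RGBA, GL_UNSIGNED_BYTE, NULL);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);

    /* Custom framebuffer object with the texture as colour attachment */
    glGenFramebuffers(1, &fbo);
    glBindFramebuffer(GL_FRAMEBUFFER, fbo);
    glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0,
                           GL_TEXTURE_2D, tex, 0);
    /* glCheckFramebufferStatus(GL_FRAMEBUFFER) should report
       GL_FRAMEBUFFER_COMPLETE at this point */

    /* Share the texture with OpenCL as a 2D image (OpenCL 1.1 API) */
    cl_int err;
    clImage = clCreateFromGLTexture2D(context, CL_MEM_READ_WRITE,
                                      GL_TEXTURE_2D, 0, tex, &err);
}

void per_frame(cl_command_queue queue)
{
    glBindFramebuffer(GL_FRAMEBUFFER, fbo);
    /* ... draw the scene into the FBO here ... */
    glFinish();   /* crude sync: GL must finish before CL acquires */

    clEnqueueAcquireGLObjects(queue, 1, &clImage, 0, NULL, NULL);
    /* ... enqueue OpenCL kernels that read/write clImage ... */
    clEnqueueReleaseGLObjects(queue, 1, &clImage, 0, NULL, NULL);
    clFinish(queue);  /* CL must finish before GL reuses the texture */
}
```

              The glFinish/clFinish pair is the portable but heavyweight synchronization; where cl_khr_gl_event is available, GL sync objects can replace it, as mentioned above.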

              Hope that helps.

              Debdatta Basu.

              • acquire framebuffer
                debdatta.basu

                Also,

                you cannot attach GL buffer objects to a framebuffer, and there is no API like glBindBuffers(GL_FRAME_BUFFER, &frameBuffObj);

                The way you do this is by creating a custom FBO and attaching TEXTURES to it. You should read some good framebuffer object tutorials to get the hang of it. Google!

                Debdatta Basu

                  • acquire framebuffer
                    laobrasuca

                    But check whether you can do your processing with the OpenGL Shading Language (GLSL). There you have what are called fragment shader programs, which are a sort of kernel in which you can change the values of pixels (they run on the same processors that OpenCL uses). GLSL is not as flexible as OpenCL, but if you can do your image processing with it, you will probably reach much better performance than with GL/CL interop.
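                    For illustration only, a GLSL-1.20-era fragment shader doing a simple per-pixel operation (greyscale conversion, as a stand-in for whatever processing is actually needed) could be compiled through the GL 2.x shader API like this; build_shader is a hypothetical helper name.

```c
/* Hypothetical sketch: compile a trivial greyscale fragment shader.
 * Assumes a current GL 2.x context; error/log checks omitted. */
#include <GL/gl.h>

static const char *fragSrc =
    "uniform sampler2D tex;\n"
    "void main() {\n"
    "    vec4 c = texture2D(tex, gl_TexCoord[0].st);\n"
    "    float g = dot(c.rgb, vec3(0.299, 0.587, 0.114));\n"
    "    gl_FragColor = vec4(vec3(g), 1.0);\n"
    "}\n";

GLuint build_shader(void)
{
    GLuint fs = glCreateShader(GL_FRAGMENT_SHADER);
    glShaderSource(fs, 1, &fragSrc, NULL);
    glCompileShader(fs);

    GLuint prog = glCreateProgram();
    glAttachShader(prog, fs);
    glLinkProgram(prog);
    return prog;   /* glUseProgram(prog) makes it run per fragment */
}
```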

                      • acquire framebuffer
                        Meteorhead

                        I'm afraid it would be far too difficult to achieve what I want with fragment shaders. I wish to encode an H.264 movie out of what is rendered by OGL. There are open-source OCL-accelerated x264 encoders, but they all use input located on the HDD. I find it very inconvenient (and unnecessary) to move data even as far as RAM (not to mention the HDD) when the data is already present on the GPU. This is required because not all simulations that use OGL visualization can actually run in real time; nonetheless, it is nice to use out-of-the-box routines to do the drawing, and to use this as the basis of accelerated encoding.

                        It takes more and more time to analyze data that result from simulations, and in many cases animations are an intuitive way of checking the validity of results, if one can come up with a reasonable visualization. If there is a powerful set of tools for visualizing in a GPU-accelerated manner (OGL), why not use it?

                        I wish to create simulations that run on the GPU via OCL, render their output via OGL, then take this complete picture and encode it on the fly with the GPU (or even with a CPU device running in parallel).

                        It is convenient that OpenGL does most of the hard work of 2D/3D rendering.

                        I would suggest to AMD that it would be extremely useful if the OpenVideo Decode API also defined a place for encoding hardware. Vendors that have hardware-accelerated encoders could link to them, and those that don't could implement powerful algorithms tuned to their hardware.

                        Anyhow, these are the plans. If anyone has constructive ideas or criticism, I will gladly listen.

                          • acquire framebuffer
                            laobrasuca

                            ok, ok! Well, have a look here.

                            http://developer.apple.com/mac/library/samplecode/OpenCL_Procedural_Noise_Example/
                            http://developer.apple.com/mac/library/samplecode/OpenCL_Procedural_Geometric_Displacement_Example/

                            Old stuff, but they do some OpenGL texture processing via OpenCL. Now, what you need is to identify the framebuffer ID where you have your image.
                            • acquire framebuffer
                              notzed

                               

                              Originally posted by: Meteorhead

                               

                              I wish to create simulations that run on the GPU via OCL, render their output via OGL, then take this complete picture and encode it on the fly with the GPU (or even with a CPU device running in parallel).

                               

                              It is convenient that OpenGL does most of the hard work of 2D/3D rendering.

                               

                              You can't render directly to the screen: you just render to a texture and then use that in your OpenGL display callback.  This is very fast, and in any event can't be avoided.  It's probably just an artefact of the way OpenGL works and interacts with a multi-window operating system (i.e. the OS owns the framebuffer).
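                              The display side of that render-to-texture scheme can be sketched in era-appropriate fixed-function GL as below; "tex" stands for whatever texture the simulation rendered into, and an orthographic [-1,1] projection is assumed.

```c
/* Sketch: draw a processed texture as a full-screen quad in the GLUT
 * display callback. Assumes a [-1,1] orthographic projection and that
 * "tex" holds the frame rendered by the simulation. */
#include <GL/glut.h>

extern GLuint tex;   /* texture the simulation rendered into */

void display(void)
{
    glBindFramebuffer(GL_FRAMEBUFFER, 0);   /* back to the window's buffer */
    glClear(GL_COLOR_BUFFER_BIT);
    glEnable(GL_TEXTURE_2D);
    glBindTexture(GL_TEXTURE_2D, tex);
    glBegin(GL_QUADS);
    glTexCoord2f(0.0f, 0.0f); glVertex2f(-1.0f, -1.0f);
    glTexCoord2f(1.0f, 0.0f); glVertex2f( 1.0f, -1.0f);
    glTexCoord2f(1.0f, 1.0f); glVertex2f( 1.0f,  1.0f);
    glTexCoord2f(0.0f, 1.0f); glVertex2f(-1.0f,  1.0f);
    glEnd();
    glutSwapBuffers();
}
```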

                              FWIW I looked into this for my client's application, but it made such a minor difference to performance (even on a laptop being over-worked) that I dropped it.  And the non-OGL version is just using Swing for output(!!), which requires at least one more redundant on-CPU copy to move the data to managed memory from JOGL.

                              I was surprised, but maybe I shouldn't have been.  It's really just some extra async copies (DMA?) that delay the display of the information but shouldn't hold up anything else, assuming there's enough time budget for them to run.  Video display frame-rates are relatively low compared to the processing power and bandwidth of a modern system.  And there's no use trying to show more fps than the monitor can display, and for smooth-enough simulation animations you don't even need to do that (IMHO about 15fps is OK).

                              But you may notice more of a difference depending on how your application works and if your PCIe bus and CPU cores are already busy, and/or if you're working with handheld or low-end hardware (in which case you'll probably have to drop frames for display anyway).

                                • acquire framebuffer
                                  debdatta.basu

                                  @Notzed

                                  >> And the non OGL version is just using Swing for output(!!) which requires at least one more redundant on-CPU copy to move the data to managed memory from JOGL.....

                                   Well, you can use Swing for output without needing any copies. Simply create an OpenGL widget and render to that. The data is already on the GPU, so there's no need to copy it onto a CPU buffer only to copy it back for display.

                                   

                                  @Meteor

                                   For your purposes, render to an FBO, use that texture for compression, and then read back to the CPU. I haven't used the OpenVideo Decode thing though, so I can't help you with that.
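                                   That render/compress/read-back loop could be sketched like this, assuming the FBO's colour texture has already been shared with OpenCL as an image (the cl_khr_gl_sharing path discussed in this thread); grab_frame is an illustrative name and error checks are omitted.

```c
/* Sketch: hand the GL-rendered frame to CL, run compression kernels,
 * and read the result back to host memory for the encoder.
 * clImage is an OpenCL image created from the FBO's colour texture;
 * queue is a command queue on the same GPU. Error checks omitted. */
#include <stdlib.h>
#include <GL/gl.h>
#include <CL/cl_gl.h>

void grab_frame(cl_command_queue queue, cl_mem clImage,
                size_t w, size_t h, unsigned char *host /* w*h*4 bytes */)
{
    size_t origin[3] = {0, 0, 0};
    size_t region[3] = {w, h, 1};

    glFinish();   /* make sure GL has finished rendering the frame */
    clEnqueueAcquireGLObjects(queue, 1, &clImage, 0, NULL, NULL);

    /* ... enqueue the compression kernels against clImage here ... */

    /* Blocking read of the (processed) RGBA8 frame to host memory */
    clEnqueueReadImage(queue, clImage, CL_TRUE, origin, region,
                       0, 0, host, 0, NULL, NULL);

    clEnqueueReleaseGLObjects(queue, 1, &clImage, 0, NULL, NULL);
}
```

                                   In a real encoder pipeline the blocking read would likely be replaced with a non-blocking read plus double-buffered host staging, so the copy overlaps the next frame's rendering.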

                                   

                                  Cheers!

                                  Debdatta Basu.

                                   

                                    • acquire framebuffer
                                      notzed

                                       

                                      Originally posted by: debdatta.basu @Notzed

                                       

                                      >> And the non OGL version is just using Swing for output(!!) which requires at least one more redundant on-CPU copy to move the data to managed memory from JOGL.....

                                       

                                       Well, you can use Swing for output without needing any copies. Simply create an OpenGL widget and render to that. The data is already on the GPU, so there's no need to copy it onto a CPU buffer only to copy it back for display.

                                       

                                       Well, yes, of course; how else do you think I implemented OGL output?

                                      I'm saying it made no noticeable difference to my application.

                                       

                                        • acquire framebuffer
                                          debdatta.basu

                                           Well, what I said was that the async copies you mentioned in these lines:

                                          And the non OGL version is just using Swing for output(!!) which requires at least one more redundant on-CPU copy to move the data to managed memory from JOGL.

                                          and again here:

                                          It's really just some extra async copies (dma?) that delay the display of the information

                                           never really need to happen, if you use the same GPU for your OpenCL calculations as well as for display.

                                            • acquire framebuffer
                                              notzed

                                               

                                               Originally posted by: debdatta.basu Well, what I said was that the async copies you mentioned in these lines:

                                               

                                              And the non OGL version is just using Swing for output(!!) which requires at least one more redundant on-CPU copy to move the data to managed memory from JOGL.

                                               

                                              and again here:

                                               

                                              It's really just some extra async copies (dma?) that delay the display of the information

                                               

                                               never really need to happen, if you use the same GPU for your OpenCL calculations as well as for display.

                                               

                                              Err, yeah.  I know. They don't need to happen.

                                              However, if they happen, it makes no difference since the whole system isn't particularly busy.

                                              Do you understand that?