
Meteorhead
Challenger

acquire framebuffer

howto

Hi!

My question is rather simple, yet the answer might not be. I would like to know whether there is a way to acquire the actual GL buffer being drawn onto the screen. (In the case of double buffering, this would be the previously completed image being redisplayed on the screen, not the one currently being assembled.)

Is there any way to pass this buffer over to OpenCL? The interop sample shows how to share buffers created from host data, but what are the means to acquire a changing buffer like a double-buffered framebuffer?

If I simply create a GL buffer object with glBindBuffers(GL_FRAME_BUFFER, &frameBuffObj); and call clCreateFromGLBuffer(...); what will happen after glutSwapBuffers(); is called?

Could anyone tell me what would be the way to approach this problem?

0 Likes
11 Replies
laobrasuca
Journeyman III

I don't know precisely how to do it, but you should look at functions like glGenFramebuffers, glBindFramebuffer, glGetFragDataIndex, glGetFramebufferAttachmentParameter. Generic buffer functions like glBindBuffer won't work, because GL_FRAMEBUFFER is not a valid target for them. With double buffering, you're looking for GL_FRONT or GL_BACK (the back buffer is where you compose the scene; the front buffer is the one currently shown on screen). On the OpenCL side you should probably use clCreateFromGLTexture2D instead of clCreateFromGLBuffer, since framebuffers are more like textures than generic buffers. That's all I can tell you, unfortunately, because I've never manipulated framebuffers myself.


Dear Meteor,

You would ideally do this using OpenGL framebuffer objects (FBOs). There is no clean way to access the window-system-provided front and back buffers, as far as I know. You could create a custom FBO, attach textures to it, and create OpenCL images from those textures. You render into the FBO using OpenGL, then use the OpenCL image to do the compute work. You have to synchronize the CL/GL calls using either glFinish or, better, sync objects.
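A minimal sketch of that approach (the names clCtx, queue, width, and height are hypothetical, error checking is omitted, and the CL context must have been created with GL-sharing properties; this cannot run without a live GL/CL context):

```c
/* Create a texture to attach to the FBO, then share it with OpenCL.
   Assumes a current GL context and a CL context created with
   cl_khr_gl_sharing properties. */
GLuint tex;
glGenTextures(1, &tex);
glBindTexture(GL_TEXTURE_2D, tex);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA8, width, height, 0,
             GL_RGBA, GL_UNSIGNED_BYTE, NULL);

cl_int err;
cl_mem clImage = clCreateFromGLTexture2D(clCtx, CL_MEM_READ_ONLY,
                                         GL_TEXTURE_2D, 0, tex, &err);

/* Each frame, after rendering into the FBO: */
glFinish();                                   /* make GL results visible to CL  */
clEnqueueAcquireGLObjects(queue, 1, &clImage, 0, NULL, NULL);
/* ... enqueue kernels that read clImage as an image2d_t ... */
clEnqueueReleaseGLObjects(queue, 1, &clImage, 0, NULL, NULL);
clFinish(queue);                              /* finish CL before GL reuses tex */
```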

Hope that helps.

Debdatta Basu.


Also,

you cannot attach GL buffer objects to a framebuffer, and there is no API like glBindBuffers(GL_FRAME_BUFFER, &frameBuffObj);

The way to do this is to create a custom FBO and attach textures to it. Read a few good framebuffer object tutorials to get the hang of it. Google!
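For illustration, the GL side of that setup might look like this (a sketch, assuming GL 3.0 / ARB_framebuffer_object and hypothetical width/height values; error handling omitted, and it needs a live GL context):

```c
/* A custom FBO with one color texture attachment. Draw calls issued
   while the FBO is bound land in colorTex instead of the window. */
GLuint fbo, colorTex;

glGenTextures(1, &colorTex);
glBindTexture(GL_TEXTURE_2D, colorTex);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA8, width, height, 0,
             GL_RGBA, GL_UNSIGNED_BYTE, NULL);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);

glGenFramebuffers(1, &fbo);
glBindFramebuffer(GL_FRAMEBUFFER, fbo);
glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0,
                       GL_TEXTURE_2D, colorTex, 0);

if (glCheckFramebufferStatus(GL_FRAMEBUFFER) != GL_FRAMEBUFFER_COMPLETE) {
    /* handle an incomplete framebuffer here */
}
```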

Debdatta Basu


But check whether you can do your processing with the OpenGL Shading Language (GLSL). There you have what are called fragment shader programs, which are a sort of kernel in which you can change the values of pixels (they run on the same processors that OpenCL uses). GLSL is not as flexible as OpenCL, but if you can do your image processing with it, you will probably get much better performance than with GL/CL interop.
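As a sketch of that route, a minimal compatibility-profile fragment shader (here converting the bound texture to luminance, purely as an example) can be compiled from a C string; uniform setup, drawing, and error checks are omitted, and a live GL context is assumed:

```c
/* GLSL fragment shader source, embedded as a C string. */
static const char *fragSrc =
    "#version 120\n"
    "uniform sampler2D img;\n"
    "void main() {\n"
    "    vec4 c = texture2D(img, gl_TexCoord[0].st);\n"
    "    float y = dot(c.rgb, vec3(0.299, 0.587, 0.114));\n"
    "    gl_FragColor = vec4(vec3(y), c.a);\n"
    "}\n";

GLuint fs = glCreateShader(GL_FRAGMENT_SHADER);
glShaderSource(fs, 1, &fragSrc, NULL);
glCompileShader(fs);

GLuint prog = glCreateProgram();
glAttachShader(prog, fs);
glLinkProgram(prog);
/* glUseProgram(prog); then draw a textured quad to process the image */
```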


I'm afraid it would be far too difficult to achieve what I want with fragment shaders. I wish to encode an H.264 movie out of what is rendered by OpenGL. There are open-source OpenCL-accelerated x264 encoders, but they all take input located on the HDD. I find it very inconvenient (and unnecessary) to move data even as far as RAM (not to mention the HDD) when the data is already present on the GPU. This matters because not all simulations that use OpenGL visualization can actually run in real time; nonetheless, it is nice to use out-of-the-box routines to do the drawing and to use that as the basis of accelerated encoding.

It takes more and more time to analyze the data that simulations produce, and in many cases animations are an intuitive way of checking the validity of results, if one can come up with a reasonable visualization. If there is a powerful set of tools for visualizing in a GPU-accelerated manner (OpenGL), why not use it?

I wish to create simulations that run on the GPU via OpenCL, render the output via OpenGL, then take this completed picture and encode it on the fly with the GPU (or even with a CPU device running in parallel).

It is convenient that OpenGL does most of the hard work of 2D/3D rendering.

I would suggest to AMD that it would be extremely useful if the OpenVideo Decode API also defined a place for encoding hardware. Should vendors have hardware-accelerated encoders in their chips, the API would link to them; if not, they could implement powerful algorithms tuned to their hardware.

Anyhow, these are the plans. If anyone has constructive ideas or criticism, I will gladly listen.


ok, ok! Well, have a look here.

http://developer.apple.com/mac/library/samplecode/OpenCL_Procedural_Noise_Example/
http://developer.apple.com/mac/library/samplecode/OpenCL_Procedural_Geometric_Displacement_Example/

Old stuff, but they do some OpenGL texture processing via OpenCL. Now, what you need is to identify the framebuffer ID where you have the image.

Originally posted by: Meteorhead

 

I wish to create simulations that run on the GPU via OpenCL, render the output via OpenGL, then take this completed picture and encode it on the fly with the GPU (or even with a CPU device running in parallel).

 

It is convenient that OpenGL does most of the hard work of 2D/3D rendering.

 

You can't render directly to the screen: you just render to a texture and then use that in your OpenGL callback. This is very fast, and in any event can't be avoided. It's probably just an artefact of the way OpenGL works and interacts with a multi-window operating system (i.e. the OS owns the framebuffer).
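That display step can be sketched like this (assuming an already-filled FBO named fbo and hypothetical window dimensions winW/winH; glBlitFramebuffer needs GL 3.0 / ARB_framebuffer_object, and a live GL context is required):

```c
/* Show an FBO-rendered image by blitting it to the window's back
   buffer, then swapping. 0 is the window-system framebuffer. */
glBindFramebuffer(GL_READ_FRAMEBUFFER, fbo);
glBindFramebuffer(GL_DRAW_FRAMEBUFFER, 0);
glBlitFramebuffer(0, 0, width, height,    /* source rectangle      */
                  0, 0, winW, winH,       /* destination rectangle */
                  GL_COLOR_BUFFER_BIT, GL_LINEAR);
glutSwapBuffers();
```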

FWIW, I looked into this for my client's application, but it made such a minor difference to performance (even on an overworked laptop) that I dropped it. And the non-OGL version is just using Swing for output(!!), which requires at least one more redundant on-CPU copy to move the data into managed memory from JOGL.

I was surprised, but maybe I shouldn't have been. It's really just some extra async copies (DMA?) that delay the display of the information but shouldn't hold up anything else, assuming there's enough time budget for them to run. Video display frame rates are relatively low compared to the processing power and bandwidth of a modern system. And there's no use trying to show more fps than the monitor can display; for smooth-enough simulation animations you don't even need that (IMHO about 15 fps is OK).

But you may notice more of a difference depending on how your application works, whether your PCIe bus and CPU cores are already busy, and/or whether you're working with handheld or low-end hardware (in which case you'll probably have to drop frames for display anyway).


@Notzed

>> And the non OGL version is just using Swing for output(!!) which requires at least one more redundant on-CPU copy to move the data to managed memory from JOGL.....

Well, you can use Swing for output without needing any copies. Simply create an OpenGL widget and render to that. The data is already on the GPU, so there's no need to copy it into a CPU buffer only to copy it back for display.

 

@Meteor

For your purposes, render to an FBO, use that texture for compression, and then read back to the CPU. I haven't used the OpenVideo Decode thing though, so I can't help you with that.
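The final read-back step might be sketched as follows (encodedBuf, encodedBytes, and hostPtr are hypothetical names for the encoder's CL output buffer, its size, and a host destination; error handling omitted, and a live CL context is assumed):

```c
/* Blocking read of the encoder's output buffer into host memory;
   returns only after the copy has completed. */
cl_int err = clEnqueueReadBuffer(queue, encodedBuf,
                                 CL_TRUE,        /* blocking read       */
                                 0,              /* offset in bytes     */
                                 encodedBytes,   /* bytes to copy       */
                                 hostPtr,        /* destination on host */
                                 0, NULL, NULL);
```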

 

Cheers!

Debdatta Basu.

 


Originally posted by: debdatta.basu @Notzed

 

>> And the non OGL version is just using Swing for output(!!) which requires at least one more redundant on-CPU copy to move the data to managed memory from JOGL.....

 

Well, you can use Swing for output without needing any copies. Simply create an OpenGL widget and render to that. The data is already on the GPU, so there's no need to copy it into a CPU buffer only to copy it back for display.

 

Well yes, of course; how else do you think I implemented OGL output?

I'm saying it made no noticeable difference to my application.

 


Well, what I said was that the async copies you mentioned in these lines:

And the non OGL version is just using Swing for output(!!) which requires at least one more redundant on-CPU copy to move the data to managed memory from JOGL.

and again here:

It's really just some extra async copies (dma?) that delay the display of the information

never really need to happen, if you use the same GPU for your OpenCL calculations as for display.


Originally posted by: debdatta.basu Well, what I said was that the async copies you mentioned in these lines:

 

And the non OGL version is just using Swing for output(!!) which requires at least one more redundant on-CPU copy to move the data to managed memory from JOGL.

 

and again here:

 

It's really just some extra async copies (dma?) that delay the display of the information

 

never really need to happen, if you use the same GPU for your OpenCL calculations as for display.

 

Err, yeah.  I know. They don't need to happen.

However, if they happen, it makes no difference since the whole system isn't particularly busy.

Do you understand that?

 
