I've been working on improving character rendering performance by skinning the mesh once in a vertex shader using transform feedback, and then rendering the skinned version of the mesh each time it's required (deferred, shadows, picking).

With the old GL3.3 API you would use an ordinary OpenGL buffer bound to GL_TRANSFORM_FEEDBACK_BUFFER, do glBeginTransformFeedback -> draw -> glEndTransformFeedback, query how many primitives were written (and from that the vertex count), then bind the same buffer as GL_ARRAY_BUFFER and render using that count. With the API provided in GL4.0 we can instead use transform feedback objects, which should keep track of how many vertices are written to them by a transform feedback draw. That is nice because we don't have to stall the driver waiting for a query result, which was the intended purpose, if I understood it correctly.

Here's the problem: only the first updated mesh is rendered when using the following code (which runs once for each mesh I want to update):
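For reference, one detail of the old query-based path: as far as I know, a GL_TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN query reports primitives, not vertices, so the count has to be converted before it goes to glDrawArrays. Here is a minimal sketch of that conversion. The FeedbackMode enum and the VerticesWritten helper are names made up for this post (they are not part of OpenGL or of my engine), and the GL calls are only shown in comments because they need a live context.

```cpp
#include <cstdint>
#include <stdexcept>

// Values mirror GL_POINTS, GL_LINES and GL_TRIANGLES; redefined here only so
// the sketch compiles without GL headers (hypothetical enum, not a GL type).
enum class FeedbackMode : std::uint32_t {
    Points    = 0x0000, // GL_POINTS
    Lines     = 0x0001, // GL_LINES
    Triangles = 0x0004  // GL_TRIANGLES
};

// Hypothetical helper: turn the primitive count returned by a
// GL_TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN query into a vertex count.
std::uint32_t VerticesWritten(FeedbackMode mode, std::uint32_t primitivesWritten)
{
    switch (mode) {
        case FeedbackMode::Points:    return primitivesWritten;
        case FeedbackMode::Lines:     return primitivesWritten * 2;
        case FeedbackMode::Triangles: return primitivesWritten * 3;
    }
    throw std::invalid_argument("unsupported transform feedback mode");
}

// Intended use in the GL3.3 path (assumes `query` and `feedbackBuffer`
// were created elsewhere):
//   GLuint prims = 0;
//   glGetQueryObjectuiv(query, GL_QUERY_RESULT, &prims); // waits for the result
//   glBindBuffer(GL_ARRAY_BUFFER, feedbackBuffer);
//   glDrawArrays(GL_TRIANGLES, 0, VerticesWritten(FeedbackMode::Triangles, prims));
```

The glGetQueryObjectuiv readback in that path is the synchronization point I was hoping to get rid of by switching to transform feedback objects.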

    glBindTransformFeedback(GL_TRANSFORM_FEEDBACK, fb->GetOGL4TransformFeedback());
    glBeginTransformFeedback(primType);
    glEnable(GL_RASTERIZER_DISCARD);
    glBindVertexArray(vl->GetOGL4VertexArrayObject());
    glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, ib->GetOGL4IndexBuffer());
    glDrawElementsBaseVertex(primType, this->primitiveGroup.GetNumIndices(), indexType, NULL, this->primitiveGroup.GetBaseVertex());
    glDisable(GL_RASTERIZER_DISCARD);
    glEndTransformFeedback();
    glBindTransformFeedback(GL_TRANSFORM_FEEDBACK, 0);

Each model I intend to render has its own transform feedback object, and each of those has its own buffer bound to it. When rendering with:

    glBindVertexArray(vl->GetOGL4VertexArrayObject());
    glDrawTransformFeedback(primType, fb->GetOGL4TransformFeedback());

for all of the models, only the first one gets rendered, which I assume is because the transform feedback is not complete when the draw command arrives. I make this assumption because any call to glFlush, or any query object read (even using GL_QUERY_RESULT_NO_WAIT), results in all models rendering. It also obviously has a major performance impact, because every update of a feedback buffer (once per animated character) becomes a synchronization point.

I've tried this on an NVIDIA card as well (790), and it doesn't produce the same result: all objects are rendered as they should be. However, the NVIDIA driver seems to implicitly enforce a synchronization, which, much to my disappointment, results in the same sluggish performance as using glFlush. Is this to be expected? Right now it's actually faster to perform the skinning every time I need to do the actual rendering. I can only assume I'm doing something wrong here, but what is it?
Thanks in advance!