Archives Discussions

varrak · ‎12-07-2013

Hey all,

So I'm desperately trying to get instancing running at an acceptable performance level. So far, the best I have is to generate a number of textures (4096x4 for instance data, and 4096x1 for index data so I can cull instances). The 4096x4 is static; the 4096x1 is dynamic and updated every frame. This actually works really well on Intel and NVIDIA hardware (rendering's very quick), but on AMD the texture update is massively slow (about 0.5ms per texture update, whether or not I double buffer).

To try and alleviate this, I thought - hey! Why not use texture buffers! I can do all my instances in one draw, and use glMapBuffer, which should be really fast and a lot simpler. However, this is even slower! Even when quadruple-buffering (to ensure I'm not blocking on a buffer that's still in use by the GPU) it takes around 3ms for *each* glMapBuffer call. 3ms! That's insane! I had assumed this approach would be similar to the D3D11 Map semantics (and that using glMapBufferRange with the right parameters would give map/discard semantics), but while this would run insanely fast on D3D, it's single-digit frame-rate on OpenGL...
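For anyone following along, here is a minimal sketch of the kind of quadruple-buffered update described above, not the exact code from my project. It assumes one buffer object split into four equal slices and 32-bit indices (so 4096 * 4 = 16384 bytes per slice); the helper name `ring_slice_offset` and the variables `tbo`, `indices` and `used_bytes` are made up for illustration. The GL calls are left in a comment because they need a live context. The access flags shown are the ones usually suggested as the closest match to a D3D11 discard map, and whether a given driver actually makes that path fast is exactly the open question here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: byte offset of the slice to write on a given frame
 * when one buffer is divided into `slices` equal regions
 * (slices = 4 corresponds to quadruple buffering). */
static size_t ring_slice_offset(unsigned long frame, size_t slice_bytes,
                                unsigned slices)
{
    assert(slices > 0); /* guard the modulo below */
    return (size_t)(frame % slices) * slice_bytes;
}

/* With a GL 3.x context and loader in place, the per-frame update would
 * look roughly like this (assumed names: tbo, indices, used_bytes):
 *
 *   size_t offset = ring_slice_offset(frame, 16384, 4);
 *   glBindBuffer(GL_TEXTURE_BUFFER, tbo);
 *   void *p = glMapBufferRange(GL_TEXTURE_BUFFER,
 *                              (GLintptr)offset, (GLsizeiptr)16384,
 *                              GL_MAP_WRITE_BIT |
 *                              GL_MAP_INVALIDATE_RANGE_BIT);
 *   if (p) {
 *       memcpy(p, indices, used_bytes);
 *       glUnmapBuffer(GL_TEXTURE_BUFFER);
 *   }
 *
 * Using GL_MAP_INVALIDATE_BUFFER_BIT on a whole buffer (or calling
 * glBufferData with a NULL pointer first) is the "orphaning" variant.
 * Adding GL_MAP_UNSYNCHRONIZED_BIT skips the driver's synchronization,
 * but then the application has to guarantee (e.g. with fences) that the
 * GPU is finished with the slice being written.
 */
```

With four slices, frames 0, 1, 2, 3 write at offsets 0, 16384, 32768, 49152, and frame 4 wraps back to offset 0.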

To make sure there's nothing funky going on with the GPU itself, I tried mapping and filling the data just once (instead of updating it every frame), and the framerate shot up from single digits to 120fps.

Any insight anyone would have would be greatly, massively appreciated. I'm pulling my hair out here...


glMapBuffer performance