1 Reply Latest reply on Feb 23, 2012 4:50 PM by Byron

    Slow performance of rendering quads in D3D if using big vertex buffer

    pkirill_ws

      Hi Team,

       

      I have a performance issue with an ATI 5770 card, using the latest drivers on 32-bit Windows 7:

       

      The program does the following:

      It has a dynamic vertex buffer: D3DUSAGE_DYNAMIC | D3DUSAGE_WRITEONLY, D3DPOOL_DEFAULT.

      Assume I need to render 50K quads and the buffer capacity is N quads.

      This is how I render:

          SetTexture(0, t);
          for (..) {
              LockEntireBuffer(D3DLOCK_DISCARD);
              // fill the buffer with quad data
              Unlock();
              DrawIndexedPrimitive();
          }

       

      The problem is the following:

        When the vertex buffer is really small (42 quads at 24 bytes per vertex, under 4 KB), rendering is very fast (let's say 64 fps, no v-sync).

        When the buffer is larger than 4 KB, rendering becomes very slow: 43 fps.

        When we grow the buffer towards 96 KB (1024 quads), the frame rate slowly climbs to 57 fps.
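To make the sizes above concrete, here is a small sketch of the arithmetic. It assumes each quad is stored as 4 unique vertices (drawn through an index buffer), which is not stated in the post but matches both figures given: 42 quads fit just under 4 KB and 1024 quads come to exactly 96 KB. The names `kVerticesPerQuad`, `kBytesPerVertex` and `quadBufferBytes` are made up for illustration.

```cpp
#include <cstddef>

// Assumption: 4 unique vertices per quad, 24 bytes per vertex (stride from the post).
constexpr std::size_t kVerticesPerQuad = 4;
constexpr std::size_t kBytesPerVertex = 24;

// Size in bytes of a vertex buffer that holds `quads` quads.
constexpr std::size_t quadBufferBytes(std::size_t quads) {
    return quads * kVerticesPerQuad * kBytesPerVertex;
}

// 42 quads is the largest count that still fits in 4 KB (4096 bytes).
static_assert(quadBufferBytes(42) == 4032, "42 quads stay under 4 KB");
static_assert(quadBufferBytes(43) > 4096, "43 quads cross the 4 KB mark");
// 1024 quads is exactly 96 KB.
static_assert(quadBufferBytes(1024) == 96 * 1024, "1024 quads are 96 KB");
```

So the slowdown appears as soon as the buffer grows from 42 to 43 quads, i.e. when it no longer fits in a single 4 KB page.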

       

      Our profiling shows that once the buffer size exceeds 4 KB, the program starts spending too much time in D3DKMTLock.

       

      The problem does not reproduce when we change textures frequently:

       

         for (..) {
             LockEntireBuffer(D3DLOCK_DISCARD);
             // fill the buffer with quad data
             Unlock();
             for (...) { SetTexture(t[i]); DrawIndexedPrimitive(part of the buffer); }
         }

       

      In that situation, the bigger the buffer is, the faster it renders.

       

      What am I doing wrong ?

       

      Thanks

        • Re: Slow performance of rendering quads in D3D if using big vertex buffer
          Byron

          When the D3DLOCK_DISCARD flag is used, internally a new buffer is created to make sure there’s no contention between the GPU and the app that’s trying to write into the buffer.

           

          If you’re using really large vertex buffers, and calling Lock() often, it could be that a limit is being hit, preventing new buffers from being created. One thing to try is allocating a circular queue of buffers and not using D3DLOCK_DISCARD.
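As a rough sketch of the circular queue idea, the class below only hands out buffer indices in round-robin order. `BufferRing` and `acquire` are hypothetical names, and the sketch assumes the ring is long enough that the GPU has finished reading a buffer before its index comes around again; it does not check that. In real code each index would refer to a separately created vertex buffer.

```cpp
#include <cassert>
#include <cstddef>

// Hands out indices 0, 1, ..., count-1, 0, 1, ... for a fixed set of buffers.
class BufferRing {
public:
    explicit BufferRing(std::size_t count) : count_(count), next_(0) {
        assert(count > 0);  // a ring needs at least one buffer
    }

    // Returns the index of the buffer to fill next and advances the ring.
    std::size_t acquire() {
        const std::size_t index = next_;
        next_ = (next_ + 1) % count_;
        return index;
    }

private:
    std::size_t count_;  // number of buffers in the ring
    std::size_t next_;   // index that the next acquire() will return
};
```

With a ring of three buffers, successive calls to `acquire()` yield 0, 1, 2 and then wrap back to 0.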

           

          In general it’s not good for performance to update so much vertex buffer data so often, so it would be better to avoid doing this if possible.

           

          If this is for animation, you may want to do skinning on the GPU, which wouldn’t require updating every vertex for every draw call.

           

          If you don’t need to update all of the vertices, then it’s better to lock only the part of the buffer you are writing and to use the D3DLOCK_NOOVERWRITE flag.

           

          This article explains it (Using Dynamic Vertex and Index Buffers, Usage Style 2)

          http://msdn.microsoft.com/en-us/library/windows/desktop/bb147263(v=vs.85).aspx#Using_Dynamic_Vertex_and_Index_Buffers
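The pattern in that article (append with D3DLOCK_NOOVERWRITE, and use D3DLOCK_DISCARD only when the buffer is full) can be sketched without any D3D headers as plain bookkeeping. This is only an illustration: `LockMode`, `LockRange` and `DynamicBufferCursor` are made-up names, and the two enum values merely stand in for the real D3DLOCK_NOOVERWRITE and D3DLOCK_DISCARD flags.

```cpp
#include <cassert>
#include <cstddef>

// Stand-ins for D3DLOCK_NOOVERWRITE and D3DLOCK_DISCARD (hypothetical names).
enum class LockMode { NoOverwrite, Discard };

// Describes which byte range to lock and with which flag.
struct LockRange {
    std::size_t offsetBytes;
    std::size_t sizeBytes;
    LockMode mode;
};

// Tracks the next free byte in one dynamic vertex buffer.
class DynamicBufferCursor {
public:
    explicit DynamicBufferCursor(std::size_t capacityBytes)
        : capacity_(capacityBytes), next_(0) {}

    // Reserves room for one batch. While the batch fits, it is appended and
    // locked with NoOverwrite. When it does not fit, the cursor wraps to the
    // start and the whole buffer is discarded.
    LockRange reserve(std::size_t sizeBytes) {
        assert(sizeBytes > 0 && sizeBytes <= capacity_);
        LockMode mode = LockMode::NoOverwrite;
        if (next_ + sizeBytes > capacity_) {
            next_ = 0;
            mode = LockMode::Discard;
        }
        const LockRange range{next_, sizeBytes, mode};
        next_ += sizeBytes;
        return range;
    }

private:
    std::size_t capacity_;  // total buffer size in bytes
    std::size_t next_;      // offset of the first unused byte
};
```

In real code the returned offset, size and flag would go to the OffsetToLock, SizeToLock and Flags parameters of IDirect3DVertexBuffer9::Lock, and the draw call would start at the vertex that corresponds to the offset. The point of the design is that most locks become NoOverwrite appends, so a discard happens once per full pass over the buffer instead of once per batch.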