Hi Team,
I have a performance issue with an ATI Radeon HD 5770 card with the latest drivers on 32-bit Windows 7.
The program does the following:
It has a vertex buffer, which is dynamic: D3DUSAGE_DYNAMIC|D3DUSAGE_WRITEONLY, D3DPOOL_DEFAULT.
Assume I need to render 50K quads. Assume the buffer capacity is N quads.
This is how I render:
SetTexture(0, t);
for (..) {
    LockEntireBuffer(D3DLOCK_DISCARD);
    // fill the buffer with quad data (up to N quads)
    Unlock();
    DrawIndexedPrimitive();
}
The problem is the following:
When the vertex buffer is really small (42 quads * 24 bytes per vertex, < 4 KB), rendering is very fast (let's say 64 fps, no v-sync).
When the buffer is larger than 4 KB, rendering becomes very slow: 43 fps.
As we grow the buffer toward 96 KB (1024 quads), the frame rate slowly climbs back up to 57 fps.
Our profiling shows that once the buffer size exceeds 4 KB, the program starts spending too much time in D3DKMTLock.
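To make the sizes concrete, here is a small sketch of the arithmetic behind those numbers. It assumes 4 vertices per quad (indexed quads) and that 50K means 50,000 quads; the helper names are made up for illustration and are not part of the actual program.

```cpp
#include <cstddef>

// From the post: 24 bytes per vertex.
constexpr std::size_t kBytesPerVertex = 24;
// Assumption: indexed quads, so 4 vertices per quad.
constexpr std::size_t kVerticesPerQuad = 4;
// Assumption: "50K quads" is taken as 50,000.
constexpr std::size_t kTotalQuads = 50000;

// Size in bytes of a vertex buffer that holds quadCapacity quads.
constexpr std::size_t bufferBytes(std::size_t quadCapacity) {
    return quadCapacity * kVerticesPerQuad * kBytesPerVertex;
}

// Number of Lock/Unlock/Draw iterations needed per frame (rounded up).
constexpr std::size_t locksPerFrame(std::size_t quadCapacity) {
    return (kTotalQuads + quadCapacity - 1) / quadCapacity;
}

// 42 quads stay just under 4096 bytes; 43 quads cross that boundary.
static_assert(bufferBytes(42) == 4032, "42 quads fit below 4 KB");
static_assert(bufferBytes(43) == 4128, "43 quads exceed 4 KB");
// 1024 quads are exactly 96 KB.
static_assert(bufferBytes(1024) == 96 * 1024, "1024 quads are 96 KB");
```

Under these assumptions the fast case (42 quads) needs about 1191 lock/draw iterations per frame while the 96 KB case needs only 49, so the small buffer is fast even though it is locked far more often.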
The problem is not reproducible when we change textures frequently:
for (..) {
    LockEntireBuffer(D3DLOCK_DISCARD);
    // fill the buffer with quad data (up to N quads)
    Unlock();
    for (...) { SetTexture(0, t); DrawIndexedPrimitive(part of the buffer); }
}
In that situation, the bigger the buffer is, the faster it renders.
What am I doing wrong?
Thanks