cancel
Showing results for 
Search instead for 
Did you mean: 

Archives Discussions

chen_feng
Journeyman III

The traffic type included in Counter:"VertexMemFetched"

On one of my draw call, I got a exteme large value like follows on HD5450 with PerfStudio2.1.

VSVerticesIn:41263 

VertexMemFetched: 136032768

Per Vertex about 3KB. 

Will the other kind of traffic will be included in this counter?

0 Likes
3 Replies
plohrmann
Staff

Does the value always appear high like this, or is it just occassionally? We've found that having Aero mode enabled in Windows can cause inconsistencies in some of the counters.

Generally, this counter is based on the number of fetch requests received for data that is stored in a vertex-related format. There is a possibility that an inefficient index buffer could cause a single vertex to be fetched multiple times. I'm not entirely sure if accessing vertex attributes or performance texture fetches in the vertex shader will contribute to this value. I will look into this.

We have ongoing research to identify better counters related to index buffer quality / vertex cache reuse and this counter may be adjusted in the process.

0 Likes

It's draw call by draw call. For the frame I studied, that counter values look reasonable for most of draw call except this one. The Aero mode was disabled during the profiling.  The Primitive number for that draw call was about 33973. So there won't be too much index buffer access. Actually, this is one frame in 3DMarkVantage GT1 Performance setting.

//
// Generated by Microsoft (R) HLSL Shader Compiler 9.19.949.2111
//
//
// Buffer Definitions:
//
// cbuffer view_changes_every_frame
// {
//
//   row_major float4x4 world_to_view_clip_matrix;// Offset:    0 Size:    64
//   row_major float4x4 view_to_world_matrix;// Offset:   64 Size:    64 [unused]
//   float3 camera_in_world;            // Offset:  128 Size:    12 [unused]
//   row_major float4x4 frustum_planes; // Offset:  144 Size:    64 [unused]
//   row_major float4x4 inverse_projection_matrix;// Offset:  208 Size:    64 [unused]
//
// }
//
// cbuffer xsi_uniforms
// {
//
//   float wave_intensity;              // Offset:    0 Size:     4
//   float foam_height_effect;          // Offset:    4 Size:     4
//   float foam_saturate_low_threshold; // Offset:    8 Size:     4
//   float foam_saturate_high_threshold;// Offset:   12 Size:     4
//   float3 water_diffuse_color;        // Offset:   16 Size:    12 [unused]
//   float3 specular_color;             // Offset:   32 Size:    12 [unused]
//   float specular_exponent;           // Offset:   44 Size:     4 [unused]
//   float specular_intensity;          // Offset:   48 Size:     4 [unused]
//   float foam_tiling_factor_x;        // Offset:   52 Size:     4 [unused]
//   float foam_tiling_factor_y;        // Offset:   56 Size:     4 [unused]
//   float foam_visibility_low_threshold;// Offset:   60 Size:     4 [unused]
//   float foam_visibility_high_threshold;// Offset:   64 Size:     4 [unused]
//   float reflection_displacement;     // Offset:   68 Size:     4 [unused]
//   float foam_distort_factor;         // Offset:   72 Size:     4 [unused]
//   float refraction_map_intensity;    // Offset:   76 Size:     4 [unused]
//   float water_diffuse_scattered;     // Offset:   80 Size:     4 [unused]
//
// }
//
// cbuffer changes_every_call
// {
//
//   row_major float4x4 object_to_world_matrix;// Offset:    0 Size:    64
//
// }
//
//
// Resource Bindings:
//
// Name                                 Type  Format         Dim Slot Elements
// ------------------------------ ---------- ------- ----------- ---- --------
// height_texture                    texture  float4          2d    0        1
// view_changes_every_frame          cbuffer      NA          NA    0        1
// xsi_uniforms                      cbuffer      NA          NA    1        1
// changes_every_call                cbuffer      NA          NA    2        1
//
//
//
// Input signature:
//
// Name                 Index   Mask Register SysValue Format   Used
// -------------------- ----- ------ -------- -------- ------ ------
// SV_Position              0   xyzw        0     NONE  float   xyzw
// TEXCOORD                 0   xyz         1     NONE  float   xyz
//
//
// Output signature:
//
// Name                 Index   Mask Register SysValue Format   Used
// -------------------- ----- ------ -------- -------- ------ ------
// SV_Position              0   xyzw        0      POS  float   xyzw
// POSITION                 1   xyz         1     NONE  float   xyz
// TEXCOORD                 0   xyzw        2     NONE  float   xyz
//
vs_4_0
dcl_input v0.xyzw
dcl_input v1.xyz
dcl_output_siv o0.xyzw, position
dcl_output o1.xyz
dcl_output o2.xyz
dcl_constantbuffer cb0[4], immediateIndexed
dcl_constantbuffer cb1[1], immediateIndexed
dcl_constantbuffer cb2[4], immediateIndexed
dcl_resource_texture2d (float,float,float,float) t0
dcl_temps 2
add r0.x, -cb1[0].z, cb1[0].w
div r0.x, l(1.000000, 1.000000, 1.000000, 1.000000), r0.x
resinfo r1.xyzw, l(0), t0.xyzw
mul r0.yz, r1.xxyx, v1.xxyx
ftoi r1.xy, r0.yzyy
mov r1.zw, l(0,0,0,0)
ld r1.xyzw, r1.xyzw, t0.xyzw
add r0.y, r1.y, -cb1[0].z
mul_sat r0.x, r0.x, r0.y
mad r0.y, r0.x, l(-2.000000), l(3.000000)
mul r0.x, r0.x, r0.x
mul r0.x, r0.y, r0.x
mul r0.x, r1.y, r0.x
mul r0.x, r0.x, cb1[0].y
mad r0.x, r1.x, cb1[0].x, r0.x
add r0.y, r0.x, v0.y
mov r0.xzw, v0.xxzw
dp4 r1.w, cb2[3].xyzw, r0.xyzw
dp4 r1.x, cb2[0].xyzw, r0.xyzw
dp4 r1.y, cb2[1].xyzw, r0.xyzw
dp4 r1.z, cb2[2].xyzw, r0.xyzw
dp4 o0.x, cb0[0].xyzw, r1.xyzw
dp4 o0.y, cb0[1].xyzw, r1.xyzw
dp4 o0.z, cb0[2].xyzw, r1.xyzw
dp4 o0.w, cb0[3].xyzw, r1.xyzw
mov o1.xyz, r1.xyzx
mov o2.xyz, v1.xyzx
ret
// Approximately 28 instruction slots used

0 Likes

APIDrawIndexed
Draw_Call445
CBMemRead0
CBMemWritten3941952
ClippedPrims0
CulledPrims23527
DepthStencilTestBusy0.448353
GPUBusy99.99967
GPUTime31.43322
GSALUBusy0
GSALUEfficiency0
GSALUInstCount0
GSALUTexRatio0
GSExportPct0
GSPrimsIn0
GSTexBusy0
GSTexInstCount0
GSVerticesOut0
HiZReject20.19346
HiZTrivialAccept0
PAStalledOnRasterizer98.16735
PSALUBusy72.58314
PSALUEfficiency62.37532
PSALUInstCount406.9916
PSALUTexRatio6.154698
PSExportStalls0
PSPixelsIn374372
PSPixelsOut374372
PSTexBusy47.17455
PSTexInstCount66.12528
Pct128SlowTexels0
Pct64SlowTexels3.772305
PctCompressedTexels3.715805
PctDepthTexels31.16592
PctInterlacedTexels0
PctTex1D27.811
PctTex1DArray15.52234
PctTex2D0
PctTex2DArray56.57877
PctTex2DMSAA0
PctTex2DMSAAArray0
PctTex3D0
PctTexCube0
PctTexCubeArray0
PctUncompressedTexels37.33273
PctVertex128SlowTexels0
PctVertex64SlowTexels0
PctVertexTexels27.56327
PostZSamplesFailingS0
PostZSamplesFailingZ0
PostZSamplesPassing0
PreZSamplesFailingS0
PreZSamplesFailingZ37709
PreZSamplesPassing366785
PrimitiveAssemblyBusy99.47789
PrimitivesIn33973
ShaderBusy99.39033
ShaderBusyGS0
ShaderBusyPS99.44472
ShaderBusyVS0.555278
TexAveAnisotropy0.973451
TexCacheStalled0.457929
TexCostOfFiltering110.9878
TexMemBytesRead25280000
TexMissRate0.20409
TexTriFilteringPct2.574058
TexUnitBusy38.02385
TexVolFilteringPct0
TexelFetchCount61786368
VSALUBusy0.264587
VSALUEfficiency53.33333
VSALUInstCount21
VSALUTexRatio10.5
VSTexBusy0.001008
VSTexInstCount2
VSVerticesIn41263
VertexMemFetched1.36E+08
VertexMemFetchedCost20.78471
ZUnitStalled0
0 Likes