Hi
I have this fragment shader (GLSL):
void main()
{
    gl_FragColor = clamp(gl_TexCoord[0], 0.0, 1.0);
    return;
}
Disassembly is: (RV770, latest shader analyzer)
; -------- Disassembly --------------------
00 ALU: ADDR(32) CNT(8)
      0  x: MAX ____, R0.w, 0.0f
         y: MAX ____, R0.z, 0.0f
         z: MAX ____, R0.y, 0.0f
         w: MAX ____, R0.x, 0.0f
      1  x: MIN R0.x, PV0.w, 1.0f
         y: MIN R0.y, PV0.z, 1.0f
         z: MIN R0.z, PV0.y, 1.0f
         w: MIN R0.w, PV0.x, 1.0f
01 EXP_DONE: PIX0, R0
END_OF_PROGRAM
--> 2 ALU
Now I have this pixel shader (DX9 HLSL):
float4 main(float4 Val : TEXCOORD0) : COLOR0
{
    return clamp(Val, 0.0, 1.0);
}
Disassembly is: (RV770, latest shader analyzer)
; -------- Disassembly --------------------
00 ALU: ADDR(32) CNT(4)
      0  x: MOV R0.x, R0.x CLAMP
         y: MOV R0.y, R0.y CLAMP
         z: MOV R0.z, R0.z CLAMP
         w: MOV R0.w, R0.w CLAMP
01 EXP_DONE: PIX0, R0
END_OF_PROGRAM
--> 1 ALU
I have some complex shaders that are ALU bound, and they use lots of clamp(val, 0, 1) / saturate(val).
In HLSL they take about 80 cycles to execute (estimated),
but the identical shader in GLSL takes about 120 cycles (estimated),
because each saturate/clamp (from 0 to 1) is expanded into a MAX/MIN sequence
(and the number of ALU instructions goes up ;/).
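For what it's worth, the MAX/MIN pair matches how the GLSL spec defines clamp(x, minVal, maxVal), namely min(max(x, minVal), maxVal), so the compiler seems to emit the definition literally instead of folding it into the CLAMP output modifier. Here is a minimal sketch of the same shader with the expansion written out by hand. saturate4 is a hypothetical helper name, not something from my real shaders. I would expect it to give the same 2 ALU groups, but that is an assumption: I have not run this variant through the analyzer.

```glsl
// Hypothetical helper (name is illustrative only): clamp to [0, 1]
// written as the explicit min/max pair from the GLSL definition of clamp().
vec4 saturate4(vec4 v)
{
    // max(v, 0.0) clips the low end, min(..., 1.0) clips the high end
    return min(max(v, 0.0), 1.0);
}

void main()
{
    // Same result as clamp(gl_TexCoord[0], 0.0, 1.0) in the shader above
    gl_FragColor = saturate4(gl_TexCoord[0]);
}
```

Keeping the clamp behind one helper like this would at least make it easy to swap in a different formulation in one place and compare the disassembly.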
Is this a flaw in the GLSL compiler? Or is there some nasty way to force GLSL to generate
'mov reg, reg, clamp' in the microcode?
Is a fix for this expected soon?