SKA shows that no scratch registers should be used, but the profiler reports scratch register usage.
For the same kernel, SKA shows:
Name,Scratch Reg,GPR,Min,Max,Avg,Est Cycles,Est Cycles(Bi),ALU:Fetch(Bi),BottleNeck(Bi),%s\Clock(Bi),Throughput(Bi)
Radeon HD 4870,0,119,244.90,36705.76,6919.27,6919.27,6919.27,1.19,ALU Ops,0.00,2 M Threads\Sec
That is, 0 scratch registers and 119 GPRs.
And indeed, the HD 4870 assembly from SKA doesn't contain any scratch read/write instructions.
But the profiler shows 121 GPRs and scratch register usage!
And its assembly does contain scratch instructions.
MaxScratchRegsNeeded = 98
SQ_PGM_RESOURCES:NUM_GPRS = 121
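To make the mismatch explicit, here is a minimal Python sketch that parses the two outputs pasted above and prints the difference. The helper names `parse_ska` and `parse_profiler` are my own, not part of either tool, and I'm assuming the profiler values are plain `key = value` lines as shown.

```python
import csv
import io

# SKA output copied verbatim from above (header row plus the HD 4870 row).
# Raw strings keep the backslashes in the column names untouched.
ska_csv = r"""Name,Scratch Reg,GPR,Min,Max,Avg,Est Cycles,Est Cycles(Bi),ALU:Fetch(Bi),BottleNeck(Bi),%s\Clock(Bi),Throughput(Bi)
Radeon HD 4870,0,119,244.90,36705.76,6919.27,6919.27,6919.27,1.19,ALU Ops,0.00,2 M Threads\Sec"""

# Profiler values copied verbatim from above.
profiler_text = """MaxScratchRegsNeeded = 98
SQ_PGM_RESOURCES:NUM_GPRS = 121"""


def parse_ska(text):
    """Hypothetical helper: read the first device row of the SKA CSV."""
    row = next(csv.DictReader(io.StringIO(text)))
    return {"scratch": int(row["Scratch Reg"]), "gpr": int(row["GPR"])}


def parse_profiler(text):
    """Hypothetical helper: read 'key = value' lines from the profiler output."""
    values = {}
    for line in text.splitlines():
        key, _, value = line.partition("=")
        values[key.strip()] = int(value.strip())
    return {
        "scratch": values["MaxScratchRegsNeeded"],
        "gpr": values["SQ_PGM_RESOURCES:NUM_GPRS"],
    }


ska = parse_ska(ska_csv)
prof = parse_profiler(profiler_text)

# Positive numbers mean the profiler reports more than SKA does.
diff = {name: prof[name] - ska[name] for name in ska}
print(diff)
```

So the profiler reports 2 more GPRs and 98 more scratch registers than SKA for the same kernel.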
Why do the two tools disagree on something as important for performance as scratch register usage?