If possible, could you email a test shader showing this to firstname.lastname@example.org?
In fact, the shader is simple enough to just post here:
sample_resource(0)_sampler(0) r0, vWinCoord0.xyxx
sample_resource(1)_sampler(0) r1, vWinCoord0.xyxx
sample_resource(2)_sampler(0) r2, vWinCoord0.xyxx
There is no add of 0x80000000 in the compiled output, only 2 adds and 3 xors.
What is the range of values you're sampling? Also, what result is placed into the output stream o0? Is it just the value you sampled XORed with INT_MIN+1? Finally, do you know the addition isn't taking place because your result isn't what you expected, or did you disassemble the kernel?
Originally posted by: rick.weber What is the range of values you're sampling? Also, what result is placed into the output stream o0? Is it just the value you sampled XORed with INT_MIN+1? Finally, do you know the addition isn't taking place because your result isn't what you expected, or did you disassemble the kernel?
I disassembled the kernel after I got wrong results on several sample data sets. It's easy to see how this kernel gets compiled using Stream KernelAnalyzer: just copy and paste the kernel there. The results (for RV770) are:
; -------- Disassembly --------------------
00 TEX: ADDR(64) CNT(3) VALID_PIX
0 SAMPLE R1.x___, R0.xyxx, t0, s0 UNNORM(XYZW)
1 SAMPLE R2.x___, R0.xyxx, t2, s0 UNNORM(XYZW)
2 SAMPLE R0.x___, R0.xyxx, t1, s0 UNNORM(XYZW)
01 ALU: ADDR(32) CNT(19)
3 x: ADD_INT ____, R2.x, (0x80000001, -1.401298464e-45f).x
y: ADD_INT ____, (0x7FFFFFFF, 1.#QNANf).y, R1.x
z: XOR_INT R0.z, R0.x, (0x80000001, -1.401298464e-45f).x VEC_201
4 x: XOR_INT R0.x, (0x7FFFFFFF, 1.#QNANf).x, PV3.x
w: XOR_INT R0.w, PV3.y, (0x80000000, 0.0f).y
5 x: MOV R3.x, PV4.x
y: MOV R3.y, R0.y
z: MOV R3.z, R0.y
w: MOV R3.w, R0.y
6 x: MOV R2.x, R0.z
y: MOV R2.y, R0.y
z: MOV R2.z, R0.y
w: MOV R2.w, R0.y
7 x: MOV R1.x, R0.w
y: MOV R1.y, R0.y
z: MOV R1.z, R0.y
w: MOV R1.w, R0.y
02 EXP_DONE: PIX0, R1 BRSTCNT(2)
As you can see, there is no iadd for 0x80000000. I suspect the reason is that the bit pattern 0x80000000, read as a float, is -0.0f, so CAL CL decided to remove the "unnecessary" addition of "zero", wrongly treating an integer addition as if it were a float one.
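To illustrate the suspicion above, here is a minimal sketch in Python. It is not from the thread and says nothing about how the CAL compiler is actually implemented. The helper names `bits_to_float` and `add_u32` are made up for this example. It only shows that 0x80000000 reinterpreted as a float is negative zero, that adding negative zero is a no-op for floats, and that adding the same bit pattern as a 32-bit integer is not.

```python
import math
import struct

def bits_to_float(bits: int) -> float:
    """Reinterpret a 32-bit pattern as an IEEE 754 single-precision float."""
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFFFFFF))[0]

def add_u32(a: int, b: int) -> int:
    """32-bit integer addition with wraparound (what an integer add computes)."""
    return (a + b) & 0xFFFFFFFF

INT_MIN_BITS = 0x80000000

# Read as a float, the constant is negative zero.
as_float = bits_to_float(INT_MIN_BITS)
is_negative_zero = as_float == 0.0 and math.copysign(1.0, as_float) < 0

# Float add: x + (-0.0) == x, so dropping the add would be harmless.
float_sum = 5.0 + as_float

# Integer add: the top bit is flipped, so dropping the add changes the result.
int_sum = add_u32(5, INT_MIN_BITS)

print(is_negative_zero)   # True
print(float_sum)          # 5.0
print(hex(int_sum))       # 0x80000005

# The other literals in the disassembly, read the same way:
print(bits_to_float(0x80000001))              # negative smallest denormal
print(math.isnan(bits_to_float(0x7FFFFFFF)))  # True
```

The float readings match what the disassembler prints next to each hex literal, apart from the sign of the zero, which the disassembly shows as 0.0f.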
This has been fixed, and the fix should be in 1.4.
OK, thanks, I'll be waiting for 1.4 then.
Is there a known way to work around this bug? It's really annoying.
I just installed the 1.4 SDK and... the bug is still there :/
That doesn't look good at all...
I just realized that updating the SDK changes nothing, because all the compiler logic lives in DLLs that are only updated with the Catalyst driver. So as long as Catalyst is still at 9.2, nothing will change.
Also, since Stream doesn't look like a top priority for ATI/AMD, CAL compiler bug fixes alone aren't enough to trigger a Catalyst driver update. So we need to wait for some other major Catalyst driver update to see any change.
Am I right?
The SDK and CAL are no longer directly connected. This was done to make all graphics cards CAL-ready, so that people could develop applications and have them run on machines with Radeons without requiring users to download the SDK.
The downside is that the SDK and CAL move at different speeds. The SDK is updated quarterly, while the driver is updated monthly, but it follows the driver development cycle, which used to be explained here: http://www.phoronix.com/vr.php?view=10083. The gist is that it takes three months for a feature or bug fix to go from implementation through testing to release. This bug was fixed last month, so the fix should be public in the next one or two driver releases.
I'd really prefer to have the most up-to-date compiler at all times rather than a single driver distribution, as I'm not using runtime calcl* calls anyway: all my kernels are precompiled to ELF binaries. But it doesn't look like that's an option.
Anyway, thanks for the reply.