calCL ignoring 0x80000000 constant when adding?

iadd reg,0x80000000 == nothing

I have AMD IL code that looks like this:

dcl_literal l12,0x80000000,0x80000000,0x80000000,0x80000000

...
mov r45.x,l12.x

...
iadd r34.x,r34.x,r45.x

If the constant passed to iadd is 0x80000000, calCL simply ignores it: no ADD_INT instruction is generated in the compiled code. When I change the constant to anything else (such as 0x7fffffff or 0x80000001), everything works.

I started to think that I had missed something in the declaration of the constant (signed/unsigned), but ixor r34.x,r34.x,r45.x works fine, with a correct XOR_INT instruction generated.

Is this a calCL compiler/optimizer bug, or have I missed something, not being very familiar with AMD IL yet?

 


empty_knapsack, 

If it is possible, can you email a test shader showing this to streamdeveloper@amd.com and have them forward it to me so I can verify it and work on getting it fixed?

 

Thanks,


If it is possible, can you email a test shader showing this to streamdeveloper@amd.com



Done.

In fact, the shader is simple enough to just post here:

il_ps_2_0
dcl_input_position_interp(linear_noperspective) vWinCoord0.xy__
dcl_output_generic o0
dcl_output_generic o1
dcl_output_generic o2
dcl_cb cb0[4]
dcl_resource_id(0)_type(2d,unnorm)_fmtx(float)_fmty(float)_fmtz(float)_fmtw(float)
dcl_resource_id(1)_type(2d,unnorm)_fmtx(float)_fmty(float)_fmtz(float)_fmtw(float)
dcl_resource_id(2)_type(2d,unnorm)_fmtx(float)_fmty(float)_fmtz(float)_fmtw(float)
sample_resource(0)_sampler(0) r0, vWinCoord0.xyxx
sample_resource(1)_sampler(0) r1, vWinCoord0.xyxx
sample_resource(2)_sampler(0) r2, vWinCoord0.xyxx
dcl_literal l1,0x7fffffff,0x7fffffff,0x7fffffff,0x7fffffff
dcl_literal l2,0x80000000,0x80000000,0x80000000,0x80000000
dcl_literal l3,0x80000001,0x80000001,0x80000001,0x80000001

iadd r10.x,r0.x,l1.x
ixor r10.x,r10.x,l2.x

iadd r11.x,r1.x,l2.x
ixor r11.x,r11.x,l3.x

iadd r12.x,r2.x,l3.x
ixor r12.x,r12.x,l1.x

mov o0,r10
mov o1,r11
mov o2,r12
end

There is no add with 0x80000000 in the compiled output, only 2 adds and 3 xors.
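For checking results on the host, here is a minimal reference model of the three iadd/ixor pairs in the shader above. It is only a sketch: it assumes iadd is a wrapping 32-bit integer add, and the helper names (iadd, ixor, expected_outputs, buggy_o1) are made up for illustration, not part of CAL or IL.

```python
MASK32 = 0xFFFFFFFF  # assumption: IL integer ops act on 32-bit lanes

# The three literals declared in the shader (l1, l2, l3).
L1 = 0x7FFFFFFF
L2 = 0x80000000
L3 = 0x80000001


def iadd(a, b):
    """Wrapping 32-bit integer add (the behaviour expected from iadd)."""
    return (a + b) & MASK32


def ixor(a, b):
    """32-bit bitwise XOR."""
    return (a ^ b) & MASK32


def expected_outputs(r0, r1, r2):
    """Expected (o0.x, o1.x, o2.x) for three sampled 32-bit values."""
    o0 = ixor(iadd(r0, L1), L2)
    o1 = ixor(iadd(r1, L2), L3)
    o2 = ixor(iadd(r2, L3), L1)
    return o0, o1, o2


def buggy_o1(r1):
    """o1.x if the iadd with 0x80000000 is dropped, leaving only the XOR."""
    return ixor(r1, L3)


if __name__ == "__main__":
    o0, o1, o2 = expected_outputs(0, 0, 0)
    print(hex(o0), hex(o1), hex(o2), hex(buggy_o1(0)))
```

Comparing o1 from the GPU against expected_outputs and buggy_o1 for the same input shows whether the add was actually executed.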


What is the range of values you're sampling? Also, what result is placed into the output stream o0? Is it just the value you sampled XORed with INT_MIN+1? Finally, do you know the addition isn't taking place because your result isn't what you expected, or did you disassemble the kernel?


Originally posted by: rick.weber What is the range of values you're sampling? Also, what result is placed into the output stream o0? Is it just the value you sampled XORed with INT_MIN+1? Finally, do you know the addition isn't taking place because your result isn't what you expected, or did you disassemble the kernel?

I disassembled the kernel after I got wrong results on several sample data sets. It's actually easy to see how this kernel gets compiled using Stream KernelAnalyzer: just copy and paste the kernel there. The results (for RV770) are:

; --------  Disassembly --------------------
00 TEX: ADDR(64) CNT(3) VALID_PIX
      0  SAMPLE R1.x___, R0.xyxx, t0, s0  UNNORM(XYZW)
      1  SAMPLE R2.x___, R0.xyxx, t2, s0  UNNORM(XYZW)
      2  SAMPLE R0.x___, R0.xyxx, t1, s0  UNNORM(XYZW)
01 ALU: ADDR(32) CNT(19)
      3  x: ADD_INT     ____,  R2.x,  (0x80000001, -1.401298464e-45f).x     
         y: ADD_INT     ____,  (0x7FFFFFFF, 1.#QNANf).y,  R1.x     
         z: XOR_INT     R0.z,  R0.x,  (0x80000001, -1.401298464e-45f).x      VEC_201
      4  x: XOR_INT     R0.x,  (0x7FFFFFFF, 1.#QNANf).x,  PV3.x     
         w: XOR_INT     R0.w,  PV3.y,  (0x80000000, 0.0f).y     
      5  x: MOV         R3.x,  PV4.x     
         y: MOV         R3.y,  R0.y     
         z: MOV         R3.z,  R0.y     
         w: MOV         R3.w,  R0.y     
      6  x: MOV         R2.x,  R0.z     
         y: MOV         R2.y,  R0.y     
         z: MOV         R2.z,  R0.y     
         w: MOV         R2.w,  R0.y     
      7  x: MOV         R1.x,  R0.w     
         y: MOV         R1.y,  R0.y     
         z: MOV         R1.z,  R0.y     
         w: MOV         R1.w,  R0.y     
02 EXP_DONE: PIX0, R1  BRSTCNT(2)
END_OF_PROGRAM

 

As you can see, there is no ADD_INT for 0x80000000. I suspect the reason is that the bit pattern 0x80000000, read as a float, is -0.0f, so CAL CL decided to remove the "unnecessary" addition of "zero", wrongly assuming that integer addition behaves like float addition.
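The suspicion above can be illustrated on the host. The sketch below reinterprets the bit pattern 0x80000000 as an IEEE 754 single-precision float and contrasts float addition with integer addition. The helper names are hypothetical, and it says nothing about what the optimizer actually does internally; it only shows why the two operations must not be confused.

```python
import math
import struct


def bits_to_float(bits):
    """Reinterpret a 32-bit pattern as an IEEE 754 single-precision float."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def float_to_bits(value):
    """Return the 32-bit pattern of a single-precision float."""
    return struct.unpack("<I", struct.pack("<f", value))[0]


neg_zero = bits_to_float(0x80000000)
# The pattern is a zero with the sign bit set, i.e. -0.0.
is_negative_zero = (neg_zero == 0.0) and (math.copysign(1.0, neg_zero) == -1.0)

x = 1.5
# Float addition: adding -0.0 leaves x unchanged, so it can be folded away.
float_sum = x + neg_zero

# Integer addition on the same bits: adding 0x80000000 flips the top bit,
# so dropping the add changes the result.
x_bits = float_to_bits(x)
int_sum = (x_bits + 0x80000000) & 0xFFFFFFFF

if __name__ == "__main__":
    print(is_negative_zero, float_sum, hex(x_bits), hex(int_sum))
```

This matches the disassembly, which prints the literal as (0x80000000, 0.0f).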


This has been fixed and should be in 1.4


OK, thanks, I'll be waiting for 1.4 then.


Is there a defined way of circumventing this bug? It's really annoying.
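One possible workaround, not confirmed by AMD in this thread and untested against calCL: in wrapping 32-bit arithmetic, adding 0x80000000 only flips the top bit, which is exactly what XOR with 0x80000000 does. Since ixor with that literal is reported above to compile correctly, a line such as iadd r11.x,r1.x,l2.x could in principle be written as ixor r11.x,r1.x,l2.x. The sketch below only checks the arithmetic identity on the host; the function names are illustrative.

```python
MASK32 = 0xFFFFFFFF
SIGN_BIT = 0x80000000


def iadd_sign_bit(x):
    """Wrapping 32-bit add of 0x80000000."""
    return (x + SIGN_BIT) & MASK32


def ixor_sign_bit(x):
    """32-bit XOR with 0x80000000."""
    return (x ^ SIGN_BIT) & MASK32


# A few edge cases plus an arbitrary value.
samples = [0, 1, 0x7FFFFFFF, 0x80000000, 0x80000001, 0xFFFFFFFF, 0x12345678]
all_match = all(iadd_sign_bit(x) == ixor_sign_bit(x) for x in samples)

if __name__ == "__main__":
    print(all_match)
```

The identity holds because the carry out of bit 31 is discarded, so no lower bit is ever affected by the add.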


I've just installed the 1.4 SDK and... the bug is still there :/.

That doesn't look good at all...


I've just realized that updating the SDK changes nothing, since all the compiler logic lives in DLLs that are only updated when the Catalyst driver is updated. So as long as Catalyst is still at 9.2, nothing will change.

Also, since Stream doesn't look like a top priority for ATI/AMD, CAL compiler bug fixes alone aren't enough to trigger a Catalyst driver update. So we need to wait for some other major Catalyst driver update to see any SDK change.

Am I right?


The SDK and CAL are no longer directly connected. This was done to make all graphics cards CAL-ready, so that people could develop applications and have them run on machines with Radeons without requiring users to download the SDK.

The downside is that the SDK and CAL move at different speeds. The SDK is updated quarterly, while the driver is updated monthly, but it follows the driver development cycle, which used to be explained here: http://www.phoronix.com/vr.php?view=10083. Basically, it takes three months for a feature or bug fix to go from implementation through testing to release. This bug was fixed last month, so it should be public in the next one or two driver releases.


/sigh

 

I'd really prefer to have the most up-to-date compiler at all times rather than a single driver distribution, as I'm not using runtime calcl* calls anyway: all kernels are precompiled to ELF binaries. But it doesn't look like there's an option.

 

Anyway, thanks for the reply.
