6 Replies Latest reply on Dec 19, 2012 10:27 PM by drallan

    Gcn compiler bug, lds pointer arithmetic?

    drallan

      The following seems to be a bug in the gcn compiler, starting with catalyst 12.9, and does not occur on other architectures like Cayman and Barts. The test program fills an array[64] with some data and then reads it back in reverse order. To do this, the compiler computes the offset (63 - gx) to pointer pB where gx >= 0 && < 64. The problem occurs because the new compiler uses the constant value 63 as an offset in the ds_read_b32 instruction, which then causes the register address of ds_read_b32 to be negative!. The old compiler does not use the ds_read_b32 offset option. Several similar expressions cause the same problem.  Unfortunately this is a pretty common 'C' expression.....

       

                       result = *(pB + (63 - gx));

       

      Just a comment, gcn is still relatively new and being optimized so bugs can happen however, the compiler is often beyond amazing in the code it produces and getting faster. Go compiler team.

       

      __kernel void localpointer_rw(global int *in, global int *out)

      {

          int i, gx = get_global_id(0);   //gx goes from 0 to 63

          local int buf[64],*pB;

       

           buf[gx] = gx;            //assign 64 numbers to all buf[n] locations

           pB = &buf[0];            //make a pointer

           i = *(pB+(63-gx));       //ERROR! reads only when gx == 0 !!

           out[gx] = i;

      }

       

      Dissassembly from (good) old compiler

        v_sub_i32      v1, vcc, 0, v0   <------  start:gx in v0; gx = -gx;

        v_lshlrev_b32  v1, 2, v1                 mmultiply by 4

        v_add_i32      v1, vcc, 0x000000fc, v1   add constant offset 252

        v_lshlrev_b32  v2, 2, v0

        ds_write_b32   v2, v0

        s_waitcnt      lgkmcnt(0)

        ds_read_b32    v0, v1                    read

       

      Dissassembly of newer code

         v_lshlrev_b32  v1, 2, v0

        v_sub_i32      v2, vcc, 0, v0   <------   start:gx in v0; gx = -gx;

        ds_write_b32   v1, v0

        v_lshlrev_b32  v0, 2, v2                  multiply by 4

        s_waitcnt     lgmkcnt(0)                 

        ds_read_b32    v0, v0 offset:252          read from offset 252 with negative address register

        • Re: Gcn compiler bug, lds pointer arithmetic?
          binying

          Have you try it on the latest? It might be fixed!

            • Re: Gcn compiler bug, lds pointer arithmetic?
              drallan

              Hi Binying,

               

              Thanks for the reply.

               

              I have seen the failure on catalysts 12.8, 12.9, 12.10, 12.11beta8, and 12.11beta11.

               

              However, the problem can also depend on how or what you select when you install drivers. The combination of installing the video drivers and/or not the SDK runtime can make a difference, (at least as of 12.9), including the SDK, as in the default installation, will produce the error.

               

              I did confirm the error occurs with all amdocl(64).dll files up through 12.11beta11.

               

              Where is beta13 ?

               

              drallan

            • Re: Gcn compiler bug, lds pointer arithmetic?
              realhet

              Do you think ds_read will fail if it gets a negative address? Isa docs didn't mention it. Does it checks negative addr before adding the 16 bit offset?

              I think the same data and address vreg (v0, v0) could be wrong. At least with memory operations it is bad. But It's a weak argument since it reads only 1 dword...

               

              side note: Just accidentally found the src_lds_direct broadcasted vector parameter. In case we run out of S reg local constants If I only knew it earlier...

                • Re: Gcn compiler bug, lds pointer arithmetic?
                  drallan

                  realhet wrote:

                   

                  Do you think ds_read will fail if it gets a negative address? Isa docs didn't mention it. Does it checks negative addr before adding the 16 bit offset?

                  I think the same data and address vreg (v0, v0) could be wrong. At least with memory operations it is bad. But It's a weak argument since it reads only 1 dword...

                   

                  side note: Just accidentally found the src_lds_direct broadcasted vector parameter. In case we run out of S reg local constants If I only knew it earlier...

                   

                  Hi Realhet,

                   

                  Good question and not clear. I know the negative address produces the wrong answer = 0. The manual says OOR lds addresses will return the value 0, I don't know how the hardware makes the final address. I assume a negative address is interpreted as a large positive address and goes OOR.

                   

                  re: src_lds_direct broadcast. Yes, yet another well thought out gcn feature. I wonder how that compares to all threads reading the same address, which would be a bank conflict?