AnsweredAssumed Answered

Gcn compiler bug, lds pointer arithmetic?

Question asked by drallan on Dec 15, 2012
Latest reply on Dec 19, 2012 by drallan

The following seems to be a bug in the gcn compiler, starting with catalyst 12.9, and does not occur on other architectures like Cayman and Barts. The test program fills an array[64] with some data and then reads it back in reverse order. To do this, the compiler computes the offset (63 - gx) to pointer pB where gx >= 0 && < 64. The problem occurs because the new compiler uses the constant value 63 as an offset in the ds_read_b32 instruction, which then causes the register address of ds_read_b32 to be negative!. The old compiler does not use the ds_read_b32 offset option. Several similar expressions cause the same problem.  Unfortunately this is a pretty common 'C' expression.....

 

                 result = *(pB + (63 - gx));

 

Just a comment, gcn is still relatively new and being optimized so bugs can happen however, the compiler is often beyond amazing in the code it produces and getting faster. Go compiler team.

 

__kernel void localpointer_rw(global int *in, global int *out)

{

    int i, gx = get_global_id(0);   //gx goes from 0 to 63

    local int buf[64],*pB;

 

     buf[gx] = gx;            //assign 64 numbers to all buf[n] locations

     pB = &buf[0];            //make a pointer

     i = *(pB+(63-gx));       //ERROR! reads only when gx == 0 !!

     out[gx] = i;

}

 

Dissassembly from (good) old compiler

  v_sub_i32      v1, vcc, 0, v0   <------  start:gx in v0; gx = -gx;

  v_lshlrev_b32  v1, 2, v1                 mmultiply by 4

  v_add_i32      v1, vcc, 0x000000fc, v1   add constant offset 252

  v_lshlrev_b32  v2, 2, v0

  ds_write_b32   v2, v0

  s_waitcnt      lgkmcnt(0)

  ds_read_b32    v0, v1                    read

 

Dissassembly of newer code

   v_lshlrev_b32  v1, 2, v0

  v_sub_i32      v2, vcc, 0, v0   <------   start:gx in v0; gx = -gx;

  ds_write_b32   v1, v0

  v_lshlrev_b32  v0, 2, v2                  multiply by 4

  s_waitcnt     lgmkcnt(0)                 

  ds_read_b32    v0, v0 offset:252          read from offset 252 with negative address register

Outcomes