The following seems to be a bug in the gcn compiler, starting with catalyst 12.9, and does not occur on other architectures like Cayman and Barts. The test program fills an array with some data and then reads it back in reverse order. To do this, the compiler computes the offset (63 - gx) to pointer pB where gx >= 0 && < 64. The problem occurs because the new compiler uses the constant value 63 as an offset in the ds_read_b32 instruction, which then causes the register address of ds_read_b32 to be negative!. The old compiler does not use the ds_read_b32 offset option. Several similar expressions cause the same problem. Unfortunately this is a pretty common 'C' expression.....
result = *(pB + (63 - gx));
Just a comment, gcn is still relatively new and being optimized so bugs can happen however, the compiler is often beyond amazing in the code it produces and getting faster. Go compiler team.
__kernel void localpointer_rw(global int *in, global int *out)
int i, gx = get_global_id(0); //gx goes from 0 to 63
local int buf,*pB;
buf[gx] = gx; //assign 64 numbers to all buf[n] locations
pB = &buf; //make a pointer
i = *(pB+(63-gx)); //ERROR! reads only when gx == 0 !!
out[gx] = i;
Dissassembly from (good) old compiler
v_sub_i32 v1, vcc, 0, v0 <------ start:gx in v0; gx = -gx;
v_lshlrev_b32 v1, 2, v1 mmultiply by 4
v_add_i32 v1, vcc, 0x000000fc, v1 add constant offset 252
v_lshlrev_b32 v2, 2, v0
ds_write_b32 v2, v0
ds_read_b32 v0, v1 read
Dissassembly of newer code
v_lshlrev_b32 v1, 2, v0
v_sub_i32 v2, vcc, 0, v0 <------ start:gx in v0; gx = -gx;
ds_write_b32 v1, v0
v_lshlrev_b32 v0, 2, v2 multiply by 4
ds_read_b32 v0, v0 offset:252 read from offset 252 with negative address register