I've been trying to write GCN ISA assembly code by hand and I just can't get the "DS_" instructions to work.
The docs say that the address shouldn't be the same in all threads, because that causes conflicts.
I tried loading the global id (multiplied by 4) into the address register so that there is no conflict, but it didn't work.
I also initialized m0, as the docs say.
Here is what I tried:
; Initially v2 contains the global id
v_mov_b32 v6, v2
v_mul_i32_i24 v6, 4, v6
v_mov_b32 v7, 99
v_mov_b32 v8, 0
s_mov_b32 m0, 0xFFFFFFFF
; ds_write's operands: (vdst) (addr) (data0) (data1)
; v5 is just a placeholder; I don't think it is actually used
v_mov_b32 v5, 0
ds_write_b32 v5, v6, v7, v7
ds_read_b32 v8, v6, v5, v5
I tried many variations of the above code, but in the end v8 always remains 0.
Does anybody know what I'm doing wrong?
Thanks in advance!