I was trying to port my code from GCN 1.0/1.1 to GCN 1.2 and just realized that the ADDR64 specifier is not allowed for MUBUF instructions in GCN 1.2. Is there a simple way to rewrite MUBUF instructions like this one for GCN 1.2?
buffer_load_dword v4, v[44:45], s[32:35], 0 addr64
What I ended up doing was to rewrite a section of the code that contains the MUBUF instruction in question.
Example:
/* with addr64 */
v_add_i32 v44, vcc, s4, v44
v_mov_b32 v46, s5
v_addc_u32 v45, vcc, v46, v45, vcc
buffer_load_dword v4, v[44:45], s[32:35], 0 addr64
/* without addr64 */
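v_mov_b32 v126, v44   /* keep the original low dword to use as the 32-bit offset */
s_add_u32 s76, s32, s4   /* fold s[4:5] into the base; s_add_u32 sets SCC to the unsigned carry that s_addc_u32 needs (s_add_i32 sets it to signed overflow) */
s_addc_u32 s77, s33, s5
s_mov_b64 s[78:79], s[34:35]   /* copy the rest of the resource descriptor unchanged */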
v_add_i32 v44, vcc, s4, v44
v_mov_b32 v46, s5
v_addc_u32 v45, vcc, v46, v45, vcc
buffer_load_dword v4, v126, s[76:79], 0 offen
Although the code runs fine, this solution feels rather hackish.
If somebody could share a better solution, I would really appreciate it.
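For anyone wondering when a rewrite like this gives the same address as the addr64 form, here is a rough Python model of the two computations. It is only a sketch under assumptions that are not stated in the post: the descriptor stride is 0, there is no swizzling, buffer range checking is ignored, and addresses wrap at 48 bits. The function names are made up for illustration.

```python
MASK32 = 0xFFFFFFFF
MASK48 = (1 << 48) - 1  # assumed width of the base address field


def add64_via_dwords(a_lo, a_hi, b_lo, b_hi):
    """64-bit add built from two 32-bit adds with an unsigned carry between them."""
    lo = a_lo + b_lo
    carry = lo >> 32  # unsigned carry out of the low dword
    hi = (a_hi + b_hi + carry) & MASK32
    return lo & MASK32, hi


def addr64_address(base, s_off, v_off):
    """addr64 form: the 64-bit VGPR pair holds v_off + s_off and is added to the base."""
    return (base + s_off + v_off) & MASK48


def offen_address(base, s_off, v_off):
    """Rewritten form: s_off is folded into the base, low dword of v_off is the offset."""
    base_lo, base_hi = add64_via_dwords(
        base & MASK32, base >> 32, s_off & MASK32, s_off >> 32
    )
    new_base = ((base_hi << 32) | base_lo) & MASK48
    return (new_base + (v_off & MASK32)) & MASK48


if __name__ == "__main__":
    base, s_off = 0x0001_FFFF_F000, 0x2000
    # High dword of the per-thread offset is zero: both forms agree.
    print(addr64_address(base, s_off, 0x40) == offen_address(base, s_off, 0x40))
    # High dword is non-zero: the rewritten form silently drops it.
    print(addr64_address(base, s_off, 0x1_0000_0040) == offen_address(base, s_off, 0x1_0000_0040))
```

In this model the two forms match only while the high dword of the per-thread offset (v45 before the add) is zero, and the base addition has to propagate the carry from the low dword into the high dword.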
What about bit #21 of the second MUBUF instruction DWORD? It is unused on all GCN versions. Maybe they moved the ADDR64 bit there on GCN 1.2...
I wish AMD had kept 64-bit addressing in MUBUF instructions... In the disassembled code I have seen for GCN 1.2, the drivers generate FLAT_* instructions instead of MUBUF instructions. On a related note, I hope future AMD drivers stay compatible with 32-bit OpenCL binaries, since 32-bit addressing makes things a whole lot simpler.