meriken
Adept III

GCN Assembly: Replacing addr64 Specifier in MUBUF Instructions for GCN 1.2

I was trying to port my GCN 1.0/1.1 code to GCN 1.2 and just realized that the ADDR64 specifier is no longer allowed for MUBUF instructions in GCN 1.2. Is there a simple way to rewrite a MUBUF instruction like this one for GCN 1.2?

buffer_load_dword v4, v[44:45], s[32:35], 0 addr64

3 Replies
meriken
Adept III

Re: GCN Assembly: Replacing addr64 Specifier in MUBUF Instructions for GCN 1.2

What I ended up doing was to rewrite a section of the code that contains the MUBUF instruction in question.

Example:

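/* with addr64 */
        v_add_i32       v44, vcc, s4, v44          /* v[44:45] += s[4:5]: low dword, carry out in vcc */
        v_mov_b32       v46, s5
        v_addc_u32      v45, vcc, v46, v45, vcc    /* high dword plus carry */
        buffer_load_dword v4, v[44:45], s[32:35], 0 addr64

/* without addr64 */
        v_mov_b32       v126, v44                  /* keep the old low dword of v[44:45] as the 32-bit offset */
        s_add_u32       s76, s32, s4               /* s[76:77] = s[32:33] + s[4:5]; s_add_u32 rather than s_add_i32, so that SCC holds the unsigned carry */
        s_addc_u32      s77, s33, s5               /* high dword plus carry from SCC */
        s_mov_b64       s[78:79], s[34:35]         /* copy the remaining two descriptor dwords unchanged */
        v_add_i32       v44, vcc, s4, v44          /* v[44:45] is updated exactly as before */
        v_mov_b32       v46, s5
        v_addc_u32      v45, vcc, v46, v45, vcc
        buffer_load_dword v4, v126, s[76:79], 0 offen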

Although the code runs fine, this solution feels rather hackish.

If somebody could share a better solution, I would really appreciate it.
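
For anyone who wants to sanity-check the idea, here is a rough Python model of the address arithmetic behind the workaround. It is only a sketch under stated assumptions: the function names are made up for illustration, the buffer base is taken to be the low 48 bits of the first two descriptor dwords, and range checking, soffset and swizzling are ignored. It suggests that folding s[4:5] into the descriptor base gives the same address as the addr64 form as long as the original offset fits in 32 bits and the low-dword add produces an unsigned carry (as s_add_u32 does).

```python
MASK32 = 0xFFFFFFFF

def add_u32(a, b, carry_in=0):
    """Model of an unsigned 32-bit add with carry: returns (result, carry_out)."""
    total = (a & MASK32) + (b & MASK32) + carry_in
    return total & MASK32, total >> 32

def base_of(desc_lo, desc_hi):
    """Assumed layout: 48-bit base = dword 0 plus the low 16 bits of dword 1."""
    return ((desc_hi & 0xFFFF) << 32) | (desc_lo & MASK32)

def rebase_descriptor(desc_lo, desc_hi, off_lo, off_hi):
    """Fold a 64-bit offset into the descriptor base (the s_add/s_addc pair).

    Note: if the sum overflows 48 bits, the carry would spill into the
    upper bits of dword 1, which this sketch does not guard against.
    """
    lo, carry = add_u32(desc_lo, off_lo)
    hi, _ = add_u32(desc_hi, off_hi, carry)
    return lo, hi

def addr64_address(desc_lo, desc_hi, vaddr64, inst_offset=0):
    """addr64 form: base + 64-bit VGPR address + immediate offset."""
    return base_of(desc_lo, desc_hi) + vaddr64 + inst_offset

def offen_address(desc_lo, desc_hi, voffset32, inst_offset=0):
    """offen form: base + 32-bit VGPR offset + immediate offset."""
    return base_of(desc_lo, desc_hi) + (voffset32 & MASK32) + inst_offset

# Example values (hypothetical): the low-dword add carries into the high dword.
s32, s33 = 0xFFFF0000, 0x00000001      # descriptor dwords 0 and 1
s4, s5 = 0x00020000, 0x00000000        # 64-bit scalar offset
v_orig = 0x40                          # original v[44:45], fits in 32 bits

s76, s77 = rebase_descriptor(s32, s33, s4, s5)
with_addr64 = addr64_address(s32, s33, v_orig + ((s5 << 32) | s4))
without_addr64 = offen_address(s76, s77, v_orig)
assert with_addr64 == without_addr64
```

If the original v[44:45] does not fit in 32 bits, the two forms diverge in this model, because only the low dword ends up in v126.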

realhet
Miniboss

Re: GCN Assembly: Replacing addr64 Specifier in MUBUF Instructions for GCN 1.2

What about bit #21 of the second DWORD of the MUBUF instruction? It is unused on all GCN versions. Maybe they moved the ADDR64 bit there on GCN 1.2...

meriken
Adept III

Re: GCN Assembly: Replacing addr64 Specifier in MUBUF Instructions for GCN 1.2

I wish AMD had kept 64-bit addressing in MUBUF instructions like that... What I see in disassembled code for GCN 1.2 is that the drivers generate FLAT_* instructions instead of MUBUF instructions. On a related note, I just hope future AMD drivers will stay compatible with 32-bit OpenCL binaries, as 32-bit addressing makes things a whole lot simpler.
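
To illustrate what the switch to FLAT_* means for address handling: a FLAT instruction takes the complete 64-bit address in a VGPR pair and uses no buffer resource descriptor, so the base has to be added to the offset beforehand. A minimal Python sketch of that computation follows; the function name is hypothetical, and the 48-bit base layout of the descriptor is the same assumption as before.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF

def flat_vgpr_pair(desc_lo, desc_hi, offset64):
    """Return the (low, high) dwords a FLAT load would need in its VGPR pair.

    Assumes the buffer base is dword 0 plus the low 16 bits of dword 1;
    the upper 16 bits of dword 1 are masked off before the add.
    """
    base = ((desc_hi & 0xFFFF) << 32) | (desc_lo & MASK32)
    addr = (base + offset64) & MASK64
    return addr & MASK32, addr >> 32

# Hypothetical values: the upper 16 bits of dword 1 are set and must be ignored.
lo, hi = flat_vgpr_pair(0xFFFF0000, 0x00400001, 0x00020040)
assert (lo, hi) == (0x00010040, 0x00000002)
```

In assembly this corresponds to masking the high descriptor dword and doing a 64-bit add into the VGPR pair before issuing the FLAT load.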
