jclin
Journeyman III

How should I code in OpenCL for fewer load instructions?

I'm using AMD-APP (1214.3). My OpenCL code is as follows:

    // W is a uint4 variable
    uint4 T = (uint4)(1U, 2U, 3U, 4U);
    T += W;

After compilation, I saw that the IL uses several addition instructions just to build the constant uint4 vector:

    dcl_literal l16, 0x00000001, 0x00000001, 0x00000001, 0x00000001
    dcl_literal l19, 0x00000002, 0x00000002, 0x00000002, 0x00000002
    dcl_literal l18, 0x00000003, 0x00000003, 0x00000003, 0x00000003
    dcl_literal l17, 0x00000004, 0x00000004, 0x00000004, 0x00000004

        mov r66, l16
        iadd r66, r66.xyz0, l17.000x
        iadd r66, r66.xy0w, l18.00x0
        iadd r66, r66.x0zw, l19.0x00
        iadd r75, r75, r66

So, how should I write the OpenCL code to get fewer instructions? For example, a single literal load followed by one iadd, like the following:

    dcl_literal l16, 0x00000001, 0x00000002, 0x00000003, 0x00000004

        mov r66, l16
        iadd r75, r75, r66

Thanks

realhet
Miniboss

Re: How should I code in OpenCL for fewer load instructions?

Hi,

Check the ISA disassembly! I'm pretty sure the final ISA collapses those roundabout constant calculations into plain constants.

There can be many weird things in the IL, but the IL->ISA compiler is very good at finding the fastest way to do the math (unless you're using an HD4xxx).
