2 Replies Latest reply on Dec 16, 2011 5:30 PM by Bdot

    Using mad() for vector and scalar types?

    Bdot

      Hi,

      I'm using APP SDK 2.5 and Catalyst 11.11.

      In my code I had a line like this:

       

      uint4 t, tmp;
      uint u;
      ...
      tmp = t * 4620 + u;

      I figured my kernel would run a bit faster if I combined the calculation using mad(). However, I did not manage to get it working, as t is a vector and the other two operands are scalars.

      tmp = mad(t, 4620u, u);

      .\barrett.cl(1631): error: no instance of overloaded function "mad" matches
                the argument list
                  argument types are: (uint4, uint, uint)
          tmp = mad(t, 4620u, u);

      So I tried converting the uints to uint4 (not sure whether, if it works, it would still be faster than the original line):

      tmp = mad(t, convert_uint4(4620u), convert_uint4(u));

      .\barrett.cl(1631): error: more than one instance of overloaded function
                "convert_uint4" matches the argument list:
                  function "convert_uint4(char4) C++"
                  function "convert_uint4(uchar4) C++"
                  function "convert_uint4(short4) C++"
                  function "convert_uint4(ushort4) C++"
                  function "convert_uint4(int4) C++"
                  function "convert_uint4(uint4) C++"
                  function "convert_uint4(long4) C++"
                  function "convert_uint4(ulong4) C++"
                  function "convert_uint4(float4) C++"
                  function "convert_uint4(double4) C++"
                  argument types are: (uint)
          tmp = mad(t, convert_uint4(4620u), convert_uint4(u));

      .\barrett.cl(1631): error: more than one instance of overloaded function
                "convert_uint4" matches the argument list:
                  function "convert_uint4(char4) C++"
                  function "convert_uint4(uchar4) C++"
                  function "convert_uint4(short4) C++"
                  function "convert_uint4(ushort4) C++"
                  function "convert_uint4(int4) C++"
                  function "convert_uint4(uint4) C++"
                  function "convert_uint4(long4) C++"
                  function "convert_uint4(ulong4) C++"
                  function "convert_uint4(float4) C++"
                  function "convert_uint4(double4) C++"
                  argument types are: (uint)
          tmp = mad(t, convert_uint4(4620u), convert_uint4(u));

      .\barrett.cl(1631): error: no instance of overloaded function "mad" matches
                the argument list
                  argument types are: (uint4, <error-type>, <error-type>)
          tmp = mad(t, convert_uint4(4620u), convert_uint4(u));

      Can someone please advise how I can get to the desired mad() without adding anything that would consume additional cycles? Why is a scalar auto-expanded in a multiplication with a vector (and also within mul_hi(), for instance), but not when used in mad()?

      Thanks,
      Bdot

        • Using mad() for vector and scalar types?
          nou

          This should work:

          mad(t, (uint4)(4620u), (uint4)(u));

          But I am not sure whether the GPU can perform a MAD operation on integers.

            • Using mad() for vector and scalar types?
              Bdot

              You are right (with both remarks):

               

              .\barrett.cl(1631): error: no instance of overloaded function "mad" matches
                        the argument list
                          argument types are: (uint4, uint4, uint4)
                  tmp = mad(t, (uint4)4620u, (uint4)u);

               

              The simple cast would create the correct type (if needed), but mad() is only available for floating-point types. This also explains why a scalar is auto-expanded in mul_hi(): mul_hi() is defined for integer types.

               

              I had seen the integer function mad_hi and simply concluded that there was also an integer mad. I wish OpenCL would start closing the holes in its instruction set (mul24_hi and an integer mad are two examples I'm already missing).
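
For anyone who wants to check the integer arithmetic discussed in this thread on the host, here is a minimal sketch in plain C. It is only a model, not OpenCL code: the names uint4_model, mul_add_broadcast, mul_hi_model and mad_hi_model are made up for illustration, and it assumes 32-bit unsigned components with wraparound on overflow. In the kernel itself, the original expression `tmp = t * 4620 + u;` is the form that compiles.

```c
#include <stdint.h>

/* Hypothetical host-side stand-in for OpenCL's uint4 (not an OpenCL type). */
typedef struct { uint32_t x, y, z, w; } uint4_model;

/* Models the kernel expression `t * m + u`: the scalars m and u are
 * applied to every component, and results wrap modulo 2^32. */
static uint4_model mul_add_broadcast(uint4_model t, uint32_t m, uint32_t u) {
    uint4_model r = { t.x * m + u, t.y * m + u, t.z * m + u, t.w * m + u };
    return r;
}

/* Models mul_hi(a, b) for uint: the upper 32 bits of the 64-bit product. */
static uint32_t mul_hi_model(uint32_t a, uint32_t b) {
    return (uint32_t)(((uint64_t)a * b) >> 32);
}

/* Models mad_hi(a, b, c) for uint: mul_hi(a, b) + c, wrapping modulo 2^32. */
static uint32_t mad_hi_model(uint32_t a, uint32_t b, uint32_t c) {
    return mul_hi_model(a, b) + c;
}
```

Note that mad_hi only adds to the high half of the product, so it is not a drop-in replacement for `t * 4620 + u`, which keeps the low 32 bits.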