Bdot
Adept III

Using mad() for vector and scalar types?

Hi,

I'm using APP SDK 2.5 and Catalyst 11.11.

In my code I had a line like this:

 

    uint4 t, tmp;
    uint u;
    ...
    tmp = t * 4620 + u;

I figured my kernel would run a bit faster if I combined the calculation using mad(). However, I did not manage to get it to compile, as t is a vector and the other two operands are scalars.

    tmp = mad(t, 4620u, u);

.\barrett.cl(1631): error: no instance of overloaded function "mad" matches
          the argument list
            argument types are: (uint4, uint, uint)
    tmp = mad(t, 4620u, u);

So I tried converting the uints to uint4 (not sure whether, if it worked, it would still be faster than the original line):

    tmp = mad(t, convert_uint4(4620u), convert_uint4(u));

.\barrett.cl(1631): error: more than one instance of overloaded function
          "convert_uint4" matches the argument list:
            function "convert_uint4(char4) C++"
            function "convert_uint4(uchar4) C++"
            function "convert_uint4(short4) C++"
            function "convert_uint4(ushort4) C++"
            function "convert_uint4(int4) C++"
            function "convert_uint4(uint4) C++"
            function "convert_uint4(long4) C++"
            function "convert_uint4(ulong4) C++"
            function "convert_uint4(float4) C++"
            function "convert_uint4(double4) C++"
            argument types are: (uint)
    tmp = mad(t, convert_uint4(4620u), convert_uint4(u));
                 ^

.\barrett.cl(1631): error: more than one instance of overloaded function
          "convert_uint4" matches the argument list:
            function "convert_uint4(char4) C++"
            function "convert_uint4(uchar4) C++"
            function "convert_uint4(short4) C++"
            function "convert_uint4(ushort4) C++"
            function "convert_uint4(int4) C++"
            function "convert_uint4(uint4) C++"
            function "convert_uint4(long4) C++"
            function "convert_uint4(ulong4) C++"
            function "convert_uint4(float4) C++"
            function "convert_uint4(double4) C++"
            argument types are: (uint)
    tmp = mad(t, convert_uint4(4620u), convert_uint4(u));
                                       ^

.\barrett.cl(1631): error: no instance of overloaded function "mad" matches
          the argument list
            argument types are: (uint4, <error-type>, <error-type>)
    tmp = mad(t, convert_uint4(4620u), convert_uint4(u));
          ^

Can someone please advise how I can get the desired mad() without adding anything that would consume additional cycles? Why is a scalar auto-expanded in a multiplication with a vector (and also within mul_hi(), for instance), but not when used in mad()?

Thanks,
Bdot

2 Replies
nou
Exemplar

This should work:

mad(t, (uint4)(4620u), (uint4)(u));


But I am not sure whether the GPU can perform a MAD operation on int.


You are right (with both remarks):

 

.\barrett.cl(1631): error: no instance of overloaded function "mad" matches
          the argument list
            argument types are: (uint4, uint4, uint4)
    tmp = mad(t, (uint4)4620u, (uint4)u);

 

The simple cast creates the correct type (where one is needed), but mad() is only available for floating-point types. This also explains why the scalar is auto-expanded in mul_hi(): mul_hi() is defined for integer types.

 

I had seen the integer function mad_hi() and simply concluded there was also an integer mad(). I wish OpenCL would start closing the holes in the instruction set (mul24_hi and an integer mad are two examples I'm already missing).
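
For anyone landing here later: since mad() is floating-point only, the integer route in the kernel stays the original expression, t * 4620u + u. OpenCL also has the integer built-in mad24(), but it is only specified for operands that fit in 24 bits. Whether either form ends up as a single hardware instruction depends on the device and the compiler. Below is a minimal host-side sketch in plain C (not kernel code) of what the lane-wise computation does, which can be handy for checking kernel output. The names uint4_model, mul_add_u32x4 and fits_mad24 are made up for this illustration and are not part of OpenCL.

```c
#include <stdint.h>

/* Hypothetical host-side stand-in for OpenCL's uint4 (four 32-bit lanes). */
typedef struct { uint32_t s[4]; } uint4_model;

/* Lane-wise t * m + a with both scalars broadcast to every lane.
   The arithmetic wraps modulo 2^32, like unsigned 32-bit math in a kernel. */
uint4_model mul_add_u32x4(uint4_model t, uint32_t m, uint32_t a)
{
    uint4_model r;
    for (int i = 0; i < 4; ++i)
        r.s[i] = t.s[i] * m + a;
    return r;
}

/* Returns 1 when the multiplier and every lane of t fit in 24 bits,
   i.e. the range in which unsigned mad24() is specified; 0 otherwise. */
int fits_mad24(uint4_model t, uint32_t m)
{
    const uint32_t limit = 0x00FFFFFFu;
    if (m > limit)
        return 0;
    for (int i = 0; i < 4; ++i)
        if (t.s[i] > limit)
            return 0;
    return 1;
}
```

If fits_mad24() would hold for all of your inputs, mad24(t, (uint4)(4620u), (uint4)(u)) is a candidate worth benchmarking in the kernel; otherwise stick with the plain multiply and add.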
