Using mad() for vector and scalar types?

Discussion created by Bdot on Dec 16, 2011
Latest reply on Dec 16, 2011 by Bdot

Hi,

I'm using APP SDK 2.5 and Catalyst 11.11.

In my code I had a line like this:

 

    uint4 t, tmp;
    uint u;
    ...
    tmp = t * 4620 + u;

I figured my kernel would run a bit faster if I combined the calculation using mad(). However, I could not get it to compile, as t is a vector and the other two operands are scalars:

    tmp = mad(t, 4620u, u);

    .\barrett.cl(1631): error: no instance of overloaded function "mad" matches the argument list
        argument types are: (uint4, uint, uint)
        tmp = mad(t, 4620u, u);

So I tried converting the uints to uint4 (I'm not sure whether, even if it worked, it would still be faster than the original line):

    tmp = mad(t, convert_uint4(4620u), convert_uint4(u));

    .\barrett.cl(1631): error: more than one instance of overloaded function "convert_uint4" matches the argument list:
        function "convert_uint4(char4) C++"
        function "convert_uint4(uchar4) C++"
        function "convert_uint4(short4) C++"
        function "convert_uint4(ushort4) C++"
        function "convert_uint4(int4) C++"
        function "convert_uint4(uint4) C++"
        function "convert_uint4(long4) C++"
        function "convert_uint4(ulong4) C++"
        function "convert_uint4(float4) C++"
        function "convert_uint4(double4) C++"
        argument types are: (uint)
        tmp = mad(t, convert_uint4(4620u), convert_uint4(u));
        ^

    .\barrett.cl(1631): error: more than one instance of overloaded function "convert_uint4" matches the argument list:
        function "convert_uint4(char4) C++"
        function "convert_uint4(uchar4) C++"
        function "convert_uint4(short4) C++"
        function "convert_uint4(ushort4) C++"
        function "convert_uint4(int4) C++"
        function "convert_uint4(uint4) C++"
        function "convert_uint4(long4) C++"
        function "convert_uint4(ulong4) C++"
        function "convert_uint4(float4) C++"
        function "convert_uint4(double4) C++"
        argument types are: (uint)
        tmp = mad(t, convert_uint4(4620u), convert_uint4(u));
        ^

    .\barrett.cl(1631): error: no instance of overloaded function "mad" matches the argument list
        argument types are: (uint4, <error-type>, <error-type>)
        tmp = mad(t, convert_uint4(4620u), convert_uint4(u));
        ^

Can someone please advise how I can get the desired mad() without adding anything that would consume additional cycles? Why is a scalar auto-expanded in a multiplication with a vector (and also within mul_hi(), for instance), but not when used in mad()?

Thanks,
Bdot
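
For reference, the component-wise arithmetic being asked for can be checked on the host. The sketch below is plain C, not OpenCL C: uint4_host, splat() and mul_add() are hypothetical names introduced only for illustration. splat() stands in for the OpenCL C vector literal (uint4)(x), which replicates a scalar into every component, whereas convert_uint4() expects a 4-component vector argument, which is consistent with the overload list in the errors above. As far as the OpenCL 1.1 specification goes, mad() is listed among the floating-point math built-ins, with mad24(), mad_hi() and mad_sat() as the integer counterparts, so broadcasting alone may not be enough to make the call compile. Nothing here says whether an explicit broadcast costs extra cycles on a given device.

```c
#include <stdint.h>

/* Hypothetical host-side stand-in for OpenCL's built-in uint4 type. */
typedef struct { uint32_t s[4]; } uint4_host;

/* Replicate a scalar into all four components,
   mirroring what (uint4)(x) does in OpenCL C. */
uint4_host splat(uint32_t x) {
    uint4_host v = {{x, x, x, x}};
    return v;
}

/* Component-wise a * b + c with 32-bit unsigned wraparound,
   i.e. what "t * 4620 + u" computes per component after the
   scalars have been expanded. */
uint4_host mul_add(uint4_host a, uint4_host b, uint4_host c) {
    uint4_host r;
    for (int i = 0; i < 4; ++i) {
        r.s[i] = a.s[i] * b.s[i] + c.s[i];
    }
    return r;
}
```

Calling mul_add(t, splat(4620u), splat(u)) gives, per component, the same value as t.s[i] * 4620u + u, which is the result the original line produces once the scalars are expanded.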