cancel
Showing results for
Did you mean:

# Archives Discussions Journeyman III

## Double Precision Exp Function HD5870

I am a postgrad student doing quantum field theory monte carlo simulations and I have recently acquired an HD5870 card not realizing it didn't have full double precision math support yet.  The main function I need from the library is "exp".

In principle I thought one could write one's own code for "exp" out of the basic arithmetic operators (that have double support).  The thing is I did and it gave the required result (running on CPU) but mine was orders of magnitude slower than the math library function.

Is there anyway one can write a full speed exponential function prior to full double precision support, possibly using CAL if it wouldn't require a massive undertaking.

4 Replies First of all there is no native double exp on rv8xx. So it must be implemented using basic operations.

In theory rv8xx has fused multiply add - but ISA documentation doesn't say if it's IEEE compliant.

This is required for first version ( faster ) of exp to work correctly.With true fma error is <1 ulp, without it >50 ulp.

Also I don't now if double fma and ldexp instructions are available now in OpenCL ( ldexp is native on radeons ).

This is probably the fastest exp implementation possible on 5xxx family (~10 mads). Also 1 mad can be removed for <2 ulp precision.

PS. This code is written in C++ to IL converter so thats why there is double1 . Also all constants like "(long double)1/(long double)std::log((long double)2)" should be precomputed and values inserted into OpenCL code.

PS2. If someone from ATI is reading this post then please correct ldexp description in IL docs - because someone who wrote it clearly didn't know how it works.

`double1 exp( double1 x ) { double1 cinv_log2((long double)1/(long double)std::log((long double)2)), cinv_k(1./32.); double1 s,r,m; m = round( cinv_log2*x ); r = cinv_k*fma(-log((long double)2),m,x); // this part must be done with extended precision // taylor seriers expansion s = double1((long double)1/(long double)5040); s = mad(s,r,double1((long double)1/(long double)720)); s = mad(s,r,double1((long double)1/(long double)120)); s = mad(s,r,double1((long double)1/(long double)24)); s = mad(s,r,double1((long double)1/(long double)6)); s = mad(s,r,double1((long double)1/(long double)2)); s = mad(s,r,double1((long double)1/(long double)1)); s = s*r; // s = (1+s)^32 s = mad(s,s,s+s); // this mad can be removed for <2 ulp -> cinv_k=1/16 s = mad(s,s,s+s); s = mad(s,s,s+s); s = mad(s,s,s+s); s = mad(s,s,s+s); s = s + double1(1); return ldexp(s,convert_int1(m)); }` Now the worse part ( and longer ). Second version of exp when there is no true fused mad. This one works also on 4xxx family. The error is ~1ulp ( less <1.5 ulp in my tests ).

For most sensitive part "x - log((long double)2)*m" double-double technique is used to extend precision. Also I would like to point out that this is based on code from qd library ( http://crd.lbl.gov/~dhbailey/mpdist/ ). The only algorithmic modification made by me is much faster ( and probably more "accurate" ) split function.

Again the code is written in C++ to IL converter/compiler, but it is very similiar to OpenCL ( ok templates aren't ).

PS. If the ldexp isn't available atm in OpenCL it can be raplaced by 5 operations ( &,<<,+,&,| ).

`double1 exp( double1 x ) { double1 cinv_log2((long double)1/(long double)std::log((long double)2)), cmln20(-6.931471805599452862e-01), cmln21(-2.319046813846299558e-17), cinv_k(1./32.); double1 s,r,m,p0,p1,r0,r1; m = round( cinv_log2*x ); //r = cinv_k*(x - log(2)*m); // cmln20,cmln21 <- this is -log(2) in double-double representation dd_prod_dbl( p0, p1, cmln20, cmln21, m ); // double-double * double dd_sum_dbl( r0, r1, p0, p1, x ); // double-double + double r = cinv_k*r0; // taylor seriers expansion s = double1((long double)1/(long double)5040); s = mad(s,r,double1((long double)1/(long double)720)); s = mad(s,r,double1((long double)1/(long double)120)); s = mad(s,r,double1((long double)1/(long double)24)); s = mad(s,r,double1((long double)1/(long double)6)); s = mad(s,r,double1((long double)1/(long double)2)); s = mad(s,r,double1((long double)1/(long double)1)); s = s*r; // s = (1+s)^32 s = mad(s,s,s+s); s = mad(s,s,s+s); s = mad(s,s,s+s); s = mad(s,s,s+s); s = mad(s,s,s+s); s = s + double1(1); return ldexp(s,convert_int1(m)); } // // double-double with double // operations // namespace dd { void split( const double1& a, double1& hi, double1& lo ) { // in OpenCL it should be // hi = as_double(as_uint2(a) & uint2(0xFC000000,0xFFFFFFFF)); hi = a & uint2(0xFC000000,0xFFFFFFFF); lo = a - hi; } void split( const double2& a, double2& hi, double2& lo ) { hi = a & uint4(0xFC000000,0xFFFFFFFF,0xFC000000,0xFFFFFFFF); lo = a - hi; } template<typename D> variable<D> two_prod( const variable<D>& a, const variable<D>& b, variable<D>& err) { variable<D> a_hi, a_lo, b_hi, b_lo,p; p = a*b; split(a, a_hi, a_lo); split(b, b_hi, b_lo); //err = ((a_hi * b_hi - p) + a_hi * b_lo + a_lo * b_hi) + a_lo * b_lo; err = mad( a_lo,b_lo, mad(a_hi,b_hi,-p) + mad(a_hi,b_lo,a_lo*b_hi) ); return p; } template<typename D> variable<D> two_sum( const variable<D>& a, const variable<D>& b, variable<D>& err) { variable<D> s,bb; s = a + b; bb = s - a; err = (a - (s - bb)) + (b - bb); return s; } template<typename D> variable<D> quick_two_sum( const variable<D>& a, const variable<D>& b, variable<D>& err) { variable<D> s; s = a + b; err = b - (s - a); return s; } } // dd template<typename D> void dd_prod_dbl( variable<D>& r0, variable<D>& r1, const variable<D>& a0, const variable<D>& a1, const variable<D>& b ) { variable<D> p1,p2; p1 = dd::two_prod( a0, b, p2 ); p2 = p2 + a1*b; p1 = dd::quick_two_sum( p1, p2, p2 ); r0 = p1; r1 = p2; } template<typename D> void dd_sum_dbl( variable<D>& r0, variable<D>& r1, const variable<D>& a0, const variable<D>& a1, const variable<D>& b ) { variable<D> s1,s2; s1 = dd::two_sum( a0, b, s2 ); s2 = s2 + a1; s1 = dd::quick_two_sum( s1, s2, s2 ); r0 = s1; r1 = s2; }`  Staff
hazeman,
Can you point me to what you believe is incorrect? I'll poke at the correct person to get it fixed. Here is the ldexp description from IL docs.

> Puts an exp and a mantissa into a double.

This doesn't make sense ( it's probably sentence taken out of context from someone's explanation of ldexp algorithm )

> The computation is result = source1 * 2 source

should be dst.xy = (2^src1.x)*src0.xy

> Abs and negate modifiers can be used. Output.yx or output.wz are set to the doubeither input produces a NaN.

> Overflow produces inf. Underflow produces 0.

> Src0 is a double (-1.0d < src0.xy < 1.0d).

src0 doesn't have to be in this range !

> Src1.x is an integer from -1024 to +1024.

> Since the second source src1 is an integer exponent, only the neg modifier is ...

I think this is also incorrect. ( if I remember correctly neg doesn't work on ints )

To tell you the truth after reading the text I really wasn't sure if this is the instruction I need. Name was correct but description wasn't.  I had to make some tests to verify what I'm dealing with .

When we are on the subject of IL docs. Please force someone to read the whole text - it has so many spelling errors, missing letters, small logical errors. 