I am a postgrad student doing quantum field theory Monte Carlo simulations, and I recently acquired an HD5870 card without realizing that it doesn't have full double precision math support yet. The main function I need from the library is "exp".

In principle, I thought one could write one's own "exp" out of the basic arithmetic operators (which do have double support). I did, and it gave the required result (running on the CPU), but it was orders of magnitude slower than the math library function.

Is there any way to write a full-speed exponential function before full double precision support arrives, possibly using CAL, if that wouldn't require a massive undertaking?

First of all, there is no native double exp on rv8xx, so it must be implemented using basic operations.

In theory, rv8xx has a fused multiply-add, but the ISA documentation doesn't say whether it is IEEE compliant.

An IEEE-compliant fma is required for the first (faster) version of exp to work correctly. With a true fma the error is <1 ulp; without it, >50 ulp.
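To illustrate what "true fma" means here, this is a small host-side C++ sketch (not GPU code, and not part of the exp implementation). It compares a multiply-add that rounds the product before adding with `std::fma`, which rounds only once. The function names `unfused_mad` and `fused_mad` are made up for this illustration.

```cpp
#include <cmath>

// Multiply, round the product to double, then add (two roundings).
// The volatile store keeps the compiler from contracting this into an fma.
double unfused_mad(double a, double b, double c) {
    volatile double product = a * b;
    return product + c;
}

// Fused multiply-add: a*b + c is computed exactly and rounded once.
double fused_mad(double a, double b, double c) {
    return std::fma(a, b, c);
}
```

With a = 1 + 2^-30, b = 1 - 2^-30 and c = -1, the exact product is 1 - 2^-60, which rounds to 1 in double. The unfused version therefore returns 0, while the fused version returns -2^-60. It is this kind of lost low-order information that a range reduction step in exp relies on the fma to keep.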

Also, I don't know whether the double fma and ldexp instructions are currently available in OpenCL (ldexp is native on Radeons).

This is probably the fastest exp implementation possible on the 5xxx family (~10 mads). One mad can also be removed if <2 ulp precision is acceptable.
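For readers who want to see the general shape of such a routine, here is a minimal host-side C++ sketch built only from fma and ldexp. It is not the code this post refers to. It assumes the usual scheme: write x = k*ln2 + r, evaluate exp(r) with a polynomial, and scale by 2^k. It uses plain Taylor coefficients, so it needs more multiply-adds than the ~10 quoted above, which a minimax polynomial would be needed to reach. The name `exp_sketch` is made up, and the sketch does not handle overflow, underflow, NaN or infinities.

```cpp
#include <cassert>
#include <cmath>

// Sketch of exp(x) from fma and ldexp only. Valid for moderate |x|.
double exp_sketch(double x) {
    assert(std::fabs(x) < 700.0);  // no overflow/underflow handling here

    const double inv_ln2 = 1.44269504088896338700e+00;  // 1/ln(2)
    const double ln2_hi  = 6.93147180369123816490e-01;  // high part of ln(2)
    const double ln2_lo  = 1.90821492927058770002e-10;  // low part of ln(2)

    // Range reduction: r = x - k*ln2 with |r| <= ln(2)/2 (roughly).
    const double k = std::nearbyint(x * inv_ln2);
    double r = std::fma(-k, ln2_hi, x);
    r = std::fma(-k, ln2_lo, r);

    // Taylor coefficients 1/n! for n = 0..13.
    static const double c[14] = {
        1.0, 1.0, 1.0 / 2.0, 1.0 / 6.0, 1.0 / 24.0, 1.0 / 120.0,
        1.0 / 720.0, 1.0 / 5040.0, 1.0 / 40320.0, 1.0 / 362880.0,
        1.0 / 3628800.0, 1.0 / 39916800.0, 1.0 / 479001600.0,
        1.0 / 6227020800.0};

    // Horner evaluation, one fma per coefficient.
    double p = c[13];
    for (int i = 12; i >= 0; --i) {
        p = std::fma(p, r, c[i]);
    }

    // Scale by 2^k.
    return std::ldexp(p, static_cast<int>(k));
}
```

Splitting ln(2) into a high and a low part keeps the reduced argument r accurate, and that step is where the quality of the fma matters most. On the GPU the loop would be unrolled and the constants inserted as literals.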

PS. This code is written for a C++ to IL converter, which is why it uses double1. Also, all constants such as "(long double)1/(long double)std::log((long double)2)" should be precomputed and their values inserted into the OpenCL code.
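One way to do that precomputation is a small host program that evaluates each constant in long double, rounds it to double, and prints enough digits to round-trip. The helper names `inv_ln2_constant` and `literal_for` below are made up for this sketch.

```cpp
#include <cmath>
#include <iomanip>
#include <limits>
#include <sstream>
#include <string>

// 1/ln(2), evaluated in long double and rounded once to double.
double inv_ln2_constant() {
    return static_cast<double>(1.0L / std::log(2.0L));
}

// Format a double with enough digits that parsing the text
// gives back exactly the same value.
std::string literal_for(double v) {
    std::ostringstream os;
    os << std::scientific
       << std::setprecision(std::numeric_limits<double>::max_digits10)
       << v;
    return os.str();
}
```

The string returned by `literal_for(inv_ln2_constant())` can then be pasted into the kernel source as a double literal.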

PS2. If someone from ATI is reading this post, please correct the ldexp description in the IL docs, because whoever wrote it clearly didn't know how it works.