I'm not from AMD, but here is my understanding of the issues you mentioned:
1- Yes, there is currently no implementation of double precision transcendentals, either in hardware or in software. Hopefully AMD will release a software implementation soon...
2- The topic you mentioned was about floating-point constants in the code. That is, if you write
double a = 1.23456789012345;
it will be truncated to something like
double a = 1.2345679f;
Passing double values through function parameters, streams, or constant memory should work correctly.
3- From my own tests on a 3850, it appears that the double precision MAD is actually fused (computes a*b+c exactly, then rounds at the end), though I have not seen it mentioned anywhere. So you might end up with a result more precise than on the CPU if you use it.
DP division is implemented in software using the fused MAD (FMA), but does not seem to provide correct rounding in every case, so it may yield different results than the CPU.
I did ask about rounding modes a little while ago in this post.
The suggestion there was that mads weren't fused but that the multiplication wasn't truncated either (i.e. a mad is just the same as a rounded mul followed by a rounded add). I tested this just recently for single precision and this was indeed what happened, but I didn't try double precision. If it is fused then that'd be great!
Thank you, everyone, for your replies.
I also hope someone from AMD will clear up the issues you have raised, especially how the actual hardware handles both transcendental and basic DP math.
BTW, is the 4870 expected to have any changes in its DP FP math hardware?