I'm after more precision than double.
There are a couple of double-double libraries for CUDA.
At least one of them is open source: https://code.google.com/p/gpuprec/
Are there any for OpenCL?
I only need basic arithmetic and inverse square root.
What I really need is the vector form (i.e. doubledouble4).
I believe older AMD graphics cards used 4 float ALUs to do double precision.
Is this exposed in some low-level manner that would lend itself to higher levels of precision,
i.e. quad precision from 4 double-precision ALUs?