I did some experiments with my current GPUs and I'm becoming more and more interested in the OpenCL platform. I work in scientific computing (matrices, geometric algorithms, etc.).

However, I have yet to get my hands on a high-powered ATI card (like the 5870/5970). I was intrigued by the upcoming nVidia Fermi/Tesla, but the relative pricing and the stats of the nVidia cards seem mismatched compared to the ATI option.

From stats gathered on the net, I get these GFLOPS figures:

- GTX280: single=622 double=78
- 5970: single=4600 double=928
- (Current) Tesla C1060: single=933 double=78
- (New) GTX480: single=1344 double=168 [EDIT]
- (Future) Tesla C2070: single=N/A double=630

Approximate pricing: GTX280=450, 5970=650, Tesla=1700
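To make the comparison concrete, here is a rough sketch of double-precision GFLOPS per unit of price, using only the figures quoted above. It assumes the "tesla" price refers to the Tesla C1060 (the pricing line doesn't say which model), and the names `dp_gflops`, `price` and `dp_per_price` are just illustrative.

```python
# Theoretical peak double-precision GFLOPS, as quoted in the list above.
dp_gflops = {"GTX280": 78, "5970": 928, "Tesla C1060": 78}

# Approximate prices from the post.
# Assumption: the "tesla" price is for the C1060; the post does not specify.
price = {"GTX280": 450, "5970": 650, "Tesla C1060": 1700}


def dp_per_price(card):
    """Peak DP GFLOPS divided by the approximate price of one card."""
    return dp_gflops[card] / price[card]


for card in dp_gflops:
    print(f"{card}: {dp_per_price(card):.3f} DP GFLOPS per unit of price")
```

These are theoretical peaks divided by approximate prices, so treat the result as a ballpark ranking at best, not a benchmark.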

In the past, I was kind of partial toward nVidia... but these numbers are totally ludicrous!

I know this is only theoretical throughput and real-life OpenCL code will not reach those numbers, but even then... I would simply say wow to the 5970 (and it's way cheaper than any Tesla).

Am I missing anything obvious here? Did I get a stat wrong? Is double-precision performance on ATI "that good"?

Is there a catch? Something like: on ATI, memory accesses would need to be coalesced perfectly (whereas on GT200 the coalescing restrictions were relaxed compared to G80)?

Well, if you need DP support, you should know that with nVidia you can get "full" DP speed only with a Tesla card.

According to this article, the GeForce GTX 480 delivers 1344.96 GFLOPS in single precision but only 168.12 GFLOPS in DP. That is only 1/8 of the single-precision rate (for Tesla it will be 1/2).

ATI Radeon cards deliver 1/5 of their single-precision performance in DP.
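As a quick sanity check of those fractions, here is a small sketch that computes the single-to-double ratio from the rounded figures quoted in the question. The names `peak` and `sp_over_dp` are made up for this example, and the inputs are theoretical peaks, not measurements.

```python
# (single, double) theoretical peak GFLOPS, as quoted in the question.
peak = {
    "GTX280": (622, 78),
    "5970": (4600, 928),
    "Tesla C1060": (933, 78),
    "GTX480": (1344, 168),
}


def sp_over_dp(card):
    """How many times faster single precision is than double, on paper."""
    single, double = peak[card]
    return single / double


for card in peak:
    print(f"{card}: DP is about 1/{sp_over_dp(card):.1f} of SP")
```

With these inputs the GTX480 comes out at exactly 1/8 and the 5970 at roughly 1/5, which matches the ratios above.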