Nvidia CUDA FFT vs tuned ACML FFT

GPU number crunching versus CPU

I don't have an Nvidia 8X card, so I can't benchmark this myself, but Nvidia has a CUDA library that uses the massively parallel pipeline on its cards, and that library includes an FFT.
I was wondering if anyone has the time and is willing to put in the effort to run a benchmark against a tuned ACML FFT?