gouse
Journeyman III

clAmdFft: why 3K case is much slower than 4K?

Hi,

I'm experimenting with clAmdFft and see that the FFT of a 3072x3072 image is much slower than that of a 4096x4096 image (the 4096 case was run with 'clAmdFft.Client.exe -g -o -x 4096 -y 4096 -p 20'):

  • 230ms for 3072x3072 image
  • 33ms for 4096x4096

What causes roughly 7 times better performance for about 1.8 times more data? I guess it is related to the size being 2^n, but in which way?

Thanks.

1 Reply

Transform lengths of 4096 (and other pure powers of 2) have been optimized better than lengths with mixed factors (in this case, 3072 = 2^10 x 3 has a factor of 3 in it). Lengths with mixed factors need more optimization work in the library.
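
To make the "mixed factors" point concrete, here is a minimal Python sketch that factors the two transform lengths and recomputes the ratios from the timings quoted in the question. It is only an illustration: the helper names (prime_factors, is_power_of_two) are made up for this example and are not part of clAmdFft, and the sketch says nothing about how the library actually chooses its kernels.

```python
def prime_factors(n):
    """Return the prime factorization of n as a {prime: exponent} dict."""
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        # Whatever is left after trial division is itself prime.
        factors[n] = factors.get(n, 0) + 1
    return factors


def is_power_of_two(n):
    """True if n is a pure power of 2 (exactly one bit set)."""
    return n > 0 and (n & (n - 1)) == 0


for length in (3072, 4096):
    print(length, prime_factors(length), is_power_of_two(length))

# Ratios taken from the numbers reported in the question.
data_ratio = (4096 * 4096) / (3072 * 3072)  # 16/9, about 1.78x more samples
time_ratio = 230 / 33                       # about 7x slower for the 3072 case
print(round(data_ratio, 2), round(time_ratio, 2))
```

The factorization shows 3072 = 2^10 x 3 while 4096 = 2^12, so only the 4096 case is a pure power of 2 and falls on the better-optimized path described above.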