The FFT length in the clFFT library is limited to 2^24 points in single precision (SP) and 2^22 points in double precision (DP). What is the reason for this limit? I'd like to tackle it in the open-source version, but if there is a deeper limitation behind it (e.g. in OpenCL itself), I'd like to know that first.
size_t SP_MAX_LEN = 1 << 24;
size_t DP_MAX_LEN = 1 << 22;
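To make the numbers concrete, here is a minimal sketch of what a check against these two limits amounts to, and how large the corresponding buffers are. This is not the library's actual code: the `precision_t` enum and both helper functions are hypothetical names of my own, and the byte counts assume interleaved complex data (two floats or two doubles per point).

```c
#include <stddef.h>

/* The two limits quoted above. */
static const size_t SP_MAX_LEN = (size_t)1 << 24;
static const size_t DP_MAX_LEN = (size_t)1 << 22;

/* Hypothetical precision selector, for illustration only. */
typedef enum { PREC_SINGLE, PREC_DOUBLE } precision_t;

/* Returns 1 if a transform of `len` points fits under the limit, else 0. */
static int length_within_limit(size_t len, precision_t p)
{
    size_t max_len = (p == PREC_SINGLE) ? SP_MAX_LEN : DP_MAX_LEN;
    return len <= max_len;
}

/* Bytes for `len` complex points, assuming an interleaved layout. */
static size_t complex_buffer_bytes(size_t len, precision_t p)
{
    size_t elem = (p == PREC_SINGLE) ? 2 * sizeof(float) : 2 * sizeof(double);
    return len * elem;
}
```

Under those assumptions, and with the usual 4-byte float and 8-byte double, a maximum-length SP transform occupies 128 MiB and a maximum-length DP transform 64 MiB, so the limits as such sit well below what current GPUs can hold in memory.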
I have forwarded your question to the AMD development team for math libraries. Awaiting their response.
Just out of curiosity, these limits seem large enough for all practical purposes. In what situations would they be exceeded, so that bigger limits are required?
I have confirmed it with our math library team. This limit is artificially imposed because of the size of some internal registers.
We would appreciate it if you could point out some applications that require an FFT larger than the current limit, and where this limit hurts.