
General Discussions

tinux
Adept I

[rocfft] incorrect results for certain (large) dimensions in 3D FFTs

Hello everyone

I am developing a simulation tool for linear and nonlinear propagation of ultrashort laser pulses in 3D. It uses a split-step method to solve the generalized nonlinear Schrödinger equation. This is done by applying effects to a 3D array spanning time t and the two spatial axes x and y, where t is the slow axis, y the middle axis, and x the fast axis. Most effects are applied in one of the following spaces:

- time/space,

- frequency/space, and

- frequency/reciprocal space,

which means I need to convert, i.e. Fourier-transform, between them, in order to apply certain effects.

 

For performance reasons, the entire code runs on a GPU, including the different FFTs, for which I use rocFFT.

 

I have now realized that the 3D FFT does not always give me correct results when the lengths of the 3D array become larger. For debugging, I wrote some simple tests to see where this happens. You can find the code in [this github repo](https://github.com/t1nux/roc_fft_bug).

What I do there is

- define 1D, 2D, and 3D FFTs for handling the transforms between the spaces mentioned above,

- initialize the data to a 3D-step-function and store for plotting

- copy to GPU

- do a 3D FFT

- do a 2D FFT in x and y, followed by a 1D FFT in t (this should give the same result as the 3D FFT)

- copy the data back and store for plotting
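
The comparison in the steps above can be sketched on the CPU with NumPy alone. This is only a minimal reference sketch, not the code from the repo: the helper names (`step_cube`, `fft3d_direct`, `fft3d_split`, `max_rel_error`), the shape of the step function, and the small array size are assumptions chosen for illustration.

```python
import numpy as np


def step_cube(shape, dtype=np.complex128):
    """Hypothetical test input: ones inside a centered box, zeros elsewhere."""
    nt, ny, nx = shape
    a = np.zeros(shape, dtype=dtype)
    a[nt // 4: 3 * nt // 4, ny // 4: 3 * ny // 4, nx // 4: 3 * nx // 4] = 1
    return a


def fft3d_direct(a):
    """One 3D FFT over all axes (t, y, x)."""
    return np.fft.fftn(a, axes=(0, 1, 2))


def fft3d_split(a):
    """2D FFT over (y, x), followed by a 1D FFT over t."""
    b = np.fft.fft2(a, axes=(1, 2))
    return np.fft.fft(b, axis=0)


def max_rel_error(result, reference):
    """Largest absolute deviation, normalized by the reference peak."""
    return float(np.max(np.abs(result - reference)) / np.max(np.abs(reference)))


# Small illustrative size (t, y, x); the sizes in this post are much larger.
shape = (16, 32, 64)
data = step_cube(shape)
reference = fft3d_direct(data)
split = fft3d_split(data)
err = max_rel_error(split, reference)
print(f"max relative error, 2D+1D vs 3D: {err:.3e}")
```

On the CPU the two paths should agree to floating-point rounding. An array copied back from the GPU could be passed to `max_rel_error` in the same way to put a number on how far it deviates from the NumPy reference.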

 

Now, when I plot the 3D FFT, the 2D+1D FFT, and the 3D FFT computed with NumPy, the results typically look identical, unless the dimensions get too large. You can find a Python notebook in the repo mentioned above that illustrates all this.

 

For instance, for 3D array sizes (t, y, x) of

- [2^8, 2^8, 2^11]: everything looks good,

- [2^8, 2^8, 2^12]: the 3D FFT is incorrect, but the 2D+1D FFT seems correct,

- [2^8, 2^12, 2^8]: both the 3D and 2D+1D FFT are incorrect.

 

It seems that it fails if the combined length of the different dimensions exceeds a certain value, but I cannot pinpoint where the threshold is. What I found confusing is that

- [2^4, 2^12, 2^8]: works

- [2^8, 2^12, 2^8]: fails only in the 3D FFT case

- [2^8, 2^*, 2^12]: does not work for 3D and 2D+1D FFT cases
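
To put the "combined length" in numbers, here is a minimal sketch that computes the total element count and buffer size for the fully specified shapes listed in this post. The precision is an assumption, since the post does not state it: the sketch shows both 8 bytes per element (single-precision complex) and 16 bytes per element (double-precision complex). The helper name `footprint` is hypothetical.

```python
# Shapes (t, y, x) taken from the lists in this post.
shapes = {
    "[2^8, 2^8, 2^11]": (2**8, 2**8, 2**11),
    "[2^8, 2^8, 2^12]": (2**8, 2**8, 2**12),
    "[2^8, 2^12, 2^8]": (2**8, 2**12, 2**8),
    "[2^4, 2^12, 2^8]": (2**4, 2**12, 2**8),
}


def footprint(shape, bytes_per_element):
    """Return (number of elements, buffer size in bytes) for a 3D array."""
    nt, ny, nx = shape
    n = nt * ny * nx
    return n, n * bytes_per_element


GIB = 2**30
for label, shape in shapes.items():
    n, single = footprint(shape, 8)    # assumption: complex, single precision
    _, double = footprint(shape, 16)   # assumption: complex, double precision
    print(f"{label}: {n} elements, "
          f"{single / GIB:.3f} GiB (single), {double / GIB:.3f} GiB (double)")
```

This gives 2^27 elements for [2^8, 2^8, 2^11], 2^28 for both [2^8, 2^8, 2^12] and [2^8, 2^12, 2^8], and 2^24 for [2^4, 2^12, 2^8]. At 8 bytes per element, 2^28 elements occupy 2 GiB. Whether that size is related to the failures is not something this arithmetic can establish.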

 

All this was tested on 2 systems:

- system 1

  - Arch Linux

  - Kernel 6.7.5-arch1-1

  - ROCM 6.0.0

  - GPU: RX 7900 XTX

- system 2

  - Arch Linux

  - Kernel 6.7.4-arch1-1

  - ROCM 6.0.0

  - GPU: Radeon VII Pro

 

I would appreciate it if someone could test this.

 

Thanks!

 

EDIT: Typo

2 Replies
tinux
Adept I

I can also confirm that this is happening on an AMD RX 6900 XT GPU.

tinux
Adept I

Apologies, I just realized I posted this in General Discussions and not in ROCm. I could not delete this post. If a moderator can, please do so.
