ROCm Discussions

tinux
Adept I

[rocfft] incorrect results for certain (large) dimensions in 3D FFTs

Hello everyone

I am developing a simulation tool for linear and nonlinear propagation of ultrashort laser pulses in 3D. It uses a split-step method to solve the generalized nonlinear Schrödinger equation. This is done by applying effects to a 3D array spanning time t and the two spatial axes x and y, where t is the slowest axis, y the middle one, and x the fastest. Most effects are applied in one of the following spaces

  • time/space,
  • frequency/space, and
  • frequency/reciprocal space,

which means I need to convert, i.e. Fourier-transform, between them, in order to apply certain effects.

For performance reasons, the entire code runs on a GPU, including the different FFTs, for which I use rocFFT.

I have now realized that the 3D FFT does not always give correct results when the array dimensions become large. For debugging, I wrote some simple tests to see where this happens. You can find the code in this GitHub repo.

What I do there is

  • define 1D, 2D, and 3D FFTs for handling the transforms between the spaces mentioned above,
  • initialize the data to a 3D step function and store it for plotting,
  • copy it to the GPU,
  • do a 3D FFT,
  • starting again from the initial 3D step function, do a 2D FFT in x and y, followed by a 1D FFT in t (this should give the same result as the 3D FFT above),
  • copy the data back and store it for plotting.
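For reference, here is a minimal NumPy sketch of the same check on the CPU. It is not the code from the repo: the grid size and the shape of the step function (a centred box) are assumptions chosen only to keep the example small. It shows that a single 3D FFT and a 2D FFT over (y, x) followed by a 1D FFT along t should agree to rounding error.

```python
import numpy as np

# Small (t, y, x) grid; in C (row-major) order x is the fastest axis.
nt, ny, nx = 8, 16, 32

# 3D step function: ones inside a centred box, zeros elsewhere
# (the box shape is an assumption made for this sketch).
field = np.zeros((nt, ny, nx), dtype=np.complex128)
field[nt // 4 : 3 * nt // 4, ny // 4 : 3 * ny // 4, nx // 4 : 3 * nx // 4] = 1.0

# Path A: one 3D FFT over all three axes.
fft3d = np.fft.fftn(field, axes=(0, 1, 2))

# Path B: 2D FFT over (y, x), then a 1D FFT along t.
fft2d1d = np.fft.fft(np.fft.fft2(field, axes=(1, 2)), axis=0)

# The two paths should differ only by floating-point rounding.
max_diff = np.max(np.abs(fft3d - fft2d1d))
print("max |3D - (2D+1D)| =", max_diff)
```

Because the DFT is separable, any difference well above rounding error between the two paths points to a problem in one of the transforms rather than in the input data.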

Now, when I plot the 3D FFT, the 2D+1D FFT, and the 3D FFT computed with NumPy, the results typically look identical, unless the dimensions get too large. You can find a Python notebook in the repo mentioned above that illustrates all this. The notebook lists a number of 3D array sizes that do not work correctly.
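Besides comparing plots by eye, the mismatch can be quantified with a single number. The helper below is a hypothetical sketch (the name `relative_error` and the synthetic "bad" result are assumptions, not part of the repo) that computes the relative L2 error of a result against the NumPy reference.

```python
import numpy as np

def relative_error(result, reference):
    """Relative L2 error of `result` with respect to `reference`."""
    result = np.asarray(result)
    reference = np.asarray(reference)
    return np.linalg.norm(result - reference) / np.linalg.norm(reference)

rng = np.random.default_rng(0)
reference = np.fft.fftn(rng.standard_normal((4, 8, 16)))

# A result that differs only by single-precision rounding.
good = reference.astype(np.complex64)

# A synthetic partly wrong result: half of the slow axis zeroed out.
bad = reference.copy()
bad[2:, :, :] = 0.0

err_good = relative_error(good, reference)
err_bad = relative_error(bad, reference)
print(err_good, err_bad)
```

A correct GPU result should give an error near the rounding level of the precision used, while a broken transform gives an error of order one.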

For instance, for 3D array sizes (t, y, x) of

  • [2^8, 2^8, 2^11]: everything looks good,
  • [2^8, 2^8, 2^12]: the 3D FFT is incorrect, but the 2D+1D FFT seems correct,
  • [2^8, 2^12, 2^8]: both the 3D and the 2D+1D FFTs are incorrect.

It seems that it fails if the combined lengths in the different dimensions exceed a certain value, but I cannot pinpoint where the threshold is. What I find confusing is that

  • [2^4, 2^12, 2^8]: works
  • [2^8, 2^12, 2^8]: fails only in the 3D FFT case
  • [2^8, 2^8, 2^12]: fails in both the 3D and the 2D+1D FFT cases
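To help narrow down the threshold, the total element count and buffer size of each shape can be tabulated. The sketch below uses only the sizes quoted above. The precision used in the tests is not stated here, so both single-precision (8 bytes per complex value) and double-precision (16 bytes) are shown. This is plain arithmetic and does not by itself establish the cause.

```python
# Shapes (t, y, x) quoted in the post.
shapes = {
    "[2^8, 2^8, 2^11]": (2**8, 2**8, 2**11),
    "[2^8, 2^8, 2^12]": (2**8, 2**8, 2**12),
    "[2^8, 2^12, 2^8]": (2**8, 2**12, 2**8),
    "[2^4, 2^12, 2^8]": (2**4, 2**12, 2**8),
}

# Bytes per complex element for the two common precisions.
ITEMSIZE = {"complex64": 8, "complex128": 16}
INT32_MAX = 2**31 - 1  # largest value of a signed 32-bit integer

def element_count(shape):
    """Total number of complex elements in a (t, y, x) array."""
    nt, ny, nx = shape
    return nt * ny * nx

def buffer_bytes(shape, dtype_name):
    """Size in bytes of a contiguous buffer of the given precision."""
    return element_count(shape) * ITEMSIZE[dtype_name]

for label, shape in shapes.items():
    n = element_count(shape)
    sizes = ", ".join(
        f"{name}: {buffer_bytes(shape, name) / 2**30:g} GiB"
        f"{' (> INT32_MAX bytes)' if buffer_bytes(shape, name) > INT32_MAX else ''}"
        for name in ITEMSIZE
    )
    print(f"{label}: {n} elements, {sizes}")
```

Both shapes reported as failing have 2^28 elements, while the shapes reported as working have 2^27 and 2^24 elements.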

All this was tested on 3 systems with 3 different GPUs:

  • system 1
    • Arch Linux
    • Kernel 6.7.5-arch1-1
    • ROCM 6.0.0
    • GPU: RX 7900 XTX
  • system 2
    • Arch Linux
    • Kernel 6.7.4-arch1-1
    • ROCM 6.0.0
    • GPU: Radeon VII Pro
  • system 3
    • Arch Linux
    • Kernel 6.7.6-arch1-1
    • ROCM 6.0.0
    • GPU: RX 6900 XT

I would appreciate it if someone could test this.

Thanks!
