I have filed a ticket for our software engineers.
What numbers are you getting with large complex-to-complex 3D FFTs?
On the Radeon Pro VII: 16.0856 s
On the Radeon VII: 16.4581 s
EDIT: This is with this code.
Apparently FFT is memory bound, which is why the Radeon VII and the Radeon Pro VII perform the same: they both have a memory bandwidth of 1 TB/s. Try VkFFT instead.
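To make the memory-bound argument concrete, here is a rough back-of-the-envelope sketch in Python. It is only an estimate under stated assumptions, not a measurement: the grid size (512 per side), the double-precision complex element size, the "one read plus one write of the whole array per dimension" traffic model, and the two peak-throughput figures are all illustrative values that I picked, not numbers from the benchmark above. The `5 * N * log2(N)` flop count is the usual convention for a complex FFT.

```python
import math

def fft3d_time_bounds(n, bytes_per_element, bandwidth_bytes_per_s, peak_flops):
    """Return (memory_bound_s, compute_bound_s) for one n x n x n complex FFT."""
    points = n ** 3
    # Conventional flop estimate for a complex FFT over `points` elements.
    flops = 5 * points * math.log2(points)
    # Assumed traffic model: one pass per dimension, each pass reads and
    # writes the whole array once (real implementations may differ).
    bytes_moved = 3 * 2 * points * bytes_per_element
    return bytes_moved / bandwidth_bytes_per_s, flops / peak_flops

BANDWIDTH = 1e12  # 1 TB/s, the same on both cards

# Hypothetical peak throughputs for a "slower" and a "faster" card.
for label, peak in (("lower peak", 3e12), ("higher peak", 6e12)):
    mem_s, compute_s = fft3d_time_bounds(512, 16, BANDWIDTH, peak)
    print(f"{label}: memory-bound {mem_s:.4f} s, compute-bound {compute_s:.4f} s")
```

Under these assumptions the memory term is larger than the compute term for both peak values, and it is identical for both because the bandwidth is identical. That is consistent with the two cards posting nearly the same time even if one has more raw compute throughput.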
@fsadough Thank you so much!
I understand. I will run some tests on more typical computations and compare the performance of the Pro and non-Pro GPUs. It will take a bit of time, though. I will give VkFFT a try. This tip is very helpful!