3 Replies Latest reply on Nov 12, 2012 11:41 AM by bragadeesh

    problems with clAmdFft

    romein

      I tried clAmdFft on different platforms/devices and get correct results on AMD GPUs but incorrect results on NVIDIA GPUs.  On CPUs, the results are correct when using Intel's OpenCL platform but incorrect when using AMD's OpenCL platform.  Am I doing something wrong, or is this a bug in the clAmdFft library (1.8.291)?  See the attached source code.  The correct answer for this 4-point FFT should be (16,20) (-8,0) (-4,-4) (0,-8).

      Program output for various platform/device combinations:

      platform: AMD Accelerated Parallel Processing, device: Tahiti: (16,20) (-8,0) (-4,-4) (0,-8)

      platform: AMD Accelerated Parallel Processing, device: Intel(R) Core(TM) i7-3820 CPU @ 3.60GHz: (2,4) (6,16) (10,12) (14,8)

      platform: Intel(R) OpenCL, device: Intel(R) Xeon(R) CPU E5-2630 0 @ 2.30GHz: (16,20) (-8,0) (-4,-4) (0,-8)

      platform: NVIDIA CUDA, device: Tesla K10.G2.8GB: (6,8) (-4,-4) (6,8) (-4,-4)

      platform: NVIDIA CUDA, device: Tesla K10.G2.8GB: (6,8) (-4,-4) (6,8) (-4,-4)

      platform: NVIDIA CUDA, device: GeForce GTX 680: (6,8) (-4,-4) (6,8) (-4,-4)
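
The expected answer quoted above can be checked independently of any OpenCL runtime with a naive DFT. The sketch below is only an illustration: the attached source is not reproduced in this thread, so the input samples (1,2) (3,4) (5,6) (7,8) are an assumption, chosen because they are consistent with the stated correct result. The helper names `dft` and `fmt` are hypothetical.

```python
import cmath

def dft(x):
    """Naive O(n^2) forward DFT: X[k] = sum_j x[j] * exp(-2*pi*i*j*k/n)."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]

def fmt(z):
    # Round to integers to hide floating-point noise in the twiddle factors.
    return f"({round(z.real)},{round(z.imag)})"

# Assumed input: not given in the post, inferred from the expected answer.
samples = [1 + 2j, 3 + 4j, 5 + 6j, 7 + 8j]
result = dft(samples)

print(" ".join(fmt(z) for z in result))
# prints: (16,20) (-8,0) (-4,-4) (0,-8)
```

Under that assumption, the output matches the Tahiti and Intel CPU lines, so a comparison like this could serve as a quick host-side reference when testing other platform/device combinations.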

        • Re: problems with clAmdFft
          binying

          "incorrect when using AMD's OpenCL platform":

          Could you give more detail on the software you are using?

            • Re: problems with clAmdFft
              romein

              This is a small program that just performs a simple 4-point FFT.  It illustrates that the clAmdFft library returns wrong results on NVIDIA devices (as well as on CPUs when using the AMD OpenCL runtime).  As I do not have the sources of the library, I cannot see what goes wrong.

            • Re: problems with clAmdFft
              bragadeesh

              Hi romein,

              Thanks for the test case. This is a known issue, and the problem is not in the library code. It is in AMD's OpenCL runtime and driver stack, and it affects only the CPU device. You may have guessed this already, since the library works on the GPU with our OpenCL platform and also on the CPU with Intel's OpenCL platform.

              The problem has been fixed, but unfortunately the fix will only be available in the next Catalyst driver release.