3 Replies Latest reply on Apr 22, 2010 5:23 AM by noxnet

    How to calculate theoretically possible GFLOPS

    noxnet

      I'm interested in how the theoretically possible GFLOPS of ATI cards are calculated.

      Let's consider an HD5850 with 1440 Streaming Processing Units and a clock speed of 725 MHz. I guess this data should be sufficient to calculate GFLOPS.

      According to AMD the HD5850 has:

      2.09 TeraFLOPS in single precision and 418 GigaFLOPS in double precision.

      For a multi-core CPU it is rather simple to calculate GFLOPS: multiply clock rate * cores * 4 (SSE) to get double precision, then multiply that by 2 to get single precision.

      So a 3.0 GHz dual-core CPU is capable of

      3 * 2 * 4 = 24 GFLOPS (dp) and 48 GFLOPS (sp)
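
      The CPU rule of thumb above can be written out as a small Python sketch. The name `cpu_peak_gflops` and its parameters are made up for illustration, and the figure of 4 double-precision operations per cycle per core is the poster's SSE assumption, not a measured value.

```python
def cpu_peak_gflops(clock_ghz, cores, dp_flops_per_cycle=4):
    """Return (dp, sp) theoretical peak GFLOPS for a multi-core CPU.

    Assumes every core completes dp_flops_per_cycle double-precision
    operations per clock (the SSE figure used in this thread) and
    twice as many in single precision.
    """
    dp = clock_ghz * cores * dp_flops_per_cycle
    sp = 2 * dp
    return dp, sp


# 3.0 GHz dual-core example from the post
dp, sp = cpu_peak_gflops(3.0, 2)
print(f"{dp:.0f} GFLOPS (dp), {sp:.0f} GFLOPS (sp)")  # 24 GFLOPS (dp), 48 GFLOPS (sp)
```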

       

      I read that 5 SPUs make up one 5D shader unit, resulting in 288 5D shader units on an HD5850.

      Can anyone explain this to me?

      What does 5D Shader-Unit mean?

      How do SPUs compare to CUDA cores?

        • How to calculate theoretically possible GFLOPS
          bpurnomo

          For HD5850, there are 1440 Streaming Processing Units (SPU) with a clock speed of 725 MHz.

          Each of the 1440 SPUs can do 1 MAD (multiply-add) operation per cycle, which counts as 2 floating-point operations per cycle.

          So you have 1440 * 2 ops = 2880 ops per cycle.

          Then, you multiply by the clock speed to get the flops (floating point operations per second).

          2880 * 725 MHz = 2088000 Mflops = 2.088 Teraflops.
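
          The same steps can be sketched in Python. The function name `gpu_peak_gflops` is invented for this example, and the default of 2 operations per SPU per cycle is the one-MAD-per-cycle figure given above.

```python
def gpu_peak_gflops(spus, clock_mhz, flops_per_spu_per_cycle=2):
    """Theoretical single-precision peak in GFLOPS.

    spus * flops_per_spu_per_cycle gives operations per cycle;
    multiplying by the clock in MHz gives MFLOPS, and dividing
    by 1000 converts to GFLOPS.
    """
    return spus * flops_per_spu_per_cycle * clock_mhz / 1000


# HD5850: 1440 SPUs at 725 MHz
print(gpu_peak_gflops(1440, 725))  # 2088.0
```

          2088 GFLOPS is the 2.088 Teraflops computed above, which AMD rounds to 2.09.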

           

          • How to calculate theoretically possible GFLOPS
            n0thing

             

            Originally posted by: noxnet

            I read that 5 SPUs make up one 5D shader unit, resulting in 288 5D shader units on an HD5850.

             

            Can anyone explain this to me? What does 5D Shader-Unit mean?

             

            How do SPUs compare to CUDA cores?



            A thread processor can execute up to 5 independent instructions simultaneously in 1 clock, IF the shader compiler is able to find these instructions. Otherwise the processor is not used to its full potential.
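
            To get a rough feel for what this means for throughput, here is a simplified Python sketch. It assumes throughput scales linearly with the average number of the 5 slots the compiler manages to fill. That linear model and the name `effective_gflops` are illustrative assumptions, not something stated by AMD or in this thread.

```python
def effective_gflops(peak_gflops, slots_filled, slots_total=5):
    """Scale the theoretical peak by the fraction of slots in use.

    Simplified model: every filled slot contributes an equal share
    of the peak, and empty slots contribute nothing.
    """
    if not 0 <= slots_filled <= slots_total:
        raise ValueError("slots_filled must be between 0 and slots_total")
    return peak_gflops * slots_filled / slots_total


# HD5850 single-precision peak of 2088 GFLOPS, for 1 to 5 filled slots
for slots in range(1, 6):
    print(slots, round(effective_gflops(2088.0, slots), 1))
```

            Under this model, code in which the compiler finds only one independent instruction per clock would reach about a fifth of the advertised peak.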

              • How to calculate theoretically possible GFLOPS
                noxnet

                Thanks for your quick replies!

                So an HD5850 has 288 Thread Processors with 1440 SPUs in total.

                How are these thread processors further split into compute units? According to OpenCL queries, an HD5450 with 80 SPUs has 2 compute units.

                I guess that for double precision, all 5 SPUs are needed to do 1 dp calculation.
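
                One way to check this guess against the numbers in the thread is the short Python sketch below. It assumes each thread processor (5 SPUs) completes one double-precision MAD, counted as 2 operations, per cycle. That assumption is not confirmed anywhere in this thread, and the function names are made up for the example.

```python
SPUS_PER_THREAD_PROCESSOR = 5  # "5 SPUs per 5D shader unit", as stated in the thread


def thread_processors(spus):
    """Number of 5-wide thread processors for a given SPU count."""
    return spus // SPUS_PER_THREAD_PROCESSOR


def dp_peak_gflops(spus, clock_mhz):
    """Double-precision peak in GFLOPS.

    Assumption: one dp MAD (2 operations) per thread processor per cycle.
    """
    return thread_processors(spus) * 2 * clock_mhz / 1000


print(thread_processors(1440))     # 288
print(dp_peak_gflops(1440, 725))   # 417.6
print(thread_processors(80) // 2)  # 8 thread processors per compute unit (HD5450)
```

                417.6 rounds to the 418 GigaFLOPS that AMD quotes, which is consistent with the guess but does not prove it. The last line only divides the HD5450 figures reported by OpenCL; the same ratio may not hold for the HD5850.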