HD7970ghz Peak TFLOPS calculation

Question asked by rcgoodfellow on Apr 10, 2013
Latest reply on Apr 12, 2013 by kittyshen2013

In the document 'AMD Accelerated Parallel Processing OpenCL Programming Guide' provided here


Table 5.3 gives the instructions per cycle (IPC) ratings for various instructions from which we may calculate the peak FLOPS for both single and double precision calculations.  Using the table I calculate the double precision peak FLOPS as


     dp_add_flops = total_alu_count * clock_rate * dp_add_ipc
                  = 2048 * 1.05 GHz * 0.5
                  = 1.0752 TFLOPS


which is roughly in line with the advertised performance. However, for single precision I have


     sp_add_flops = total_alu_count * clock_rate * sp_add_ipc
                  = 2048 * 1.05 GHz * 4
                  = 8.6016 TFLOPS


which is exactly double the advertised performance.  What am I missing here? If the single precision add IPC is reduced to 2, the numbers are spot on; however, that does not agree with the figures given in the document identified above.  Also, is there a place where I can find detailed hardware specifications for my specific card?
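
For anyone who wants to check the arithmetic, here is a minimal sketch of the calculation in Python. The function and variable names are my own (hypothetical, not taken from the guide), and it simply multiplies the ALU count, the clock rate, and the IPC values quoted above. It makes no claim about how the IPC figures in Table 5.3 are meant to be interpreted, which is the open question.

```python
def peak_tflops(alu_count, clock_ghz, ipc):
    """Peak throughput in TFLOPS: ALU count x clock (GHz) x instructions per cycle."""
    # ALUs * GHz * IPC gives GFLOPS; divide by 1000 for TFLOPS.
    return alu_count * clock_ghz * ipc / 1000.0


# Figures used in the post above for the HD 7970 GHz Edition.
ALU_COUNT = 2048
CLOCK_GHZ = 1.05

dp = peak_tflops(ALU_COUNT, CLOCK_GHZ, 0.5)     # double precision add IPC as I read the table
sp = peak_tflops(ALU_COUNT, CLOCK_GHZ, 4)       # single precision add IPC as I read the table
sp_ipc2 = peak_tflops(ALU_COUNT, CLOCK_GHZ, 2)  # assumption: IPC of 2 instead of 4

print(f"DP add:         {dp:.4f} TFLOPS")
print(f"SP add:         {sp:.4f} TFLOPS")
print(f"SP add (IPC 2): {sp_ipc2:.4f} TFLOPS")
```

With an IPC of 2 the single precision result is exactly half of the IPC 4 result, which is the factor of two I am trying to account for.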