1 Reply Latest reply on May 1, 2011 9:41 PM by himanshu.gautam

    Aggregated Throughput at 6900 of mul + mulhi

    diepchess
      A number of questions on 6900 GPU streamcore capabilities

      Good Afternoon,

       

      In the AMD Accelerated Parallel Processing OpenCL Programming Guide,

      page 119, section 4.13.1, table 4.14 shows under the integer instruction rates

      a total throughput of 1 mul per 5 PEs on Cypress.

      Now I'm interested in the same table for Cayman, and most specifically in the aggregated throughput of mulhi + mul per cycle on a streamcore.

       

      I ask because I received contradictory information on this. It was my understanding that it is possible to schedule 2 muls per cycle per streamcore on Cayman. Is that correct?

       

      If not, is the aggregated number of mul + mulhi then perhaps 2 for the 6900 series?

       

      This makes a huge difference for multiplication code: 32 bits of output per cycle per streamcore versus 64 bits of output per cycle per streamcore.
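
      To make the 32 versus 64 bits point concrete, here is a minimal sketch in plain C (not OpenCL kernel code) of why both instructions matter: a mul yields only the low 32 bits of a 32x32-bit product, a mulhi yields the high 32 bits, and only the pair gives the full 64-bit result. The helper names mul_lo32, mul_hi32 and full_product are hypothetical, chosen for illustration. The sketch says nothing about how many of these operations Cayman can actually issue per cycle, which is the open question here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: low 32 bits of an unsigned 32x32-bit product,
   i.e. what a plain 32-bit "mul" returns. */
static uint32_t mul_lo32(uint32_t a, uint32_t b) {
    return (uint32_t)((uint64_t)a * b);
}

/* Hypothetical helper: high 32 bits of the same product,
   i.e. what a "mulhi" returns. */
static uint32_t mul_hi32(uint32_t a, uint32_t b) {
    return (uint32_t)(((uint64_t)a * b) >> 32);
}

/* Full 64-bit product assembled from one mulhi and one mul. */
static uint64_t full_product(uint32_t a, uint32_t b) {
    return ((uint64_t)mul_hi32(a, b) << 32) | mul_lo32(a, b);
}

/* Small self-check: the two halves together equal the native 64-bit product. */
static void check_example(void) {
    uint32_t a = 0x12345678u, b = 0x9ABCDEF0u;
    assert(full_product(a, b) == (uint64_t)a * b);
}
```

      So if only one of the two can be issued per cycle per streamcore, a full product costs two cycles (32 bits of output per cycle); if both can be issued together, it costs one (64 bits of output per cycle).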

       

      Regards,

      Vincent