2 Replies Latest reply on Jan 3, 2011 9:33 AM by himanshu.gautam

    Formulas for run time

    ryta1203

      Is there a reason the formulas from the old Stream Programming Guide are no longer in the new OpenCL programming guide?

      I'm assuming that LDS has changed quite a bit since then? No? Yes?

      For instance, DCT has 9 LDS accesses. Is it still calculated the same way?

      Can someone point me to the 5870 docs?

        • Formulas for run time
          ryta1203

          None of the 68 people who have viewed this thread know? Bummer.

          BTW, I found the 5870 docs that talk about LDS. It appears to be the same as on the 3870, at least for calculating latency as described in the Stream Programming Guide (not the OpenCL Programming Guide)?

            • Formulas for run time
              himanshu.gautam

              In OpenCL, the LDS had to be emulated using global memory on 4xxx devices.

              In 5xxx devices, the LDS was changed to conform to the properties set by the OpenCL spec, so using the LDS gives a performance boost.

              One of the improvements in Cayman is more efficiently moving data into the LDS from memory. In Cypress, moving data into the LDS takes a memory instruction and an ALU instruction. Data must first be loaded from memory into the register files and then subsequently moved from the register files into the LDS. Cayman can directly fetch from memory into the LDS, eliminating the ALU instruction altogether.
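To put rough numbers on the Cypress vs. Cayman difference described above, here is a minimal sketch that only counts the instructions needed to fill the LDS. It is not one of the formulas from the Stream Programming Guide. The function name `lds_fill_instructions`, the `direct_fetch` flag, and the count of 16 loads are all hypothetical and chosen for illustration. It assumes one memory instruction per value loaded, and ignores latency, clause scheduling, and bank conflicts.

```python
def lds_fill_instructions(num_loads: int, direct_fetch: bool) -> dict:
    """Count instructions issued to move num_loads values from memory into the LDS.

    direct_fetch=False models the Cypress path described above:
        memory -> register file (memory instruction), then
        register file -> LDS (ALU instruction).
    direct_fetch=True models the Cayman path: memory -> LDS directly,
    so no ALU instruction is needed for the move.
    """
    if num_loads < 0:
        raise ValueError("num_loads must be non-negative")
    memory = num_loads                       # one fetch per value (assumption)
    alu = 0 if direct_fetch else num_loads   # extra register-to-LDS move on Cypress
    return {"memory": memory, "alu": alu, "total": memory + alu}


# Illustrative count only; 16 is an arbitrary number of loads.
cypress = lds_fill_instructions(16, direct_fetch=False)
cayman = lds_fill_instructions(16, direct_fetch=True)
print("Cypress:", cypress)
print("Cayman: ", cayman)
```

Under these assumptions the Cayman path issues half as many instructions for the fill itself. How much that shortens the actual run time depends on how well the ALU moves were already hidden behind memory latency, which this sketch does not model.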