

ryta1203
Journeyman III

Formulas for run time

Is there a reason the formulas from the old Stream Programming Guide are no longer in the new OpenCL programming guide?

I'm assuming that LDS has changed quite a bit since then? No? Yes?

For instance, DCT has 9 LDS accesses. Is it still calculated the same way?

Can someone point me to the 5870 docs?

2 Replies
ryta1203
Journeyman III

None of the 68 people who have viewed this thread know? Bummer.

BTW, I found the 5870 docs that cover the LDS. It appears to work the same way as on the 3870, at least for calculating latency as described in the Stream Programming Guide (not the OpenCL Programming Guide)?


In OpenCL, the LDS had to be emulated using global memory on 4xxx devices.

On 5xxx devices, the LDS was changed to conform to the properties required by the OpenCL spec, so using the LDS now gives a real performance boost.

One of the improvements in Cayman is more efficiently moving data into the LDS from memory. In Cypress, moving data into the LDS takes a memory instruction and an ALU instruction. Data must first be loaded from memory into the register files and then subsequently moved from the register files into the LDS. Cayman can directly fetch from memory into the LDS, eliminating the ALU instruction altogether.
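To make the difference concrete, here is a rough sketch of the instruction counts described above. It is only a toy model, not taken from any AMD document: the function name `lds_staging_instructions` and the `direct_fetch` flag are made up for illustration, and it assumes one memory instruction (plus, on Cypress, one ALU move) per value staged into the LDS, ignoring vector widths, clause packing, and latency.

```python
# Toy instruction-count model; it counts only the instructions named in the
# post above and assumes one instruction of each kind per staged value.
def lds_staging_instructions(num_loads: int, direct_fetch: bool) -> dict:
    """Count instructions issued to stage `num_loads` values into the LDS."""
    if num_loads < 0:
        raise ValueError("num_loads must be non-negative")
    memory = num_loads                      # one memory fetch per value either way
    alu = 0 if direct_fetch else num_loads  # register-to-LDS move only without direct fetch
    return {"memory": memory, "alu": alu, "total": memory + alu}


# Cypress-style path: memory -> registers -> LDS (hypothetical 64 values)
cypress = lds_staging_instructions(64, direct_fetch=False)
# Cayman-style path: memory -> LDS directly
cayman = lds_staging_instructions(64, direct_fetch=True)

print("Cypress:", cypress)
print("Cayman: ", cayman)
```

Under these assumptions the direct fetch removes every ALU instruction from the staging step, so a kernel that is ALU-bound while filling the LDS would benefit most. Actual gains depend on the real ISA and scheduler.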
