alexaverbuch
Journeyman III

rv770 White Paper

Official specifications... or at least a subset of them?

Hi and sorry to spam this post to a few different forums,

Does anyone know where I can find the specifications for the rv770, including its memory regions, their sizes, and their access latencies and bandwidth?

I can get a lot of information by crawling the internet and reading every review of the rv770, but I assume there is an official source for the information I want. AMD (I assume) wants to empower developers to use their products?

Any suggestions would be greatly appreciated!

Regards,

Alex Averbuch

Ceq
Journeyman III

I think you may find some interesting documentation about the RV770 in the Stream SDK main page:

http://developer.amd.com/gpu/ATIStreamSDK/Pages/default.aspx

In the download section there are several PDFs; have a look at "AMD R700-Family Instruction Set Architecture".

alexaverbuch
Journeyman III

Thanks a lot! Perfect.

rahulgarg
Adept II

Note that the R700 ISA document does not include any information about cache sizes or cache latencies. The cache sizes are not public AFAIK, but speculation is 8 kB of L1 per SIMD (80 kB of L1 in total) and 64*4 = 256 kB of L2 in total for RV770.
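To make the arithmetic behind those speculated figures explicit, here is a small sketch. The numbers are the unofficial ones quoted above, and reading "64*4" as four L2 blocks of 64 kB each is an assumption on my part, not something the ISA document confirms.

```python
# Speculated RV770 cache figures from this thread (not official numbers).
L1_PER_SIMD_KB = 8    # speculated L1 size per SIMD
L1_TOTAL_KB = 80      # speculated total L1 size
L2_BLOCK_KB = 64      # assumption: "64*4" means 4 blocks of 64 kB
L2_BLOCKS = 4

# Number of SIMDs implied by the per-SIMD and total L1 figures.
simd_count = L1_TOTAL_KB // L1_PER_SIMD_KB

# Total L2 implied by the block size and block count.
l2_total_kb = L2_BLOCK_KB * L2_BLOCKS

print(f"Implied SIMD count: {simd_count}")
print(f"Implied total L2:   {l2_total_kb} kB")
```

The two totals are at least consistent with each other: 80 kB at 8 kB per SIMD implies 10 SIMDs, and four 64 kB blocks give 256 kB.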

MicahVillmow
Staff

Rahul, you can find more information about the caches and how they work in slides that were recently posted on how we optimized ACML-Sgemm for RV670 hardware. RV770 has a very similar cache structure, except that instead of having a 4-way L1 cache, each SIMD gets its own L1 cache.

http://developer.amd.com/gpu_a...on%20Illustration.ppt

More information can be found in documents here:
http://developer.amd.com/gpu/A...ages/Publications.aspx
alexaverbuch
Journeyman III

Thanks everyone,

Regarding on-chip memory, I have been trying to figure out which memory is relevant to GPGPU computation and which is purely (mostly) beneficial to traditional graphics workloads.

Local Shared Memory (16 kB per SIMD): This seems to be "general purpose"/"scratch" memory that IS useful for GPGPU and has no coherency.

Global Shared Memory (16 kB): This seems to be "general purpose"/"scratch" memory that IS useful for GPGPU and has no coherency.

L1 (8 kB per SIMD?) (coherency?): This is a texture cache and is not really suited to GPGPU.

L2 (size?) (read-only, no coherency?): This is connected to the memory controller from what I can see, so it is (implicitly) used when accessing RAM. Is it also used when accessing Local & Global Shared Memory?

Texture Cache (size?) (coherency?): Does this exist? Or are L1 & L2 both "Texture Cache"?

If anyone could agree/disagree/discuss my comments above it would be greatly appreciated.

Regards,

Alex Averbuch

MicahVillmow
Staff

Alexaverbuch,
Your choice of algorithm will determine which memory is important. For example, simple_matmult does not use LDS or GDS; it relies on the texture cache, and it outperforms many, if not all, matrix multiplication algorithms on the RV770 that attempt to use LDS. NLM_Denoise also outperforms the equivalent algorithm that uses LDS.

So it isn't necessarily GPGPU versus graphics in general that determines what memory you use; your problem domain and algorithmic choice should drive the decision.
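As a rough illustration of the kind of sizing question that feeds into that decision, the sketch below estimates how large a square tile of floats would fit in the 16 kB of LDS per SIMD mentioned earlier in this thread (a figure not confirmed here). The tiling scheme (one tile of A and one tile of B resident at once), the 4-byte element size, and the function name `max_square_tile` are all assumptions for illustration; this is not a description of how simple_matmult or any LDS-based kernel actually works.

```python
import math


def max_square_tile(lds_bytes: int, tiles_resident: int = 2, elem_bytes: int = 4) -> int:
    """Return the largest T with tiles_resident * T * T * elem_bytes <= lds_bytes."""
    # Elements available to each resident tile.
    elems_per_tile = lds_bytes // (tiles_resident * elem_bytes)
    # Largest integer side length whose square fits in that budget.
    return math.isqrt(elems_per_tile)


LDS_BYTES = 16 * 1024  # 16 kB per SIMD, as quoted in the thread (unconfirmed)

tile = max_square_tile(LDS_BYTES)
used = 2 * tile * tile * 4
print(f"Largest square float tile: {tile} x {tile} ({used} of {LDS_BYTES} bytes used)")
```

If the tile your algorithm needs is much larger than this, an LDS-based approach may buy little, and a cache-friendly access pattern could be the better fit.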
alexaverbuch
Journeyman III

Hi Micah,

OK, thanks. But in general, is it fair to say that the texture cache is an artifact of graphics workloads and was not intended to be used this way?

Also, is it possible to confirm/deny my other "?" in the above post please?

MicahVillmow
Staff

The slides should give you all of that information. If not, please let me know.

Micah
alexaverbuch
Journeyman III

It's not clear to me. Low-level development is not my specialty. Also, I am viewing these slides in OpenOffice and many of the images seem to be poorly formatted... but mostly, I'm still inexperienced with all of this 🙂

If you can clarify the questions above, please do.

Alex
