jcpalmer
Adept I

Texture Cache for Future GPU Versions

I have decided to use images/textures instead of global memory to hold my read-only, database-like tables (about 20 MB).  This is a commercial application, and I have no control over which graphics card the customer chooses.  I could do about 50 billion texel lookups in the course of 7600 (512 x 200) calls to one of my kernels, so I/O is very important to me.

Using the very wide memory bus effectively with global memory is not likely to be achievable across GPU vendors, because of how specific you might need to be about data layout, kernel design, and work-group sizing.  Penalties might be quite severe if things are not absolutely perfect.

Since samplers fetch 4 values at a time (RGBA), you are guaranteed to get at least 4X performance.  This brings me to the bonus part, the texture cache.  I have a couple of questions about AMD's texture caching for OpenCL.  Feel free to just describe how it works for OpenGL and add disclaimers.  I just want hints.

- What is the ballpark size of the texture cache for OpenStreams GPUs? 16 KB?

- When a cache miss is encountered at, let's say, (10, 0) in an RGBA float image of size (7000, 1), which addresses will be in the cache afterward?

Thank you!

Jeff Palmer

2 Replies
n0thing
Journeyman III

In future GPUs (read: the 5xxx series), the L1 texture cache is 8 KB per SIMD, whereas the L2 cache is 256 KB per memory controller (4 x 64-bit memory controllers).

Source : http://www.techpowerup.com/reviews/AMD/HD_5000_Leaks/3.html

See the presentation slide titled Stream Computing.
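
For a rough sense of scale, those cache sizes can be turned into texel counts. This is only a sketch: it takes the 8 KB and 256 KB figures from the slide at face value, assumes a 4-channel float image (16 bytes per texel), and says nothing about how the hardware actually organizes cache lines. The helper name `texels_that_fit` is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of whole texels of a given format that
 * fit in a cache of cache_bytes. Ignores line size and associativity. */
static size_t texels_that_fit(size_t cache_bytes,
                              size_t channels,
                              size_t bytes_per_channel)
{
    assert(channels > 0 && bytes_per_channel > 0);
    return cache_bytes / (channels * bytes_per_channel);
}
```

With RGBA float texels, `texels_that_fit(8 * 1024, 4, 4)` gives 512 texels for the L1 figure and `texels_that_fit(256 * 1024, 4, 4)` gives 16384 texels for one L2 slice, so a 20 MB table would be far larger than either.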


Thanks Ms. Headshots,

I saw that there is also an on-chip, 64 KB Global Data Share.  Future ATI GPUs should have little, if any, trouble with other vendors' methods for getting high bandwidth from global memory, while at the same time not imposing any of their own.

I am still going to use my sham, single-row (possibly wrapped) image technique, because it will be pretty easy to just switch the tables to global memory some day.  Designing an application around a specific vendor's OpenCL implementation is a liability.  The other reason to use images for now is how OpenCL might be implemented on pre-existing hardware.
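
To make the single-row (possibly wrapped) idea concrete, here is a minimal host-side sketch of the index arithmetic. It assumes the table is a flat array of floats packed four per RGBA texel, row by row, into an image of some chosen width. The names `TexelCoord`, `table_index_to_texel`, and `rows_needed` are hypothetical, not part of any OpenCL API.

```c
#include <assert.h>
#include <stddef.h>

/* Where one table element lands inside a wrapped RGBA image. */
typedef struct {
    size_t x;       /* texel column */
    size_t y;       /* texel row (always 0 if the table fits in one row) */
    size_t channel; /* 0 = R, 1 = G, 2 = B, 3 = A */
} TexelCoord;

/* Map a flat element index to a texel coordinate and channel. */
static TexelCoord table_index_to_texel(size_t index, size_t image_width)
{
    assert(image_width > 0);
    size_t texel = index / 4;       /* four floats per RGBA texel */
    TexelCoord c;
    c.x = texel % image_width;      /* wrap to the next row at the width */
    c.y = texel / image_width;
    c.channel = index % 4;
    return c;
}

/* Rows needed to hold element_count floats at the given image width. */
static size_t rows_needed(size_t element_count, size_t image_width)
{
    assert(image_width > 0);
    size_t texels = (element_count + 3) / 4;          /* round up */
    return (texels + image_width - 1) / image_width;  /* round up */
}
```

Because the only thing the kernel ever sees is the flat index, moving to global memory later would mostly mean dropping this mapping and indexing the buffer directly. For example, at a width of 7000 texels, 20 MB of floats (5,242,880 elements) would need `rows_needed(5242880, 7000)` = 188 rows.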
