I have not yet had the time to start programming my 4870x2, but I've been reading this forum with great interest. So please excuse my lack of understanding.
The app is a fluid dynamics code that solves the Navier-Stokes equations with a finite element approach. These are partial differential equations, and the derivatives are computed from cells (elements) that are adjacent in 3-space.
I have been reading that ATI stream processors can only access their own storage, so I don't understand how finite element techniques are mapped to them. My intuition wants multiple stream processors to share storage. For example, a cube in 3-space has six sides, so each processor would need access to the data sets of six others, one per face.
So how is this done? I can imagine three ways:
1. Each stream processor has access to the data of some small number of other processors, i.e., one processor can see its own data and that of its 6 neighbours (best)
2. Each replacement kernel can remap stream processors to different storage without actually moving the data around, while somehow remembering some data (a few registers?) about the last element it worked on (OK)
3. The data has to come out so the CPU can reshuffle it, then reload the stream card with new data and a new kernel (not a good fit for this!)
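To make the access pattern concrete, here is a minimal CPU-side sketch in Python of what I mean. It is not stream code: the names `index` and `gather_step` are made up for illustration, and a plain neighbour average stands in for the real finite element update. Each interior cell reads its six face neighbours from a read-only input array and writes only its own slot in a separate output array:

```python
def index(i, j, k, ny, nz):
    """Flatten (i, j, k) into a position in a 1D array (row-major layout)."""
    return (i * ny + j) * nz + k


def gather_step(src, nx, ny, nz):
    """One pass over the grid: read neighbours from src, write only to dst.

    Hypothetical stand-in for a real update: each interior cell becomes
    the average of its six face neighbours. Boundary cells are copied.
    """
    dst = list(src)  # separate output array; src is never modified
    for i in range(1, nx - 1):
        for j in range(1, ny - 1):
            for k in range(1, nz - 1):
                total = (
                    src[index(i - 1, j, k, ny, nz)]
                    + src[index(i + 1, j, k, ny, nz)]
                    + src[index(i, j - 1, k, ny, nz)]
                    + src[index(i, j + 1, k, ny, nz)]
                    + src[index(i, j, k - 1, ny, nz)]
                    + src[index(i, j, k + 1, ny, nz)]
                )
                dst[index(i, j, k, ny, nz)] = total / 6.0
    return dst


# Usage: a 3x3x3 grid of ones with a zero in the centre cell.
nx = ny = nz = 3
grid = [1.0] * (nx * ny * nz)
grid[index(1, 1, 1, ny, nz)] = 0.0
grid = gather_step(grid, nx, ny, nz)  # the output becomes the next input
```

If a stream kernel could read arbitrary input locations like this while writing only its own output, then swapping the two arrays between passes would keep the data on the card, which is roughly what I am hoping for in options 1 and 2.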
Your explanation is appreciated.