Looking for OpenCL-centric architecture block diagram(s) of the Radeon 4870
I am learning OpenCL on a Phenom II + Radeon 4870 platform and have been following this presentation series:
http://www.macresearch.org/opencl_episode4
In this episode, the presenter does a really good job of breaking down the processing elements and their hierarchy for his Nvidia GPU (see slides 7-10 of the PDF; if you have time, the corresponding part of the video is excellent).
Is there any such information available for my GPU? I've looked at the ATI site, but all the information on my card seems to be in terms of graphical primitives (shaders, etc.).
I am especially curious about the equivalent of the "warp" structure (the groups of threads that all execute exactly the same code in lock-step), and how many separate units I have that can each be running a different lock-step group at the same time.
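To make the question concrete, here is a rough Python sketch of the kind of breakdown I am after. The function name `lockstep_breakdown` and the numbers passed to it are placeholders I made up for illustration, not real Radeon 4870 figures (those are exactly what I am trying to find out). It also assumes a very simplified model in which each independent unit runs one lock-step group at a time, which real hardware probably does not follow exactly.

```python
import math

def lockstep_breakdown(global_size, lockstep_width, independent_units):
    """Split a 1-D range of work-items into lock-step groups and rounds.

    lockstep_width    -- work-items that execute the same code in lock-step
                         (hypothetical value, to be replaced by the real one)
    independent_units -- units that can each run a different lock-step group
                         at the same time (hypothetical value)
    """
    # Number of lock-step groups needed to cover every work-item.
    groups = math.ceil(global_size / lockstep_width)
    # Simplified model: each unit handles one group per round.
    rounds = math.ceil(groups / independent_units)
    return groups, rounds

# Placeholder numbers only, NOT the actual 4870 values.
groups, rounds = lockstep_breakdown(global_size=1024,
                                    lockstep_width=32,
                                    independent_units=4)
print(groups, rounds)
```

With the real lock-step width and unit count for my card plugged in, this is the arithmetic I would like to be able to do when choosing work-group sizes.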