Hi everybody, here is a new topic about OpenCL. I hope someone can answer in detail.
Now that the software setup on my laptop has been solved (http://devgurus.amd.com/message/1300783#1300783), I'd like some clarification about the hardware.
I have an ATI Mobility Radeon HD 5650 installed and I am consulting your guide (http://developer.amd.com/download/AMD_Accelerated_Parallel_Processing_OpenCL_Programming_Guide.pdf). The first chapter has a long discussion of the architecture but, honestly, it confuses me. Probably I am just not understanding it properly.
I'd like to know whether the processing elements are the same thing as the ALUs or something different. The guide also talks about the Evergreen/Northern Islands/Southern Islands families (desktop families, right?), but says nothing about Mobility (or Manhattan, I'm not sure), and honestly I don't know whether they have the same or similar hardware features.
To avoid writing a long post, I would just like to know:
1) Are there 4 vector units in all the families (Desktop and Mobility)?
2) Are there 16 processing elements (not ALUs?) per vector unit? (page 20 of the guide above)
3) Are there 4 or 5 ALUs per processing element, depending on the family (end of page 21)? And how many for the HD 5650?
4) What do you mean by "work item"? On page 22 it says:
"For devices in the Northern Islands and Southern Islands families, these ALUs are arranged in four (in the Evergreen family, there are five) processing elements with arrays of 16 ALUs. Each of these arrays executes a single instruction across each lane for each of a block of 16 work-items".
So does a work item correspond to an ALU, which earlier (page 20) corresponded to a processing element? What is the physical counterpart of a work item?
5) Is there a way to obtain a technical guide (a sort of manual) for the specific board I have? Here (Radeon HD 5000 Series - Wikipedia, the free encyclopedia) there is no information about processing elements, ALUs, and so on. Where can I find this information for my board? Moreover, is there a connection between TMUs/ROPs and processing elements/ALUs or something else?
I would like to know this so that I can manage the execution process in OpenCL clearly, in terms of work items, work groups, wavefronts and so on, with the aim of optimizing the design and reducing the computation time of memory and execution tasks.
1. Yes, there are 4 vector units in all the families.
2. Yes, each vector unit consists of 16 PEs.
4. A work item is nothing but a simple thread. A group of such work items makes up a wavefront. (Don't compare it with an ALU; it's different.)
5. These concepts are specific to OpenCL, not to any particular board; OpenCL is the language you use to program those boards. I have no idea about TMU/ROP.
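To make the relationship between work items, wavefronts and the 16 PEs of a vector unit concrete, here is a minimal sketch in plain C. It assumes a wavefront of 64 work items, which is the size usually quoted for GPUs of this class; the constant and the function names are illustrative, not part of any AMD or OpenCL API, so check the real value on your own device.

```c
#include <assert.h>

/* Assumed values; verify them for your own device. */
#define WAVEFRONT_SIZE      64u  /* work items per wavefront (assumption) */
#define PES_PER_VECTOR_UNIT 16u  /* processing elements per vector unit */

/* Number of wavefronts needed to cover one work group (rounds up). */
unsigned wavefronts_per_group(unsigned work_group_size)
{
    assert(work_group_size > 0);
    return (work_group_size + WAVEFRONT_SIZE - 1u) / WAVEFRONT_SIZE;
}

/* Lanes left without a work item in the last wavefront of a work group. */
unsigned idle_lanes(unsigned work_group_size)
{
    return wavefronts_per_group(work_group_size) * WAVEFRONT_SIZE
           - work_group_size;
}

/* Passes the 16 PEs need to run one instruction for a whole wavefront. */
unsigned passes_per_wavefront_instruction(void)
{
    return WAVEFRONT_SIZE / PES_PER_VECTOR_UNIT;
}
```

Under these assumptions, a work group of 256 work items maps to 4 full wavefronts, while a work group of 100 work items needs 2 wavefronts and leaves 28 lanes idle. That is why work-group sizes that are multiples of the wavefront size are generally preferred.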
If you read the programming guide carefully, it will give you more of the information you need. You can also browse the web for more material and presentations.
All the best
Thank you, Himanshu, for your reply.
I'll try to read the guide more thoroughly and search for something about the Mobility (Manhattan) GPU family, since I still don't understand whether the specs for the Evergreen/Northern Islands/Southern Islands families also apply to the Mobility one, or whether there are differences.
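For the open question about the HD 5650 itself, the arithmetic from the guide can be sketched as follows. This is only an illustration under stated assumptions: the commonly published figure of 400 stream processors (ALUs) for this GPU, 5 ALUs per processing element as in the Evergreen family, and 16 processing elements per compute unit. The function names are hypothetical helpers, not an AMD or OpenCL API.

```c
#include <assert.h>

/* Processing elements = total ALUs / ALUs per processing element. */
unsigned processing_elements(unsigned stream_processors, unsigned alus_per_pe)
{
    assert(alus_per_pe > 0 && stream_processors % alus_per_pe == 0);
    return stream_processors / alus_per_pe;
}

/* Compute units = processing elements / processing elements per compute unit. */
unsigned compute_units(unsigned stream_processors,
                       unsigned alus_per_pe,
                       unsigned pes_per_cu)
{
    unsigned pes = processing_elements(stream_processors, alus_per_pe);
    assert(pes_per_cu > 0 && pes % pes_per_cu == 0);
    return pes / pes_per_cu;
}
```

With the assumed figures, `processing_elements(400, 5)` gives 80 and `compute_units(400, 5, 16)` gives 5. Rather than relying on published specs, the compute-unit count can be checked on the actual device by calling `clGetDeviceInfo` with `CL_DEVICE_MAX_COMPUTE_UNITS`.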