Just one question regarding the design of the Radeon Mobility HD 5470.
According to the Evergreen instruction set architecture document, a SIMD engine contains 16 thread processors with five stream cores each, totaling 80 stream cores per SIMD engine.
As the HD 5470 has 80 stream cores, that would translate into 1 SIMD engine.
But if I query the device capabilities using
clGetDeviceInfo(device_id, CL_DEVICE_MAX_COMPUTE_UNITS, sizeof(di), &di, NULL);
the result is 2 compute units, which would mean that each SIMD engine has 8 thread processors with 5 stream cores each.
Can anyone clarify how many compute units (SIMD engines) with how many thread processors the HD 5470 has?
The smaller GPUs have smaller SIMDs; some of the earlier parts with 40 ALUs had two 4-way SIMDs.
AFAIK your understanding of the 5470 configuration is correct: 2 SIMDs, each performing 5-ALU operations on 8 data elements in parallel.
On the HD6xxx generation the smallest part has 160 ALUs, configured as two 16-wide SIMDs.