Your GPU has 10 SIMD units, which correspond to 'compute units' in OpenCL terms. Each SIMD has 16 TPs (thread processors), which are the 'processing elements' in OpenCL.
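To put numbers on that hierarchy, here is a small sketch that just multiplies out the figures from this post (the 5 ALUs per TP are the VLIW units of each thread processor). The variable names are my own, not anything from the OpenCL API.

```python
# Figures as stated in this post; the names are illustrative only.
SIMD_UNITS = 10      # OpenCL "compute units"
TPS_PER_SIMD = 16    # thread processors = OpenCL "processing elements"
ALUS_PER_TP = 5      # VLIW execution units inside each thread processor

processing_elements = SIMD_UNITS * TPS_PER_SIMD
total_alus = processing_elements * ALUS_PER_TP

print("processing elements:", processing_elements)
print("ALUs in total:", total_alus)
```

So this card has 160 processing elements and 800 ALUs in total.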
A TP has 5 execution units (ALUs), which can execute 5 different instructions in 1 cycle, but only for a single thread. The shader compiler is responsible for finding 5 independent instructions in the kernel/shader and packing them into a Very Long Instruction Word (VLIW); all the thread processors then execute this VLIW instruction group in 1 cycle. When the compiler cannot find 5 independent instructions, the remaining ALUs sit idle for that cycle.
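To get a feel for why independent instructions matter, here is a toy packer. It is NOT the real shader compiler's algorithm, only a greedy sketch under my own assumptions: instructions are visited in program order, each one lists the earlier instructions it depends on, and a bundle holds at most 5 of them. The function name `pack_vliw` is hypothetical.

```python
def pack_vliw(deps, width=5):
    """Greedy toy packer.

    deps[i] is the set of earlier instruction indices that instruction i
    depends on. Returns a list of bundles (lists of instruction indices).
    """
    bundle_of = {}   # instruction index -> index of the bundle it landed in
    bundles = []
    for i, d in enumerate(deps):
        # An instruction must go in a later bundle than all of its inputs.
        b = max((bundle_of[j] + 1 for j in d), default=0)
        # Skip bundles that already hold `width` instructions.
        while b < len(bundles) and len(bundles[b]) >= width:
            b += 1
        # Create new (possibly empty) bundles up to index b if needed.
        while b >= len(bundles):
            bundles.append([])
        bundles[b].append(i)
        bundle_of[i] = b
    return bundles


# Five independent instructions fit into one VLIW bundle.
independent = pack_vliw([set(), set(), set(), set(), set()])

# A chain where each instruction needs the previous result cannot be packed.
chain = pack_vliw([set(), {0}, {1}, {2}, {3}])

print(independent)
print(chain)
```

In this toy model the dependent chain needs 5 bundles with only 1 of the 5 ALUs busy in each, while the independent code finishes in a single bundle.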
A 'wavefront' is the equivalent of a 'warp'. It consists of 64 threads, which are executed in order over 4 cycles on the 16 TPs: threads 0-15 in cycle 1, 16-31 in cycle 2, 32-47 in cycle 3 and 48-63 in cycle 4. (There are actually 2 wavefronts that are executed alternately: 16 threads of wavefront 1 are executed in the 1st cycle, then 16 threads of wavefront 2 in the 2nd cycle, and so on.)
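The mapping from thread to cycle can be written down as a short sketch. The helper names are made up for illustration, and the interleaved schedule is only my reading of the "and so on" above (the two wavefronts simply keep alternating in groups of 16 threads).

```python
WAVEFRONT_SIZE = 64
TPS_PER_SIMD = 16


def cycle_and_lane(thread_id):
    """For one wavefront: return (cycle 1..4, TP index 0..15) of a thread."""
    return thread_id // TPS_PER_SIMD + 1, thread_id % TPS_PER_SIMD


def interleaved_schedule():
    """Assumed alternation of two wavefronts on one SIMD.

    Returns one (wavefront, first_thread, last_thread) tuple per cycle.
    """
    schedule = []
    for start in range(0, WAVEFRONT_SIZE, TPS_PER_SIMD):
        for wavefront in (1, 2):
            schedule.append((wavefront, start, start + TPS_PER_SIMD - 1))
    return schedule


print(cycle_and_lane(0))    # thread 0 runs in cycle 1 on TP 0
print(cycle_and_lane(37))   # thread 37 runs in cycle 3 on TP 5
for step, entry in enumerate(interleaved_schedule(), start=1):
    print(step, entry)
```

With the alternation, the two wavefronts together occupy the SIMD for 8 cycles.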
I hope that answers your questions. You can read the Stream User Guide for more information on ATI GPUs: developer.amd.com/gpu_assets/Stream_Computing_User_Guide.pdf