How do vector types increase throughput on a GPU?

Discussion created by krrishnarraj on Jul 17, 2011
Latest reply on Jul 17, 2011 by LeeHowes

I am new to OpenCL and am used to CUDA and NVIDIA GPUs.

(Excuse me for using CUDA terms here.)

My understanding is that a warp (32 threads) is issued to the 8 SPs of an SM, with 4 threads going to each SP.

I was going through the online examples provided by AMD.

They say that using vector types in OpenCL increases throughput on the GPU. Does that mean 1 thread now goes to 1 SP instead of 4 threads?

Can someone explain how this improves performance at the hardware level?
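
To make the question concrete, here is a rough plain-C sketch of what I understand "using vectors" to mean at the source level. The `vec4` struct is a hypothetical stand-in for OpenCL's `float4`, and the loop index stands in for `get_global_id(0)`. It is not taken from the AMD samples, and it only shows the change in work per thread, not how the hardware schedules it.

```c
/* Hypothetical stand-in for OpenCL's float4 (plain C has no built-in vector type). */
typedef struct { float x, y, z, w; } vec4;

/* Scalar style: each "work-item" i adds one float.
   The loop index plays the role of get_global_id(0). */
static void add_scalar(const float *a, const float *b, float *out, int n) {
    for (int i = 0; i < n; ++i)
        out[i] = a[i] + b[i];
}

/* Vector style: each "work-item" i adds four floats,
   so n / 4 work-items cover the same amount of data. */
static void add_vec4(const vec4 *a, const vec4 *b, vec4 *out, int n4) {
    for (int i = 0; i < n4; ++i) {
        out[i].x = a[i].x + b[i].x;
        out[i].y = a[i].y + b[i].y;
        out[i].z = a[i].z + b[i].z;
        out[i].w = a[i].w + b[i].w;
    }
}
```

Both versions compute the same result. The vector version just packs four independent additions into each thread, and my question is what the hardware does with that.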