    OpenCL on CPU


      When we run an OpenCL program on the CPU, do the kernels run in parallel?

      If yes, then what is the special advantage of using the GPU?

          With the most powerful Intel Core i7 965 you can get around 70 GFLOPS and about 50 GB/s of L1 cache bandwidth. Bandwidth to RAM is about 13 GB/s.

          The Radeon 5870 has 2720 GFLOPS and 150 GB/s from VRAM.

          So with a suitable task (one with plenty of data parallelism) you can get a significant speedup on the GPU.
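
          To put rough numbers on "significant", here is a small sketch that only does arithmetic on the peak figures quoted above. The `attainable_gflops` helper is my own illustration, not part of OpenCL: it assumes a simple roofline-style bound (throughput is capped by either peak compute or memory bandwidth times the kernel's flops per byte) and ignores caches and host-to-device transfers, so treat the results as upper bounds rather than measurements.

```python
# Peak figures quoted in the answer above (theoretical, not measured).
CPU_GFLOPS = 70.0     # Core i7 965 peak compute
CPU_RAM_GBS = 13.0    # Core i7 965 bandwidth to RAM
GPU_GFLOPS = 2720.0   # Radeon 5870 peak compute
GPU_VRAM_GBS = 150.0  # Radeon 5870 bandwidth from VRAM


def attainable_gflops(peak_gflops, bandwidth_gbs, flops_per_byte):
    """Roofline-style upper bound (an assumption for illustration):
    a kernel is limited by peak compute or by how fast memory feeds it."""
    return min(peak_gflops, bandwidth_gbs * flops_per_byte)


def estimated_speedup(flops_per_byte):
    """Ratio of the GPU bound to the CPU bound for a given arithmetic intensity."""
    gpu = attainable_gflops(GPU_GFLOPS, GPU_VRAM_GBS, flops_per_byte)
    cpu = attainable_gflops(CPU_GFLOPS, CPU_RAM_GBS, flops_per_byte)
    return gpu / cpu


# Raw ratios of the quoted peak numbers.
compute_ratio = GPU_GFLOPS / CPU_GFLOPS
bandwidth_ratio = GPU_VRAM_GBS / CPU_RAM_GBS

print(f"peak compute ratio:   {compute_ratio:.1f}x")    # 38.9x
print(f"peak bandwidth ratio: {bandwidth_ratio:.1f}x")  # 11.5x

# Memory-bound kernel (1 flop per byte) vs compute-bound kernel (100 flops per byte).
print(f"speedup bound at 1 flop/byte:    {estimated_speedup(1.0):.1f}x")    # 11.5x
print(f"speedup bound at 100 flops/byte: {estimated_speedup(100.0):.1f}x")  # 38.9x
```

          Under these assumptions a memory-bound kernel tops out near the bandwidth ratio, and only a compute-heavy kernel gets close to the full compute ratio.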