1 Reply Latest reply on Feb 19, 2012 7:14 PM by LeeHowes

    About the running-time of Bitonic Sort in SDK samples


      Hi everyone,


        Recently, I started to analysing the execution time of different algorithms on GPGPU. But I found the results of sample "BitonicSort" in SDK are much larger than the results I saw from some papers and documents. For instance, the kernel time of sorting a array with lenth 8388608 is 27s on GPU, and 46s on CPU(speedup is just approximately 2), but only 8s on STL. I was so confused that paralell sorting on GPU is such slower than STL. Could anyone tell me why?  Bty, I also don't understand why the complexity of this sample, which is mentioned in the sample's doc, is O( N * log2(N) * log2(N) ), not ( log2(N) * log2(N) ). And I'll be very happy if someone can provide me a better design of Bitonic Sort.

         My computer is a laptop with Intel Core i7 and ATI Mobility Radeon HD 5870.