1 Reply Latest reply on Mar 3, 2013 11:44 PM by himanshu.gautam

    Simplifying and Optimizing a Recursive Sort Algorithm for OpenCL?


      I have a recursive O(n log(n)) (I think) sorting algorithm that should work decently if parallelized. Here is the code using C++ code.

      // Groups the array into two parts: low and high.
      for(int i = 0; i < LENGTH/2; i++)
          auto &objectA = vec[i];
          auto &objectB = vec[(LENGTH+1)/2+i];
          if(objectB < objectA)
              std::swap(objectA, objectB);
      // Exposes the highest half of low and lowest half of high.
      // The next two statements can be done concurrently
      std::partial_sort (vec.begin() + LENGTH/2, vec.end() - LENGTH/4, vec.end());
      std::partial_sort (vec.rbegin() + LENGTH/2, vec.rend() - LENGTH/4, vec.rend(),
          [](int a, int b){ return b<a; });
      /* Simplified from the these two sort calls.
       * Sorts the exposed halves. Makes low and high only have the lowest and highest entries, respectively.
       *  std::sort (vec.begin() + LENGTH/4, vec.end() - LENGTH/4);
       * Sorts the low half of the array.
       *  std::sort (vec.begin(), vec.begin() + LENGTH/2);
      std::sort (vec.begin(), vec.end() - LENGTH/4);
      // Sort the high half of the array.
      std::sort (vec.end() - LENGTH/2, vec.end());


      There are four recursive calls so I'm getting a headache by looking at it. I don't have  experience simplifying very complex algorithms like. After dealing with the recursion, my main concern is what to do when the arrays get small. Sorting those small arrays could take a serious chunk out of performance.


      Any ideas for the algorithm? Any ideas on how I should make this more comprehensible for myself?


      EDIT: When replacing the std::sorts with my own implementation, I realized I used a sum instead of a product to calculate the big O.


      Message was edited by: Kyle Siefring