cancel
Showing results for 
Search instead for 
Did you mean: 

Archives Discussions

bz7
Journeyman III

Problem with selection sort ( using C#.Net and OpenCL )

Hi everyone

I'm a somewhat noob in programming and completely new to OpenCL !

I wrote some OpenCL code inside a C# Project ( using some online guide ) . its a selection sort but the problem is it won't sort at all ! it returns the number without any order

I really don't know what I did wrong so help me out here

this is the OpenCL code :


__kernel void ParallelSelection(__global int * in1)


  {


  int i = get_global_id(0); // current thread


  int n = get_global_size(0); // input size


  int temp;


  for (int j = i + 1; j < n; j++)


  {


  if (in1 < in1)


  {


  temp = in1;


  in1 = in1;


  in1 = temp;


  }


  }


  }





and this is the C# code :

* vecSum  is the string containing OpenCL code


          //Initializes OpenCL Platforms and Devices and sets everything up


            OpenCLTemplate.CLCalc.InitCL();



            //Compiles the source codes. The source is a string array because the user may want


            //to split the source into many strings.


            OpenCLTemplate.CLCalc.Program.Compile(new string[] { vecSum });



            //Gets host access to the OpenCL floatVectorSum kernel


            OpenCLTemplate.CLCalc.Program.Kernel VectorSum = new OpenCLTemplate.CLCalc.Program.Kernel("ParallelSelection");



            //////////////////////



           //numbers


            int n = 9;



            //Create vectors with 2000 numbers


            int[] v1 = new int[9]{3,14,4,6,-45,4,23,5,1};


       


            ////////////////////////



            //Creates vectors v1 in the device memory


            OpenCLTemplate.CLCalc.Program.Variable varV1 = new OpenCLTemplate.CLCalc.Program.Variable(v1);


       


            ////////////////////////



            //Arguments of VectorSum kernel


            OpenCLTemplate.CLCalc.Program.Variable[] args = new OpenCLTemplate.CLCalc.Program.Variable[] { varV1 };



            //How many workers will there be? We need "n", one for each element


            int[] workers = new int[1] { n };



            //Execute the kernel



            VectorSum.Execute(args, workers);



            ////////////////////////////



            //Read device memory varV1 to host memory v1



            varV1.ReadFromDeviceTo(v1);


            for (int i = 0; i < n; i++)


            {


                listBox1.Items.Add(v1.ToString());



            }





__________________________________

EDIT : Thanks everyone I finally getting to know the SIMD model , for other people like myself interested in a simple sorting algorithm ( parallel of course ) here's the parallel selection sort OpenCL kernel :




__kernel void ParallelSelection(__global int * in1,__global int * out)


                {


                  int i = get_global_id(0); // current thread


                  int n = get_global_size(0); // input size


                    int ikey = in1;


                    int pos = 0;


                  for (int j = 0 ; j < n; j++)


                    {


                                    int jkey = in1;


                                    bool smaller = (jkey < ikey);


                                    pos+= (smaller)?1:0;


                                   


                    }


                            out[pos]=ikey;



                    }


Message was edited by: Behrooz Moradi

0 Likes
1 Solution
void_ptr
Adept I

bz7-

At least one problem is that there is nothing to synchronize the work items. One work item could be writing a value in1 while another one is reading it. Understand that in the SIMD model all the work items are executing in parallel. There's no way of knowing which work items will finish first and so that code is inherently unpredictable.  I think it's also probably wrong to make work id i  (AKA thread i ) responsible for setting all the values from i up to n. You've got multiple work items all trying to change the same entries. As long as you have a work item for every single item in the array to be sorted, why not just let a work item set the value of a single item? Searching "selection sort kernel opencl" turned up a tutorial with some kernel source that probably work. I'd try one of those.

View solution in original post

0 Likes
3 Replies
void_ptr
Adept I

bz7-

At least one problem is that there is nothing to synchronize the work items. One work item could be writing a value in1 while another one is reading it. Understand that in the SIMD model all the work items are executing in parallel. There's no way of knowing which work items will finish first and so that code is inherently unpredictable.  I think it's also probably wrong to make work id i  (AKA thread i ) responsible for setting all the values from i up to n. You've got multiple work items all trying to change the same entries. As long as you have a work item for every single item in the array to be sorted, why not just let a work item set the value of a single item? Searching "selection sort kernel opencl" turned up a tutorial with some kernel source that probably work. I'd try one of those.

0 Likes
himanshu_gautam
Grandmaster

yes, the above code is the serial implementation of selection sort. Try checking out parallel implementation. APP SDK Samples contain a RadixSort Example FYI. Also I am not sure if selection sort can be parallelized.

himanshu_gautam
Grandmaster

Your OpenCL code is ridden with Race Conditions -- which have resulted from incomplete understanding of how to decompose your problem parallel.

Since all workitems proceed from I+1 to N, replacing values in that array here and there --- The workitems are interfering with each other's values and the net result is just confusion.

Technical term for this is "race condition"

As Himanshu mentioned before, Radix Sort is the most parallel friendly one and works well on GPU.

- Bruhaspati