I have the sequential function below. I want to turn it into a parallel OpenCL kernel, but I think it cannot be parallelized. Am I right? Correct me if I'm wrong. Thanks for the help.
...
float sum = 0;
for (i = 0; i < h; i++)
    for (j = 0; j < w; j++)
        sum += array[i*w + j];
return sum;
The code can be parallelized quite easily: summing every element of an array is a reduction. Refer to the Reduction samples from the SDK.
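To make the idea concrete, here is a rough sketch of the tree-reduction pattern, emulated on the CPU in plain C so it can be compiled and checked without an OpenCL runtime. It is not the SDK sample's code. The names `reduce_group`, `reduce_sum` and `GROUP_SIZE` are made up for illustration, and the group size of 8 is an arbitrary assumption. In a real kernel the loop over `lid` would be the work-items of one work-group, `scratch` would live in `__local` memory, and a `barrier(CLK_LOCAL_MEM_FENCE)` would separate the strides.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical work-group size; this pattern needs a power of two. */
#define GROUP_SIZE 8

/* Emulates one work-group: every "work-item" loads one element into
   scratch, then pairs are summed in log2(GROUP_SIZE) steps. */
static float reduce_group(const float *array, size_t n, size_t group_start)
{
    float scratch[GROUP_SIZE];  /* stands in for __local memory */

    /* Load phase: work-items past the end of the array contribute 0. */
    for (size_t lid = 0; lid < GROUP_SIZE; lid++) {
        size_t gid = group_start + lid;
        scratch[lid] = (gid < n) ? array[gid] : 0.0f;
    }

    /* Tree phase: within one stride, each lid reads [stride, 2*stride)
       and writes [0, stride), so the iterations do not depend on each
       other and could all run at once on a device. */
    for (size_t stride = GROUP_SIZE / 2; stride > 0; stride /= 2) {
        for (size_t lid = 0; lid < stride; lid++)
            scratch[lid] += scratch[lid + stride];
    }
    return scratch[0];
}

/* Sums an h-by-w row-major array by treating it as h*w flat elements. */
float reduce_sum(const float *array, size_t h, size_t w)
{
    size_t n = h * w;
    assert(array != NULL || n == 0);

    float sum = 0.0f;
    /* Groups are independent of each other; only this final pass over
       the per-group partial sums is sequential. */
    for (size_t start = 0; start < n; start += GROUP_SIZE)
        sum += reduce_group(array, n, start);
    return sum;
}
```

Calling `reduce_sum(array, h, w)` replaces the double loop in the question. Since the elements are added in a different order than in the sequential loop, the floating-point result can differ slightly in the last digits for general inputs.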