kernel creation help

Discussion created by Chocrates on Jun 3, 2010
Latest reply on Jul 19, 2010 by psath

I'm trying to figure out how to port some sequential code to a kernel so I can test the speedup, but I'm having a lot of trouble.  None of the tutorials I've found help with this aspect; they only cover setting up the kernel to run and receive data.  So far I have this for multiplying polynomials together.


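#include <stdlib.h>

int *mult(int size, int *a, int *b)
{
    int *out, i, j;
    /* The product has 2 * size - 1 coefficients; calloc zeroes them before accumulation. */
    out = (int *)calloc(size * 2 - 1, sizeof(int));
    for(i = 0; i < size; i++){
        for(j = 0; j < size; j++){
            out[i + j] = (a[i] * b[j]) + out[i + j];
        }
    }
    return out;
}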


Naive Kernel:


#pragma OPENCL EXTENSION cl_khr_byte_addressable_store : enable

__kernel void poly_mul( __global int *a, __global int *b, __global int *c)
{
    size_t i = get_global_id(0);
    size_t j = get_global_id(1);
    c[i + j] = (a[i] * b[j]) + c[i + j];
}

This doesn't return the correct result. I think this is because of the c[i + j] on the end.  Does anyone have any advice, or a link to a good tutorial on this aspect of OpenCL?
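
If the c[i + j] update really is the culprit, a likely reason is that several (i, j) pairs share the same i + j, so different work-items read and write the same element of c at the same time. One possible restructuring, sketched below in plain C rather than as a tested kernel, is to give each output coefficient k to exactly one work-item and let it sum every a[i] * b[k - i] itself. The names poly_mul_coeff and poly_mul_gather are made up for illustration, and the loop over k only stands in for a 1-D launch of 2 * size - 1 work-items. In a real kernel k would presumably come from get_global_id(0), and size would have to be passed as an extra argument.

```c
#include <assert.h>
#include <stddef.h>

/* Work of one hypothetical work-item: output coefficient k of the product
 * of two polynomials that each have `size` coefficients. */
static int poly_mul_coeff(size_t k, size_t size, const int *a, const int *b)
{
    /* Only indices with 0 <= i < size and 0 <= k - i < size contribute. */
    size_t lo = (k >= size) ? k - size + 1 : 0;
    size_t hi = (k < size) ? k : size - 1;
    int sum = 0;
    for (size_t i = lo; i <= hi; i++) {
        sum += a[i] * b[k - i];
    }
    return sum;
}

/* Host-side stand-in for the launch: one iteration per output coefficient.
 * c must have room for 2 * size - 1 ints. */
void poly_mul_gather(size_t size, const int *a, const int *b, int *c)
{
    assert(size > 0);
    for (size_t k = 0; k < 2 * size - 1; k++) {
        /* Each c[k] is written exactly once, so no two iterations conflict. */
        c[k] = poly_mul_coeff(k, size, a, b);
    }
}
```

Because every c[k] has a single writer, c would not need to be zeroed beforehand and no atomics should be required. The trade-off is that work-items near the middle of the result do more multiplications than those at the ends.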