Matrix add problem

Discussion created by rolandman99 on Dec 19, 2010
I tried matrix addition (h = 1024, w = 1024) using a 2-dimensional NDRange. The global work size is {h/4, w/4}.


The kernel code:

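__kernel void add(__global float4 *c, __global float4 *a, __global float4 *b, int h, int w)
{
    int i = get_global_id(0);   /* index in NDRange dimension 0 */
    int j = get_global_id(1);   /* index in NDRange dimension 1 */

    w = w/4;                    /* row width measured in float4 elements */

    c[i*w+j] = a[i*w+j] + b[i*w+j];
}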

The problem is that the result is not correct: not all the elements in the matrix are added. Can someone point out what's wrong with the kernel code?
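
One way to narrow this down is to replay the kernel's index arithmetic on the host and count how many float elements a given global work size would actually touch. The plain C sketch below is only an assumption-laden check, not a confirmed diagnosis: the function name floats_covered and its parameters are hypothetical, it assumes the matrix is stored row-major as h * (w/4) float4 elements, and it assumes each work-item writes exactly one float4 at index i*(w/4)+j, as in the kernel above.

```c
#include <stdlib.h>

/* Hypothetical helper: counts how many float elements of an h x w matrix
 * are written when every work-item (i, j) of a gsize0 x gsize1 NDRange
 * stores one float4 at index i*(w/4)+j, mirroring the kernel's indexing.
 * Returns 0 if the scratch buffer cannot be allocated. */
size_t floats_covered(size_t h, size_t w, size_t gsize0, size_t gsize1)
{
    size_t w4 = w / 4;                 /* row width in float4 elements */
    size_t slots = h * w4;             /* total float4 elements in the matrix */
    unsigned char *hit = calloc(slots, 1);
    size_t covered = 0;

    if (hit == NULL)
        return 0;

    for (size_t i = 0; i < gsize0; ++i) {
        for (size_t j = 0; j < gsize1; ++j) {
            size_t idx = i * w4 + j;   /* same index as in the kernel */
            if (idx < slots && !hit[idx]) {
                hit[idx] = 1;
                covered += 4;          /* one float4 holds 4 floats */
            }
        }
    }

    free(hit);
    return covered;
}
```

Calling floats_covered(1024, 1024, 1024/4, 1024/4) and comparing the result against h*w = 1048576 shows how much of the matrix the {h/4, w/4} range reaches under these assumptions. Trying other sizes, for example {h, w/4}, gives a point of comparison.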