
Different Results using GPU and CPU

Discussion created by Sheeep on Jan 30, 2010
Latest reply on Feb 10, 2010 by Sheeep

I ran into a problem with a larger program: I don't get the same results when I switch the context from GPU to CPU.

So I tried a small kernel, but I get the same problem.

 

If I don't use the global ID, I get the correct result on the GPU. But on the CPU I get a result that is much larger than the correct one.

__kernel void test1(__global float *a, __global float *b, __global float *c){
    int gis=get_global_size(0);
    for(int j=0;j<gis;j++){
        for(int i=0;i<100;i++){
            c[j]+=a[j]+b[j];   
        }
    }
}

 

Then I tried this:

__kernel void test1(__global float *a, __global float *b, __global float *c){
    int gid=get_global_id(0);
    for(int i=0;i<100;i++){
            c[gid]+=a[gid]+b[gid];   
    }
}

and it works on both CPU and GPU.

Does anyone know why?
