lupescu_grigore

OpenCL mem_fence

Discussion created by lupescu_grigore on Mar 27, 2011
Latest reply on Mar 28, 2011 by lupescu_grigore
Given the following kernel:
__kernel void testKernel(__global float* input, __global float* output, int nrElements)
{
    uint x = get_global_id(0);

    if (x < (nrElements - 1))
        output[x] = input[x + 1];
    else
        output[x] = input[0];

    mem_fence(CLK_LOCAL_MEM_FENCE);
    output[x]++;
}
and the input 0 1 0 1 0 1 0 1 0 1,
one would expect 2 1 2 1 2 1 2 1 ... as output (shift left by one element, then add 1).
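For reference, the expected values can be computed on the host with a plain C loop that mirrors the kernel's branch. This is only a minimal sketch: the function name reference_shift_add is made up for illustration and is not part of the test program.

```c
#include <assert.h>
#include <stddef.h>

/* Host-side reference for the kernel (hypothetical helper, not from the
   original test program): output[x] = input[(x + 1) mod n] + 1. */
void reference_shift_add(const float *input, float *output, size_t n)
{
    assert(n > 0);
    for (size_t x = 0; x < n; ++x) {
        /* Same branch as the kernel: the last element wraps to input[0]. */
        if (x + 1 < n)
            output[x] = input[x + 1];
        else
            output[x] = input[0];
        /* The increment the kernel performs after the fence. */
        output[x] += 1.0f;
    }
}
```

Comparing the device buffer against this loop element by element shows exactly which indices come back wrong.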
Results are:
No mem_fence:
    CPU Core i3: OK
    GPU 5470: OK
With mem_fence:
    CPU Core i3: OK
    GPU 5470: ERROR, values got { 1 2 3 4 5 } ... 2 1 2 1 2 ... 3 2 3 2 ... 5 4 5 4 ...
Inputs need not be very long; 100-1000 elements will do.
Am I missing something?
