__local overflow?

Discussion created by ibird on Sep 11, 2010
Latest reply on Sep 17, 2010 by MicahVillmow

While writing a kernel I found a behaviour I do not understand, and I have tried to isolate it in a small example.


This code does nothing useful; I put it into the Template example.

/*
 * Sample kernel which multiplies every element of the input array with
 * a constant and stores it at the corresponding output array
 */

__kernel void templateKernel(__global  unsigned int * output,
                             __global  unsigned int * input,
                             const     unsigned int multiplier)
{
    unsigned int idx = get_global_id(0);
    __local float bsh;
    __local float norm;
    bsh = 5;
    float out = 7.0;
    norm = out;
    output[idx] = bsh;
}


Now bsh = 5, and a CLK_LOCAL_MEM_FENCE ensures that all threads are synchronized. After this, bsh is never changed, so I expect output[idx] to always be 5.
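For reference, here is a sketch of the kernel with the barriers written out explicitly. The listing above does not show where the barrier calls sit, so their placement here (after each write to a __local variable) is an assumption. The #ifndef block is a hypothetical host-side shim, not part of OpenCL: it only lets the same file compile as plain C for a sanity check with one emulated work-item at a time, and an OpenCL compiler skips it because it defines __OPENCL_VERSION__.

```c
#ifndef __OPENCL_VERSION__
/* Hypothetical host-side shim, NOT part of OpenCL: it only lets this file
 * compile as plain C and run one emulated work-item at a time. */
#include <stddef.h>
#define __kernel
#define __global
#define __local static                  /* one shared copy, like a single work-group */
#define CLK_LOCAL_MEM_FENCE 1
static size_t emulated_global_id = 0;   /* set by the caller before each call */
static size_t get_global_id(unsigned int dim) { (void)dim; return emulated_global_id; }
static void barrier(int flags) { (void)flags; }   /* no-op with one work-item */
#endif

__kernel void templateKernel(__global  unsigned int * output,
                             __global  unsigned int * input,
                             const     unsigned int multiplier)
{
    unsigned int idx = get_global_id(0);
    __local float bsh;
    __local float norm;

    bsh = 5;
    barrier(CLK_LOCAL_MEM_FENCE);   /* assumed placement: after the write to bsh */

    float out = 7.0f;
    norm = out;
    barrier(CLK_LOCAL_MEM_FENCE);   /* assumed placement: after the write to norm */

    output[idx] = bsh;              /* bsh is never written again after the first barrier */
}
```

As I read the specification, every work-item writes the same value to bsh and the write to norm touches a different variable, so the final store should see 5. Run through the shim, the kernel stores 5 in every element.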

Unfortunately the example prints ... 7; if I change out to 8, it prints 8, and so on ...


Am I doing something wrong, or is there something I have not understood?