ibird

__local overflow?

Discussion created by ibird on Sep 11, 2010
Latest reply on Sep 17, 2010 by MicahVillmow

While writing a kernel I found a behaviour I do not understand, and I have tried to isolate it in an example.

This code does nothing useful; to reproduce the problem, put it into the Template example.

/*!
 * Reduced test case built on the Template sample kernel.
 * The input and multiplier arguments are unused; every work-item
 * is expected to write the value of bsh (5) to the output array.
 */


__kernel void templateKernel(__global  unsigned int * output,
                             __global  unsigned int * input,
                             const     unsigned int multiplier)
{
    unsigned int idx = get_global_id(0);
 
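    __local float bsh;   // shared by all work-items in the work-group
    __local float norm;  // second __local variable; written but never read

    bsh = 5;                       // every work-item stores the same value
    barrier(CLK_LOCAL_MEM_FENCE);  // synchronize the work-group after the store

    float out = 7.0;
    norm = out;                    // only norm is written here, not bsh
    output[idx] = bsh;             // expected 5, observed 7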
}

 

Now, bsh = 5 followed by barrier(CLK_LOCAL_MEM_FENCE) ensures that all threads are synchronized. After this, bsh is never changed, so I expect output[idx] to always be 5.

Unfortunately the example prints ... 7. If I change out to 8, it prints 8, and so on ...

 

Am I doing something wrong, or is there something I have not understood?

 
