While writing a kernel I found a behaviour I do not understand, and I have tried to isolate it in an example.
This code does nothing useful; to reproduce the problem, put it into the Template example:
/*!
 * Sample kernel which multiplies every element of the input array with
 * a constant and stores it at the corresponding output array
 */
__kernel void templateKernel(__global unsigned int * output,
                             __global unsigned int * input,
                             const unsigned int multiplier)
{
    unsigned int idx = get_global_id(0);
    __local float bsh;
    __local float norm;
    bsh = 5;
    barrier(CLK_LOCAL_MEM_FENCE);
    float out = 7.0;
    norm = out;
    output[idx] = bsh;
}
Now, bsh = 5 followed by barrier(CLK_LOCAL_MEM_FENCE) ensures that all threads in the work-group are synchronized. After this, bsh is never changed, so I expect output[idx] to always be 5.
Unfortunately the example prints ... 7; if I change out to 8, it prints 8, and so on ...
Am I doing something wrong, or is there something I have not understood?
You are right, it is strange behavior. It seems there is something wrong with single __local variables. However, when I changed each __local float to a one-element array, everything worked as expected.
__kernel void templateKernel(__global unsigned int * output,
                             __global unsigned int * input,
                             const unsigned int multiplier)
{
    unsigned int idx = get_global_id(0);
    __local float bsh[1];
    __local float norm[1];
    bsh[0] = 5;
    barrier(CLK_LOCAL_MEM_FENCE);
    float out = 7.0;
    norm[0] = out;
    output[idx] = bsh[0];
}
So I must use __local float[1]?
Is this normal, or is there something I do not understand?
I forgot to mention:
Platform: Linux, Fedora Core 12, g++/gcc 4.5.0
It is most likely a bug, and using an array is a workaround. AMD developers should look at this.
Single variables are not supported in local memory.
Only __local buffers can be generated.
You can see that single variables are supported: http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/local.html
I reported this some time ago, but there was some confusion about single variables in the OpenCL 1.0 spec.
I would expect this to get fixed, with OpenCL 1.1 being clearer on this issue.
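Since the thread says __local buffers do work, another variant that might avoid the problem is to pass the local buffer as a kernel argument instead of declaring it inside the kernel. The sketch below is an untested assumption, not something confirmed on the affected SDK: the name templateKernelArg and the scratch parameter are hypothetical, and the #ifndef block holds host-side stand-ins (including fake_gid) so the same source also compiles as plain C for a quick sanity check of the expected result.

```c
/* Host-only shims. __OPENCL_VERSION__ is predefined by OpenCL C compilers,
 * so this block is skipped when the source is built for a device. */
#ifndef __OPENCL_VERSION__
#define __kernel
#define __global
#define __local
#define CLK_LOCAL_MEM_FENCE 0
static unsigned int fake_gid = 0;  /* hypothetical stand-in for the work-item id */
static unsigned int get_global_id(unsigned int dim) { (void)dim; return fake_gid; }
static void barrier(int flags) { (void)flags; }  /* no-op on the host */
#endif

/* Same logic as the kernel in the question, but bsh and norm live in a
 * __local buffer supplied by the host: scratch[0] = bsh, scratch[1] = norm. */
__kernel void templateKernelArg(__global unsigned int *output,
                                __global unsigned int *input,
                                const unsigned int multiplier,
                                __local float *scratch)
{
    (void)input;       /* unused, kept to match the original signature */
    (void)multiplier;  /* unused, kept to match the original signature */

    unsigned int idx = get_global_id(0);
    scratch[0] = 5;                 /* bsh */
    barrier(CLK_LOCAL_MEM_FENCE);
    float out = 7.0f;
    scratch[1] = out;               /* norm */
    output[idx] = scratch[0];       /* expected to stay 5 */
}
```

On the host, the local buffer is sized with clSetKernelArg(kernel, 3, 2 * sizeof(float), NULL); for a __local argument the value pointer must be NULL and the size gives the number of bytes to allocate per work-group.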