I have a kernel that uses local memory. At first I declared a local array of doubles inside the function body, and the results computed by that kernel turned out to be wrong.
Each work-item copies one value into the array, followed by a barrier to synchronize. After the synchronization I checked the contents of the local array and discovered they were corrupt.
Then I supplied the local array as a kernel argument in the function header instead, which solved the problem: the contents of the local array were now correct.
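Roughly, the two variants look like the sketch below. This is a simplified illustration rather than the actual kernel, assuming OpenCL C: the kernel names, the `tile` array, the `WG_SIZE` constant, and the neighbour read used to inspect the array are all made up for this example.

```c
// Only needed where double is an extension (OpenCL 1.0/1.1 devices).
#pragma OPENCL EXTENSION cl_khr_fp64 : enable

// Hypothetical compile-time upper bound on the work-group size.
#define WG_SIZE 64

// Variant 1: local array declared inside the kernel body.
__kernel void copy_body_local(__global const double *in,
                              __global double *out)
{
    __local double tile[WG_SIZE];          // declared at kernel function scope

    const size_t lid = get_local_id(0);
    const size_t gid = get_global_id(0);

    tile[lid] = in[gid];                   // each work-item copies one value
    barrier(CLK_LOCAL_MEM_FENCE);          // wait until the whole group has written

    // Read a neighbour's slot to inspect what the other work-items stored.
    out[gid] = tile[(lid + 1) % get_local_size(0)];
}

// Variant 2: local array supplied through the function header.
__kernel void copy_arg_local(__global const double *in,
                             __global double *out,
                             __local double *tile)
{
    const size_t lid = get_local_id(0);
    const size_t gid = get_global_id(0);

    tile[lid] = in[gid];
    barrier(CLK_LOCAL_MEM_FENCE);

    out[gid] = tile[(lid + 1) % get_local_size(0)];
}
```

In the second variant the host sizes the buffer by calling `clSetKernelArg` with the byte count and a `NULL` pointer, for example `clSetKernelArg(kernel, 2, local_size * sizeof(cl_double), NULL)`. In the first variant the size is fixed at compile time, so the sketch assumes the work-group size used at enqueue time does not exceed `WG_SIZE`.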
Why does my first attempt at using a local array fail while the second one works? In the first case, is the local array not shared between all work-items?