Is this code for the device?
Has this issue been fixed?
If not, can you give more details?
Please look at the memory consistency section in the OpenCL spec.
Here is the relevant passage from the OpenCL 1.2 spec for your reference:
Within a work-item memory has load / store consistency. Local memory is consistent across
work-items in a single work-group at a work-group barrier. Global memory is consistent across
work-items in a single work-group at a work-group barrier, but there are no guarantees of
memory consistency between different work-groups executing a kernel.
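To make the quoted rule concrete, here is a minimal sketch of a kernel that relies only on what the spec guarantees. It is not your code: the kernel name rotate_in_group and the buffers buf and out are hypothetical, and it assumes a 1-D range with no global offset whose global size is a multiple of the work-group size. The source is kept as a host-side C string, the form you would pass to clCreateProgramWithSource.

```c
/* OpenCL C kernel source held as a host-side string.
   All names here are hypothetical, for illustration only. */
static const char *kernel_src =
    "__kernel void rotate_in_group(__global int *buf, __global int *out)\n"
    "{\n"
    "    size_t lid  = get_local_id(0);\n"
    "    size_t lsz  = get_local_size(0);\n"
    "    size_t base = get_group_id(0) * lsz;\n"
    "\n"
    "    /* Each work-item writes one element of global memory. */\n"
    "    buf[base + lid] = (int)get_global_id(0);\n"
    "\n"
    "    /* Every work-item of this work-group reaches the barrier; global\n"
    "       writes made before it are then visible within the group. */\n"
    "    barrier(CLK_GLOBAL_MEM_FENCE);\n"
    "\n"
    "    /* Safe: the neighbour belongs to the same work-group. */\n"
    "    out[base + lid] = buf[base + (lid + 1) % lsz];\n"
    "\n"
    "    /* Not safe: reading an element written by a different work-group\n"
    "       here. No barrier orders that access within one kernel launch. */\n"
    "}\n";
```

If work-items need to see data written by another work-group, the usual approach is to split the work into two kernel launches, since results of one kernel are visible to the next one enqueued on the same in-order queue.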
I hope this answers your question.