Sorry for this delayed reply.
Please refer to Section 5.6.1 of the OpenCL 2.0 specification (SVM sharing granularity):
Fine-grained sharing: Shared virtual memory where memory consistency is maintained at a granularity smaller than a buffer. How fine-grained SVM is used depends on whether the device supports SVM atomic operations.
o If SVM atomic operations are supported, they provide memory consistency for loads and stores by the host and kernels executing on devices supporting SVM. This means that the host and devices can concurrently read and update the same memory. The consistency provided by SVM atomics is in addition to the consistency provided at synchronization points. There is no need for explicit calls to clEnqueueSVMMap and clEnqueueSVMUnmap or clEnqueueMapBuffer and clEnqueueUnmapMemObject on a cl_mem buffer object created using the SVM pointer.
And Section 6.13.11 of the OpenCL C language specification (Atomic Functions):
In particular, when a host thread needs fine control over the consistency of memory that is shared with one or more OpenCL devices, it must use atomic and fence operations that are compatible with the C11 atomic operations.
And the description of the CL_MEM_SVM_ATOMICS flag for clSVMAlloc:
This flag is valid only if CL_MEM_SVM_FINE_GRAIN_BUFFER is specified in flags. It is used to indicate that SVM atomic operations can control visibility of memory accesses in this SVM buffer.
In summary, the SVM model is equivalent to the C11 memory model, so you have to use acquire/release operations to make memory operations visible to the other side. Carrizo supports SVM atomics, so the spec requires you to use them for fine-grained consistency. The behavior you reported is therefore exactly what is expected.
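To illustrate the acquire/release pattern, here is a minimal sketch in plain C11 with two host threads. It is not OpenCL code: it assumes only that the same rule carries over to a fine-grain SVM buffer allocated with CL_MEM_SVM_ATOMICS. The names payload, ready, producer and run_handoff are made up for this example.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical names, for illustration only. */
static int payload;       /* ordinary, non-atomic data */
static atomic_int ready;  /* flag that publishes the data */

static void *producer(void *arg) {
    (void)arg;
    payload = 42;  /* plain store */
    /* Release: the store to payload above becomes visible to any
       thread that later observes ready == 1 with an acquire load. */
    atomic_store_explicit(&ready, 1, memory_order_release);
    return NULL;
}

/* Starts the producer, waits for the flag, and returns the payload it saw. */
int run_handoff(void) {
    pthread_t t;
    payload = 0;
    atomic_store_explicit(&ready, 0, memory_order_relaxed);

    int rc = pthread_create(&t, NULL, producer, NULL);
    assert(rc == 0);
    (void)rc;

    /* Acquire: pairs with the release store in producer(). */
    while (atomic_load_explicit(&ready, memory_order_acquire) == 0) {
        /* spin until the flag is published */
    }
    int seen = payload;  /* ordered after the acquire load */

    pthread_join(t, NULL);
    return seen;
}
```

Calling run_handoff() should return the value written by the producer. If the flag were read and written with ordinary loads and stores instead, nothing would order the payload store before the flag, which is the kind of missing visibility discussed above.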