I'm currently experimenting with some ideas on an APU (Carrizo) machine, using OpenCL 2.0 and a fine-grained SVM buffer.
Regarding synchronization for fine-grained SVM, I read that values are synced at synchronization points such as clFinish() and clWaitForEvents(), or when atomic operations are used. However, since the CPU and GPU on an APU share the same memory space (zero-copy), I wanted to see whether an update to an SVM variable on the CPU side would be visible on the GPU side without any synchronization point, and vice versa.
So I allocated a variable as fine-grained SVM (without the atomics flag, i.e. without CL_MEM_SVM_ATOMICS) and passed it as an argument to a GPU kernel. The kernel runs persistently inside a while loop and checks whether the value changes (0 -> 1). Once it sees the change, the kernel changes the value again (1 -> 2). In my test, however, the change was never visible to the other side. It was only seen if I used an atomic operation on an atomic variable, or if the kernel finished instead of running persistently.
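To make the handshake concrete, here is a CPU-only analogue in plain C (pthreads and C11 atomics). This is not my actual OpenCL code: a second CPU thread stands in for the persistent GPU kernel, and the names flag, persistent_worker and run_handshake are only illustrative. It corresponds to the variant that works for me, where every access to the shared variable is atomic; in the failing variant the same accesses are plain loads and stores.

```c
#include <pthread.h>
#include <stdatomic.h>

/* Stand-in for the fine-grained SVM variable shared by CPU and GPU. */
static atomic_int flag;

/* Stand-in for the persistent kernel: spin until 0 -> 1, then write 2. */
static void *persistent_worker(void *arg)
{
    (void)arg;
    while (atomic_load_explicit(&flag, memory_order_acquire) != 1) {
        /* busy-wait, like the kernel's while loop */
    }
    atomic_store_explicit(&flag, 2, memory_order_release);
    return NULL;
}

/* Stand-in for the host side.
 * Returns the final value of flag, or -1 if the thread could not be created. */
int run_handshake(void)
{
    pthread_t worker;

    atomic_store(&flag, 0);
    if (pthread_create(&worker, NULL, persistent_worker, NULL) != 0)
        return -1;

    /* "CPU" publishes 0 -> 1 while the worker is still running. */
    atomic_store_explicit(&flag, 1, memory_order_release);

    /* "CPU" waits for the worker's 1 -> 2 update. */
    while (atomic_load_explicit(&flag, memory_order_acquire) != 2) {
        /* busy-wait */
    }

    pthread_join(worker, NULL);
    return atomic_load(&flag);
}
```

Calling run_handshake() from main completes once both updates have been observed. What I am asking about is the same pattern between CPU and GPU, but with a non-atomic fine-grained SVM variable.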
This result is a little hard for me to understand, because the CPU and GPU are supposed to share memory on an APU. Can anyone explain why a synchronization point is needed for SVM on an APU machine? If the GPU sees a different value for the same variable that the CPU updated, where is the GPU getting that value from? Also, is there a way to make an update visible without either letting the GPU kernel complete (I want the kernel to keep running in a while loop) or using atomic variables?