Is there a way to access the fast 64 KB Global Data Share (GDS) on Ellesmere (RX 480), either in OpenCL or in GCN assembly?
The cl_ext_atomic_counters_32 extension has been deprecated, and I have had no luck so far with CLRadeonExtender.
Update (1/4/2017 8:14 p.m. PST): It seems that the OpenCL 1.2 ABI is needed to enable the GDS functionality. With the Crimson drivers, you can still create OpenCL 1.2 binaries for Ellesmere by passing the "-legacy" build option. By also adding the "-Dcl_ext_atomic_counters_32" build option, I was able to run a program that uses the cl_ext_atomic_counters_32 extension and the counter32_t type. CLRadeonExtender does not seem to be able to handle OpenCL 1.2 binaries for Ellesmere. Further testing is needed.
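For reference, here is a minimal sketch of the setup described in the update, written as host-side C constants. The two build options are the ones mentioned above. The kernel name count_positive, its arguments, and the constant names are hypothetical, and the #pragma line is an assumption about how the extension is enabled. I have not verified that the counter actually ends up in GDS.

```c
#include <stddef.h>

/* Build options from the update: "-legacy" requests an OpenCL 1.2 binary,
 * and "-Dcl_ext_atomic_counters_32" defines the extension macro. */
static const char *const GDS_BUILD_OPTIONS =
    "-legacy -Dcl_ext_atomic_counters_32";

/* Hypothetical kernel: counts the positive elements of data[] with a
 * counter32_t argument. The #pragma is an assumption; atomic_inc on a
 * counter32_t comes from the cl_ext_atomic_counters_32 extension. */
static const char *const GDS_KERNEL_SOURCE =
    "#pragma OPENCL EXTENSION cl_ext_atomic_counters_32 : enable\n"
    "__kernel void count_positive(counter32_t counter,\n"
    "                             __global const int *data)\n"
    "{\n"
    "    if (data[get_global_id(0)] > 0)\n"
    "        atomic_inc(counter);\n"
    "}\n";
```

On the host, GDS_KERNEL_SOURCE would be passed to clCreateProgramWithSource and GDS_BUILD_OPTIONS as the options argument of clBuildProgram. The counter itself would presumably be a small buffer set with clSetKernelArg, but that part is untested here.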