Figured out how to query extensions of the device.
vendor: Advanced Micro Devices, Inc.
version: OpenCL 1.1 ATI-Stream-v2.3 (451)
extensions: cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_gl_sharing cl_amd_device_attribute_query cl_amd_printf cl_amd_media_ops cl_amd_popcnt
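A minimal sketch of checking such an extension string for 64-bit atomics. It assumes the string was already obtained with clGetDeviceInfo and CL_DEVICE_EXTENSIONS; the helper name has_extension is hypothetical. The Khronos extension names for 64-bit atomics are cl_khr_int64_base_atomics and cl_khr_int64_extended_atomics, and neither appears in the list above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns true if `name` appears as a whole
 * space-separated token in the extension string `exts`.
 * A plain strstr() is not enough, because one extension name can be
 * a prefix of another. */
static bool has_extension(const char *exts, const char *name)
{
    size_t len = strlen(name);
    if (len == 0)
        return false;

    const char *p = exts;
    while ((p = strstr(p, name)) != NULL) {
        /* The match must start at the beginning or after a space... */
        bool starts = (p == exts) || (p[-1] == ' ');
        /* ...and end at a space or at the end of the string. */
        bool ends = (p[len] == ' ') || (p[len] == '\0');
        if (starts && ends)
            return true;
        p += len;
    }
    return false;
}
```

Calling has_extension(exts, "cl_khr_int64_base_atomics") on the string printed above would return false.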
No 64-bit atomics.
No. Atomics on AMD architectures are executed differently from those on competing architectures: there are hardware atomic units on the memory interfaces. These allow very fast asynchronous atomics, so atomic operations have very low overhead on these architectures. Unfortunately, it also means that more complex atomics are not possible, as they would be if atomics were based on locking cache lines and full return trips through the main ALUs. There are no floating-point atomics for the same reason.
It should be possible to implement an atomic add/sub with two 32-bit numbers.
old = atom_add(a, x);
if(old+x == 0)atom_inc(a);
But I am not sure if it is correct.
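It's not. Taken as a whole it is not an atomic operation, because something could change in both 32-bit words before you do the second operation. Another work-item could also read the pair between the two steps and see the low word already wrapped while the high word has not yet been incremented.
You'd have to do it by using a compare-and-set (CAS) on a lock word.
Acquire the lock with CAS. On success, change the 64-bit value and then unlock. If the CAS fails, retry.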
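The lock-word approach can be sketched roughly as follows. This is a host-side C11 analogy using <stdatomic.h>, not kernel code. In an OpenCL 1.1 kernel the corresponding built-in would be atomic_cmpxchg on a 32-bit lock word. The type locked_u64 and the function locked_u64_add are hypothetical names for illustration. Note also that spinning on a lock between work-items of the same wavefront may hang on a GPU, so treat this as an illustration of the idea only.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical type: a 64-bit counter stored as two 32-bit words,
 * guarded by a 32-bit lock word (0 = free, 1 = held). */
typedef struct {
    atomic_uint lock;
    uint32_t lo;
    uint32_t hi;
} locked_u64;

/* Adds x to the 64-bit value and returns the previous value. */
static uint64_t locked_u64_add(locked_u64 *v, uint64_t x)
{
    unsigned expected = 0;

    /* Acquire: CAS the lock word from 0 to 1, retrying on failure.
     * A failed CAS overwrites `expected`, so reset it each time. */
    while (!atomic_compare_exchange_weak_explicit(
               &v->lock, &expected, 1u,
               memory_order_acquire, memory_order_relaxed)) {
        expected = 0;
    }

    /* Critical section: both words are updated under the lock,
     * so the carry from lo into hi cannot be observed half-done
     * by anyone else who also takes the lock. */
    uint64_t old = ((uint64_t)v->hi << 32) | v->lo;
    uint64_t sum = old + x;
    v->lo = (uint32_t)sum;
    v->hi = (uint32_t)(sum >> 32);

    /* Release: unlock so that other threads can proceed. */
    atomic_store_explicit(&v->lock, 0u, memory_order_release);
    return old;
}
```

Every reader of the value has to take the same lock. A reader that loads lo and hi without it can still see a torn result.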