himanshu_gautam
Grandmaster

Why atomics only for integer types?

I would like to know why the atomic extensions are being introduced only for integer types (32-bit and 64-bit).

Are there any specific constraints on implementing them for other data types? They would be really helpful for some of the problems I have been working on recently.

Also, does NVIDIA support atomics for float/double?

Thanks

Himanshu

1 Solution
AlexV
Adept I

There is a (not necessarily negligible) cost attached to FP atomic units. Also, if you consider the typical use for atomics (building blocks for sync primitives), it is unclear where FP atomics would fit in (granted, there are some other neat uses). NVIDIA decided the cost of having FP atomic units was worth it in the context of the benefits afforded, so they can do general atomics on FP types (everybody can do exchange). Note that using cmpxchg one can actually implement most of everything, be it for integer or float types; it's just that it's kludgy and hardly perf-optimal.
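
To make the cmpxchg point concrete, here is a minimal sketch of a float atomic add built from nothing but a 32-bit integer compare-exchange. It is written as host-side C++ with std::atomic rather than OpenCL, purely to show the retry loop; the helper names (float_to_bits, bits_to_float, atomic_add_float) are made up for this illustration and are not part of any vendor API.

```cpp
#include <atomic>
#include <cstdint>
#include <cstring>

// Copy the bit pattern of a float into a 32-bit integer (and back)
// without breaking aliasing rules.
static inline std::uint32_t float_to_bits(float f) {
    std::uint32_t u;
    std::memcpy(&u, &f, sizeof u);
    return u;
}

static inline float bits_to_float(std::uint32_t u) {
    float f;
    std::memcpy(&f, &u, sizeof f);
    return f;
}

// Hypothetical helper: atomically adds `value` to a float that is stored
// as raw bits in a 32-bit integer atomic. Returns the value seen just
// before the add took effect.
float atomic_add_float(std::atomic<std::uint32_t>& target, float value) {
    std::uint32_t expected = target.load(std::memory_order_relaxed);
    for (;;) {
        // Compute the new value from the snapshot we last observed.
        std::uint32_t desired = float_to_bits(bits_to_float(expected) + value);
        // Publish only if nobody changed the cell in the meantime.
        if (target.compare_exchange_weak(expected, desired,
                                         std::memory_order_relaxed)) {
            return bits_to_float(expected);
        }
        // On failure `expected` now holds the current bits; retry.
    }
}
```

The compare is done on the integer bit pattern, so the loop does not get stuck on NaN values. The kludgy part is visible too: every contended update costs at least one extra round trip, which is why this is hardly perf-optimal compared with a native FP atomic unit.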

2 Replies
realhet
Miniboss

On the GCN architecture we have:

ds_cmpst_f32 (compare+swap), ds_min_f32, ds_max_f32 for Local/Global Data Share.

There are variants of those when the previous value is returned: ds_cmpst_rtn_f32, ds_min_rtn_f32, ds_max_rtn_f32

Also all of those have _f64 versions.
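
For anyone unsure what the "_rtn" flavour buys you, here is a rough host-side C++ model of a float max that hands back the previous value. It is only a sketch of the semantics using std::atomic<float> and a compare-exchange loop, not GCN code; the function name atomic_max_float_rtn is invented for the example, and the hardware's exact NaN/denorm handling is not modelled.

```cpp
#include <atomic>

// Hypothetical helper: atomically stores max(*target, value) and returns
// the value that was in `target` before the operation.
float atomic_max_float_rtn(std::atomic<float>& target, float value) {
    float observed = target.load(std::memory_order_relaxed);
    // Only attempt a write while our value is actually larger.
    // A failed compare-exchange refreshes `observed`, so the condition
    // is re-evaluated against the latest contents.
    while (observed < value &&
           !target.compare_exchange_weak(observed, value,
                                         std::memory_order_relaxed)) {
    }
    return observed;
}
```

The non-returning forms correspond to calling this and ignoring the result; the returning forms are what you need when the old value drives a later decision.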

For memory there are some float32 atomics:

buffer_atomic_fcmpswap, buffer_atomic_fmax, image_atomic_fmin

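It's possible to return previous values, and there are _x2 variants such as buffer_atomic_fcmpswap_x2 and buffer_atomic_fmax_x2. As far as I can tell from the ISA document, _x2 selects the 64-bit (double) form of the operation rather than issuing two atomic operations at once.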

Feel free to check http://developer.amd.com.php53-23.ord1-1.websitetestlink.com/wordpress/media/2012/10/AMD_Southern_Is... for details.
