In the CUDA manual, atomic add can be implemented using compare-and-swap like this:
__device__ double atomicAdd(double* address, double val)
{
    unsigned long long int* address_as_ull =
        (unsigned long long int*)address;
    unsigned long long int old = *address_as_ull, assumed;
    do {
        assumed = old;
        old = atomicCAS(address_as_ull, assumed,
                        __double_as_longlong(val +
Hi Micah, thank you for your reply.
I was trying this code on an NVIDIA Fermi card (C2050).
I can use a double-precision atomic add from CUDA on this card, so I guess it supports 64-bit atomics.
However, I cannot build the program successfully because of this function, so could you please help me see what's wrong with my code?
double atomicDoubleAdd(__global double* address, double val)
{
    ulong * address_as_ull = (ulong *)address;
    ulong old = *address_as_ull, assumed;
    do {
        assumed = old;
        old = atom_cmpxchg(address_as_ull, assumed,
                           as_ulong(val + as_double(assumed)));
    } while (assumed != old);
    return as_double(old);
}
Anyway, thanks for your reply.
So will there be support for double-precision atomic operations in the near future?
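
For the atomicDoubleAdd function above, the build log is not shown, so the following is only a guess at what the compiler may be objecting to. Two things stand out: the cast to ulong * drops the __global address space (an unqualified pointer is __private in OpenCL C 1.x, and casting between address spaces is not allowed), and the 64-bit atom_cmpxchg is only available when the cl_khr_int64_base_atomics extension is enabled. A minimal sketch, assuming the device reports both cl_khr_fp64 and cl_khr_int64_base_atomics; the kernel sum_all and its arguments are hypothetical and only there to show a call:

```opencl
// Sketch only: assumes the device exposes cl_khr_fp64 and
// cl_khr_int64_base_atomics (check CL_DEVICE_EXTENSIONS on the host).
#pragma OPENCL EXTENSION cl_khr_fp64 : enable
#pragma OPENCL EXTENSION cl_khr_int64_base_atomics : enable

double atomicDoubleAdd(volatile __global double *address, double val)
{
    // Keep the __global address space on the reinterpreted pointer.
    volatile __global ulong *address_as_ull =
        (volatile __global ulong *)address;
    ulong old = *address_as_ull;
    ulong assumed;
    do {
        assumed = old;
        // Store (assumed + val) only if memory still holds 'assumed'.
        old = atom_cmpxchg(address_as_ull, assumed,
                           as_ulong(val + as_double(assumed)));
    } while (assumed != old);  // another work-item got in first: retry
    return as_double(old);     // value that was stored before this add
}

// Hypothetical usage kernel (name and arguments are illustrative).
__kernel void sum_all(__global const double *in, __global double *total)
{
    atomicDoubleAdd(total, in[get_global_id(0)]);
}
```

The loop compares the raw 64-bit patterns rather than the doubles, so it still terminates if the stored value is NaN. If the extension is not listed for the device, this sketch will not build and another approach (for example a per-work-group reduction) would be needed.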