

chevydevil
Adept II

atom_add for float

Hello. I need to sum up forces for a deformable-body mass-spring system with volume preservation. Every work-item should add its part of the force to each of the mass points (vertices) it is involved with. I thought I could do this with an atomic add, but in the 1.0 spec only int and long types are allowed. Is there a way to do this, or do I need another kernel for synchronisation?

Edit: Some mass points (vertices) are shared by different work-items. That's the whole problem.

Edit 2: Here is a source code example:

This doesn't work either. It seems I really can use only __global int or __global long for atomic operations.

#pragma OPENCL EXTENSION cl_khr_fp64 : enable
#pragma OPENCL EXTENSION cl_khr_global_int32_base_atomics : enable
#pragma OPENCL EXTENSION cl_khr_int64_base_atomics : enable

typedef struct {
    float4 mpMass;
    float4 rl1, rl2;
    float m_v0, a, b, c;
    int4 mps;
} tetra;

__kernel void force_calc(__global int4* _force, __global tetra* _tetra)
{
    int id = get_global_id(0);
    __global tetra *t = &_tetra[id];
    int mpid = t->mps.x;
    __global int4 *f = &_force[mpid];
    (void) atom_add(f->x,10);
}

0 Likes
7 Replies
chevydevil
Adept II

Me again. I overlooked that there is a global mem_fence, so I think this should work for me (but feel free to correct me 🙂):

Edit: I stand corrected. The mem_fences are only defined within work-groups. I'm open to ideas now! 😉

#pragma OPENCL EXTENSION cl_khr_fp64 : enable
#pragma OPENCL EXTENSION cl_khr_global_int32_base_atomics : enable
#pragma OPENCL EXTENSION cl_khr_int64_base_atomics : enable

typedef struct {
    float4 mpMass;
    float4 rl1, rl2;
    float m_v0, a, b, c;
    int4 mps;
} tetra;

typedef struct {
    float4 mpPos, m_x0, m_f_ext, m_f_con, m_f_pen, velocity;
    int4 test;
} masspoint;

__kernel void force_calc(__global masspoint* _massPoints, __global tetra* _tetra)
{
    int id = get_global_id(0);
    __global tetra *t = &_tetra[id];
    int mpid = t->mps.x;
    __global masspoint *f = &_massPoints[mpid];
    mem_fence(CLK_GLOBAL_MEM_FENCE);
    f->m_f_ext.x += 10;
    mem_fence(CLK_GLOBAL_MEM_FENCE);
}

0 Likes

Adding a fence or barrier does not make this add atomic, as they synchronize only within a single work-group.

0 Likes

You could use atomic add for integers if you make the force at each mass point a fixed-precision integer. For example, define the force as having 3 places after the decimal point: multiply the force by 1000 and then convert it to an integer before doing the atomic add.

When you read the force, convert it to float and divide by 1000.

If you need lots of range, then you could try a 64-bit long instead.
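
A minimal sketch of the fixed-precision conversion described above, written in plain C so it can be compiled and checked on the host. The names FORCE_SCALE, force_to_fixed, fixed_to_force and sum_forces are hypothetical, and rounding to nearest is an assumption (the suggestion above only says "convert to integer").

```c
/* Hypothetical scale: 3 places after the decimal point. */
#define FORCE_SCALE 1000.0f

/* Convert a float force to a fixed-precision integer, rounding to nearest.
 * Assumes f * FORCE_SCALE fits in an int. */
static int force_to_fixed(float f)
{
    float scaled = f * FORCE_SCALE;
    return (int)(scaled + (scaled >= 0.0f ? 0.5f : -0.5f));
}

/* Convert an accumulated fixed-precision sum back to a float force. */
static float fixed_to_force(int fixed)
{
    return (float)fixed / FORCE_SCALE;
}

/* Host-side stand-in for the accumulation. In a kernel, the "+=" would
 * instead be atom_add(&sum[mpid], force_to_fixed(f)) on a __global int
 * buffer, with cl_khr_global_int32_base_atomics enabled. */
static float sum_forces(const float *contrib, int n)
{
    int sum = 0;
    for (int i = 0; i < n; ++i)
        sum += force_to_fixed(contrib[i]);
    return fixed_to_force(sum);
}
```

Because integer addition is exact, the accumulated result does not depend on the order in which work-items add their parts. The price is the fixed resolution of 1/1000 and the limited range of a 32-bit int.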

0 Likes

Thanks for the replies. I thought of this too and will work with ints or longs now. Another problem: it seems that I can't use, for example, int4, because atom_add can't be applied to a vector's component. The error message says so:
P.S. _massPoints is in that case a __global int4*.

line 25: error: bad argument type to opencl atom op: expected pointer to int/uint with addrSpace global/local
(void) atom_add(_massPoints[mpid1].x,10;
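
The message asks for a pointer to a scalar int, while `_massPoints[mpid1].x` is a value, and OpenCL C does not allow taking the address of a vector component. One possible workaround, sketched here in plain C under the assumption that the buffer is declared as a flat int array (`__global int *` in the kernel) holding four ints per mass point: compute the scalar address by hand. The helper component_ptr and the COMP_* names are hypothetical.

```c
/* Component indices within one 4-int element (x, y, z, w). */
enum { COMP_X = 0, COMP_Y = 1, COMP_Z = 2, COMP_W = 3 };

/* Return a pointer to one scalar component of element `index` in a flat
 * buffer that stores 4 ints per element. In a kernel the buffer would be
 * a __global int * and the result could be passed to atom_add directly,
 * e.g. atom_add(component_ptr(buf, mpid1, COMP_X), 10). */
static int *component_ptr(int *flat, int index, int comp)
{
    return flat + 4 * index + comp;
}
```

On the host the same helper can be exercised with an ordinary `+=`, for example `*component_ptr(forces, 1, COMP_Y) += 10;`.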

0 Likes

This is the code I use for float AtomicAdd():

void AtomicAdd(__global float *val, const float delta) {
    union {
        float f;
        unsigned int i;
    } oldVal;
    union {
        float f;
        unsigned int i;
    } newVal;

    do {
        oldVal.f = *val;
        newVal.f = oldVal.f + delta;
    } while (atom_cmpxchg((__global unsigned int *)val, oldVal.i, newVal.i) != oldVal.i);
}

 

0 Likes

Isn't it necessary to declare pointers as volatile before using atomics?

0 Likes

For my problem I stepped back from using atomic operations. They are too slow, especially if one has to use 12 of them in every thread. I will use an RGB texture and an extra kernel for summing up the columns now.
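
A rough sketch of that two-pass idea in plain C. The layout is an assumption, since the post gives no details: a table with one column per mass point and one row per contribution slot, so that no two writers ever touch the same cell and no atomics are needed. The names write_contribution and sum_column are hypothetical, and a flat float array stands in for the texture.

```c
/* Pass 1 stand-in: each contribution goes into its own cell,
 * addressed by (slot, mass point), so writes never collide. */
static void write_contribution(float *table, int num_points,
                               int slot, int mp, float f)
{
    table[slot * num_points + mp] = f;
}

/* Pass 2 stand-in: one work-item per mass point sums its column. */
static float sum_column(const float *table, int num_points,
                        int num_slots, int mp)
{
    float sum = 0.0f;
    for (int slot = 0; slot < num_slots; ++slot)
        sum += table[slot * num_points + mp];
    return sum;
}
```

The table must be cleared before each step, and the number of slots has to cover the largest number of contributions any single mass point can receive.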

0 Likes