7 Replies Latest reply on Jun 24, 2010 4:07 PM by chevydevil

    atom_add for float

    chevydevil

      Hello. I need to sum up forces for a deformable-body mass-spring system with volume preservation. Every work-item should add its part of the force to each of the mass points (vertices) it is involved with. I thought I could do this with an atomic add, but in the 1.0 spec only integer and long types are allowed. So is there a way to do this, or do I need another kernel for synchronisation?

      Edit: Some mass points (vertices) are shared by different work-items. That's the whole problem.

      Edit 2: Here is a source code example:

      This doesn't work either. It seems I really can use only __global int or __global long for atomic operations.

      #pragma OPENCL EXTENSION cl_khr_fp64 : enable
      #pragma OPENCL EXTENSION cl_khr_global_int32_base_atomics : enable
      #pragma OPENCL EXTENSION cl_khr_int64_base_atomics : enable

      typedef struct {
          float4 mpMass;
          float4 rl1, rl2;
          float m_v0, a, b, c;
          int4 mps;
      } tetra;

      __kernel void force_calc(__global int4* _force, __global tetra* _tetra)
      {
          int id = get_global_id(0);
          __global tetra *t = &_tetra[id];
          int mpid = t->mps.x;
          __global int4 *f = &_force[mpid];
          (void) atom_add(f->x,10);
      }

        • atom_add for float
          chevydevil

          Me again. I overlooked that there is a global mem_fence, so I think this should work for me (but feel free to correct me :-) ):

          Edit: I have to correct myself. The mem_fences only apply within a work-group. I'm open to ideas now! ;-)

          #pragma OPENCL EXTENSION cl_khr_fp64 : enable
          #pragma OPENCL EXTENSION cl_khr_global_int32_base_atomics : enable
          #pragma OPENCL EXTENSION cl_khr_int64_base_atomics : enable

          typedef struct {
              float4 mpMass;
              float4 rl1, rl2;
              float m_v0, a, b, c;
              int4 mps;
          } tetra;

          typedef struct {
              float4 mpPos, m_x0, m_f_ext, m_f_con, m_f_pen, velocity;
              int4 test;
          } masspoint;

          __kernel void force_calc(__global masspoint* _massPoints, __global tetra* _tetra)
          {
              int id = get_global_id(0);
              __global tetra *t = &_tetra[id];
              int mpid = t->mps.x;
              __global masspoint *f = &_massPoints[mpid];
              mem_fence(CLK_GLOBAL_MEM_FENCE);
              f->m_f_ext.x += 10;
              mem_fence(CLK_GLOBAL_MEM_FENCE);
          }

            • atom_add for float
              nou

              Adding a fence or barrier does not make this addition atomic, as they synchronize only within one work-group.

              • Fixed Precision
                Jawed

                You could use atomic add for integers if you make the force at each mass point a fixed-precision integer. For example, define the force as having 3 places after the decimal point: simply multiply the force by 1000 and convert it to an integer before doing the atomic add.

                When you read the force, convert it to float and divide by 1000.

                If you need lots of range, you could try a 64-bit long instead.
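A minimal sketch of the conversion described above, written as plain C so it can be checked on the host; the two helpers use nothing beyond casts and float arithmetic, so they should carry over to OpenCL C as well. The names FORCE_SCALE, force_to_fixed and fixed_to_force are made up for illustration, and the scale of 1000 is just the three-decimal example from the post.

```c
/* Hypothetical scale factor: 3 places after the decimal point. */
#define FORCE_SCALE 1000.0f

/* Convert a float force to a fixed-point integer, rounding half away from zero. */
int force_to_fixed(float f)
{
    float scaled = f * FORCE_SCALE;
    return (int)(scaled + (scaled >= 0.0f ? 0.5f : -0.5f));
}

/* Convert an accumulated fixed-point sum back to a float force. */
float fixed_to_force(int fixed)
{
    return (float)fixed / FORCE_SCALE;
}
```

In a kernel, each work-item would then presumably call something like atom_add(&force_x[mpid], force_to_fixed(part)), where force_x is a hypothetical __global int array with one entry per mass point. With a scale of 1000, a 32-bit int overflows once the accumulated force exceeds roughly 2.1 million units in either direction, which is where the 64-bit long comes in.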

                  • atom_add for float
                    chevydevil

                    Thanks for the replies. I thought of this too and will work with the integers or longs now. Another problem: it seems that I can't use, for example, int4, because atom_add can't be applied to a vector's component. This error message says so:

                    line 25: error: bad argument type to opencl atom op: expected pointer to int/uint with addrSpace global/local
                    (void) atom_add(_massPoints[mpid1].x,10;

                    P.S. In that case _massPoints is a __global int4 *, so _massPoints[mpid1] is an int4.

                      • atom_add for float
                        davibu

                        This is the code I use for float AtomicAdd():

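                        // Requires cl_khr_global_int32_base_atomics (provides atom_cmpxchg).
                        void AtomicAdd(__global float *val, const float delta) {
                            // View the same 32 bits as either a float or an unsigned int.
                            union {
                                float f;
                                unsigned int i;
                            } oldVal;
                            union {
                                float f;
                                unsigned int i;
                            } newVal;

                            do {
                                oldVal.f = *val;                  // snapshot the current value
                                newVal.f = oldVal.f + delta;      // compute the desired sum
                                // Store newVal only if *val still holds oldVal; otherwise retry.
                            } while (atom_cmpxchg((__global unsigned int *)val, oldVal.i, newVal.i) != oldVal.i);
                        }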
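
For experimenting with the compare-and-swap loop outside a kernel, here is a rough host-side analogue in C11. It is not davibu's code: it assumes a compiler that ships <stdatomic.h> and a 32-bit float, keeps the float's bit pattern in an _Atomic unsigned int, and uses memcpy instead of a union. The names atomic_add_float and load_float are hypothetical.

```c
#include <stdatomic.h>
#include <string.h>

/* Atomically add delta to the float whose bit pattern is stored in *bits.
   Assumes sizeof(float) == sizeof(unsigned int). */
void atomic_add_float(_Atomic unsigned int *bits, float delta)
{
    unsigned int old_bits = atomic_load(bits);
    unsigned int new_bits;
    do {
        float old_val, new_val;
        memcpy(&old_val, &old_bits, sizeof old_val);   /* bits -> float */
        new_val = old_val + delta;
        memcpy(&new_bits, &new_val, sizeof new_bits);  /* float -> bits */
        /* On failure, old_bits is refreshed with the current contents
           of *bits and the loop recomputes the sum. */
    } while (!atomic_compare_exchange_weak(bits, &old_bits, new_bits));
}

/* Read the accumulated value back as a float. */
float load_float(_Atomic unsigned int *bits)
{
    unsigned int b = atomic_load(bits);
    float f;
    memcpy(&f, &b, sizeof f);
    return f;
}
```

As with the OpenCL version, the loop only retries when another thread changed the value between the read and the compare-and-swap, so heavy contention on a single mass point means more retries.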