dot product reduction

Question asked by lightjohn on Jan 7, 2016
Latest reply on Jan 11, 2016 by Meteorhead

Hello, newbie here.


I want to do a dot product like in this tutorial:…


My operation looks more like this:


float reduce_sum(double* inputA, double* inputB, int length) {
  float accumulator = 0.0f;
  for (int i = 0; i < length; i++) {
    accumulator += inputA[i] * inputB[i];
  }
  return accumulator;
}


But the tutorial is far too complex for me right now, so here is my current kernel:


#pragma OPENCL EXTENSION cl_khr_int64_base_atomics: enable

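void MyAtomicAdd(__global double *val, double delta) {
    // Reinterpret the double as a 64-bit integer so it can go through atom_cmpxchg.
    union {double f; ulong i;} old;
    union {double f; ulong i;} new;
    do {
      old.f = *val;
      new.f = old.f + delta;
    // Retry if another work-item changed *val between the read and the swap.
    } while (atom_cmpxchg((volatile __global ulong *)val, old.i, new.i) != old.i);
}

__kernel void inner(__global double *out, __global double *x, __global double *p) {
    int gid = get_global_id(0);
    // Every work-item adds its product to the single accumulator out[0].
    MyAtomicAdd(out, x[gid]*p[gid]);
}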
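
In this kernel every work-item performs its compare-and-swap on the same address, so all updates to out are serialized. A common alternative is to sum inside each work-group first and only then touch out, once per group. The following is a minimal host-side sketch of that idea in plain C, not an OpenCL kernel and not the tutorial's code. GROUP_SIZE, reduce_group and dot_two_stage are hypothetical names chosen for illustration. In a real kernel the scratch array would presumably live in __local memory, with a barrier between the halving steps.

```c
#define GROUP_SIZE 4  /* hypothetical work-group size; assumed to be a power of two */

/* Simulate one work-group: each "work-item" lid writes its product into a
   scratch buffer, then the buffer is summed by repeated halving. */
static double reduce_group(const double *x, const double *p, int start, int length) {
    double scratch[GROUP_SIZE];
    for (int lid = 0; lid < GROUP_SIZE; lid++) {
        int gid = start + lid;
        /* Pad with zeros when the last group runs past the end of the input. */
        scratch[lid] = (gid < length) ? x[gid] * p[gid] : 0.0;
    }
    for (int stride = GROUP_SIZE / 2; stride > 0; stride /= 2) {
        for (int lid = 0; lid < stride; lid++) {
            scratch[lid] += scratch[lid + stride];
        }
    }
    return scratch[0];
}

/* Second stage: one update of the accumulator per group instead of one per
   element. On a device this is where a single atomic add per group would go. */
static double dot_two_stage(const double *x, const double *p, int length) {
    double out = 0.0;
    for (int start = 0; start < length; start += GROUP_SIZE) {
        out += reduce_group(x, p, start, length);
    }
    return out;
}
```

For example, with x = {1, 2, 3, 4, 5, 6} and p = {6, 5, 4, 3, 2, 1}, dot_two_stage(x, p, 6) returns 56, and the accumulator is updated twice instead of six times.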


It works, but it is very, very slow (2 seconds). So I thought I could use:


atom_add((volatile __global ulong *)out, (ulong) (x[gid]*p[gid]))

instead of MyAtomicAdd, but the output is wrong, so I guess it is not enough that ulong and double have the same size. Also, if someone could tell me how to do an atomic add with double, I would be very grateful, because Google is not much help in this case.
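
The wrong output from the atom_add variant can be reproduced on the host. The (ulong) cast converts the value of the product and drops its fractional part, and atom_add then performs an integer addition on the bit pattern of the double stored in out. The following is a minimal sketch in plain C, not OpenCL. The function name integer_add_on_double_bits is hypothetical, and the inputs are assumed to be non-negative and small enough to fit in 64 bits.

```c
#include <stdint.h>
#include <string.h>

/* Mimic atom_add((ulong *)out, (ulong)v[i]) applied to a double accumulator:
   the accumulator's bits are treated as an integer and truncated values are
   added to them with integer arithmetic. */
static double integer_add_on_double_bits(double start, const double *v, int n) {
    uint64_t bits;
    memcpy(&bits, &start, sizeof bits);   /* view the double as raw 64 bits */
    for (int i = 0; i < n; i++) {
        bits += (uint64_t)v[i];           /* value cast: 1.5 becomes 1 */
    }
    double result;
    memcpy(&result, &bits, sizeof result); /* read the bits back as a double */
    return result;
}
```

Starting from 0.0 with the values 0.5, 1.5 and 2.25, the integer accumulator ends up holding 3, which read back as a double is a tiny denormal number instead of 4.25. MyAtomicAdd avoids this because it only reinterprets the bits for the compare-and-swap, while the addition itself (old.f + delta) is done in floating point.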


Can you help me?

Thank you for any advice.