Floating point operations difference between CPU and GPU

Question asked by kahlan on Nov 15, 2012
Latest reply on Aug 8, 2013 by ash

Hello,

I have an OpenCL kernel that implements a dot product between two float arrays. The first is an array of size*n elements (where size is the global work size) and the second is an array of n elements.

Here is the sample code:

    __kernel
    void evaluate_product(__global const float *pFirstArray,
                          const int n,
                          __global const float *pSecondArray,
                          __global float *pOutput)
    {
        /* One work-item per output element. */
        int gid = get_global_id(0);
        int size = get_global_size(0);

        if (gid >= 0 && gid < size)
        {
            /* Single-precision accumulator. */
            float output = 0;
            for (int k = 0; k < n; k++)
            {
                /* Term k for this work-item is stored at gid + k*size. */
                output += pFirstArray[gid + k*size] * pSecondArray[k];
            }

            pOutput[gid] = output;
        }
    }

If I execute the same operations on the CPU, I get different results, mainly from the 6th or 7th decimal digit onward. Why this strange behaviour? The Khronos OpenCL specification (v1.2) says that x+y and x*y are correctly rounded and IEEE 754 compliant.
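To put a number on the discrepancy, the CPU side of the comparison could look roughly like the sketch below. It is only an illustration: the function names `evaluate_product_cpu`, `evaluate_product_ref` and `relative_difference` are hypothetical and not part of the original code. It assumes the same indexing as the kernel (`gid + k*size`), and it adds a double-precision baseline so that both the CPU and the GPU float results can be compared against something more accurate.

```c
/* Hypothetical host-side reference (names are assumptions, not from the post).
   Uses the kernel's indexing: term k for work-item gid is first[gid + k*size]. */
void evaluate_product_cpu(const float *first, int n, const float *second,
                          float *out, int size)
{
    for (int gid = 0; gid < size; gid++) {
        float acc = 0.0f;  /* single-precision accumulator, as in the kernel */
        for (int k = 0; k < n; k++)
            acc += first[gid + k * size] * second[k];
        out[gid] = acc;
    }
}

/* Same sum for one output element, accumulated in double
   as a higher-precision baseline. */
double evaluate_product_ref(const float *first, int n, const float *second,
                            int size, int gid)
{
    double acc = 0.0;
    for (int k = 0; k < n; k++)
        acc += (double)first[gid + k * size] * (double)second[k];
    return acc;
}

/* Relative difference between a float result and the double baseline
   (falls back to the absolute difference when the baseline is zero). */
double relative_difference(float value, double ref)
{
    double diff = (double)value - ref;
    if (diff < 0.0) diff = -diff;
    if (ref < 0.0) ref = -ref;
    return ref != 0.0 ? diff / ref : diff;
}
```

A float carries a 24-bit significand, which is about 7 significant decimal digits (`FLT_EPSILON` is roughly 1.19e-7). If `relative_difference` for both the CPU and the GPU outputs stays within a small multiple of that value, the two results agree to about the precision a float accumulator can carry, even though each individual x+y and x*y is correctly rounded.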

Any ideas?