

berathebrain
Journeyman III

Reduction kernel sample not working with float4

I tried changing the reduction kernel sample to work with float4 values. The kernel produces small errors when adding numbers: the OpenCL result differs from the reference CPU reduction by about 0.01%, and with around a million values the absolute error becomes quite large.

Can someone confirm this, or am I doing something wrong?

It works with float, but not with float4.

I tried both the CPU context and the GPU context.

I have a Radeon 4870 and Windows 7 64-bit.

3 Replies
berathebrain
Journeyman III

Actually, if the input data is large enough, it also fails for plain float. I just changed the kernels and everything in Reduction.cpp from uint4 to float, and sometimes I get good results and sometimes bad ones.

Does anyone know where the problem is?


These are obviously rounding errors. If the CPU accumulates results in its 80-bit floating-point registers, you get fewer rounding errors than if you accumulate in 32-bit variables on the GPU or in the SSE unit.
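
To illustrate the accumulator-width effect, here is a minimal Python sketch, not the SDK sample itself. It assumes a hypothetical helper `f32` that emulates 32-bit rounding via `ctypes.c_float`, and it uses Python's 64-bit float as a stand-in for the wider CPU accumulator (Python has no 80-bit type). The input of one million copies of 0.1 is made up for the demonstration.

```python
import ctypes
import math

def f32(x):
    """Round a 64-bit Python float to the nearest 32-bit float (hypothetical helper)."""
    return ctypes.c_float(x).value

def sum_narrow(values):
    """Sequential sum where the accumulator is rounded to 32 bits after every add."""
    acc = 0.0
    for v in values:
        acc = f32(acc + v)
    return acc

def sum_wide(values):
    """Sequential sum of the same 32-bit inputs with a 64-bit accumulator."""
    acc = 0.0
    for v in values:
        acc += v
    return acc

# One million 32-bit inputs, all equal to the float nearest to 0.1.
values = [f32(0.1)] * 1_000_000

narrow = sum_narrow(values)
wide = sum_wide(values)
exact = math.fsum(values)  # correctly rounded reference sum

print("32-bit accumulator:", narrow)
print("64-bit accumulator:", wide)
print("reference         :", exact)
print("relative error (32-bit):", abs(narrow - exact) / exact)
print("relative error (64-bit):", abs(wide - exact) / exact)
```

The inputs are identical in both loops; only the width of the running total changes, and the 32-bit total drifts much further from the reference.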


Originally posted by: berathebrain

Actually, if the length of input data is somewhat large, it also fails for simple float. I have just changed the kernels and everything in Reduction.cpp from uint4 to float and sometimes I get good results and sometimes bad.

Does anyone know where is the problem?

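Berathebrain,

The problem is in floating-point arithmetic. Floating-point addition and multiplication are not associative, so the result depends on the order in which the operations are performed. A parallel reduction adds the values in a different order than a sequential CPU loop, which is why the two results can differ.

For example:

  1. (a + b) + c is not necessarily equal to a + (b + c)

  2. (a * b) * c is not necessarily equal to a * (b * c)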
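
The first inequality above can be checked directly. The following is a rough Python sketch under stated assumptions: `f32` and `add32` are hypothetical helpers that emulate 32-bit float addition with `ctypes.c_float`, and `pairwise_sum` is only a simplified stand-in for the order in which a tree-style parallel reduction combines values, not the actual kernel from the sample.

```python
import ctypes

def f32(x):
    """Round a 64-bit Python float to the nearest 32-bit float (hypothetical helper)."""
    return ctypes.c_float(x).value

def add32(a, b):
    """Emulated 32-bit float addition: add, then round the result to 32 bits."""
    return f32(f32(a) + f32(b))

# Same three operands, two groupings.
a, b, c = 1.0e8, -1.0e8, 1.0
left = add32(add32(a, b), c)    # (a + b) + c -> 1.0
right = add32(a, add32(b, c))   # a + (b + c) -> 0.0, because c is lost when added to b

def sequential_sum(values):
    """Left-to-right accumulation, as in a simple CPU reference loop."""
    acc = 0.0
    for v in values:
        acc = add32(acc, v)
    return acc

def pairwise_sum(values):
    """Tree-shaped accumulation: neighbours are combined level by level."""
    vals = list(values)
    if not vals:
        return 0.0
    while len(vals) > 1:
        if len(vals) % 2:
            vals.append(0.0)  # pad odd-length levels with the additive identity
        vals = [add32(vals[i], vals[i + 1]) for i in range(0, len(vals), 2)]
    return vals[0]

data = [f32(0.1)] * 65536
seq = sequential_sum(data)
tree = pairwise_sum(data)

print("(a + b) + c =", left)
print("a + (b + c) =", right)
print("sequential  =", seq)
print("pairwise    =", tree)
print("relative difference =", abs(seq - tree) / abs(tree))
```

Both summation orders are legitimate; they simply round at different points. For that reason it may be more practical to compare the CPU and OpenCL results with a relative tolerance than to test them for exact equality.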

 
