berathebrain Jan 10, 2010 1:14 PM (in response to berathebrain)
Actually, if the input data is fairly large, it also fails for plain float. I changed the kernels and everything else in Reduction.cpp from uint4 to float, and sometimes I get correct results and sometimes incorrect ones.
Does anyone know where the problem is?

redditisgreat Jan 11, 2010 8:30 PM (in response to berathebrain)
It's obviously rounding error. If the CPU accumulates results in its 80-bit floating-point registers, you get less rounding error than if you do it with 32-bit variables on the GPU or in the SSE unit.

Berathebrain,
The problem is in floating-point arithmetic. Floating-point addition and multiplication are not associative, so the result depends on the order in which the operations are performed.
Example:
1. ( a + b ) + c is not necessarily equal to a + ( b + c )
2. ( a * b ) * c is not necessarily equal to a * ( b * c )
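
To make the point about addition concrete, here is a minimal sketch in Python (not the code from Reduction.cpp). It emulates 32-bit float addition by rounding every intermediate result to single precision. The helper names f32, add32, sequential_sum and pairwise_sum, and the sample values, are made up for illustration. The sequential loop stands in for a typical CPU reference sum and the pairwise version for a tree-style reduction as a GPU kernel might do it; the actual kernel's summation order may well differ.

```python
import struct

def f32(x):
    """Round a Python float (64-bit) to the nearest 32-bit float."""
    return struct.unpack("f", struct.pack("f", x))[0]

def add32(x, y):
    """Add two values, rounding the result to 32-bit precision."""
    return f32(f32(x) + f32(y))

# Three values chosen so that grouping changes the 32-bit result.
a, b, c = 1e8, -1e8, 1.0
left = add32(add32(a, b), c)    # (a + b) + c
right = add32(a, add32(b, c))   # a + (b + c)

def sequential_sum(values):
    """Left-to-right accumulation, like a simple CPU loop."""
    total = 0.0
    for v in values:
        total = add32(total, v)
    return total

def pairwise_sum(values):
    """Tree-style accumulation: sum each half, then combine."""
    if len(values) == 1:
        return f32(values[0])
    mid = len(values) // 2
    return add32(pairwise_sum(values[:mid]), pairwise_sum(values[mid:]))

data = [1e8, 1.0, -1e8, 1.0]    # exact sum is 2

print(left, right)                                # 1.0 0.0
print(sequential_sum(data), pairwise_sum(data))   # 1.0 0.0
```

Near 1e8 the spacing between adjacent 32-bit floats is 8, so adding 1.0 to 1e8 or to -1e8 is lost to rounding. Which additions lose it depends on the order, which is why the same input can give different sums on the CPU and on the GPU.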
