musale87
Adept I

Maths in double precision gives me different results on AMD and NVIDIA

Hi everyone,

I ported a cellular automaton from CPU to GPU using OpenCL, and I have the following hardware:

CPU Xeon 8 core

GPU NVIDIA k20c

GPU AMD R9 280x

The port is complete, but there is a problem with the R9 280x card.

Its results are very different from those of the CPU and the NVIDIA card (those two are very close to each other).

I wrote a simple set of kernels to show the different results:


#pragma OPENCL EXTENSION cl_khr_fp64 : enable

#define DIVISOR 1.0002349474678

/* Adds DIVISOR^10 to in[0]. */
__kernel void div_kernel(__global double *in) {
    in[0] += pow(DIVISOR, 10);
    barrier(CLK_GLOBAL_MEM_FENCE);
}

/* Divides in[0] by DIVISOR^10. */
__kernel void div_kernel2(__global double *in) {
    in[0] /= pow(DIVISOR, 10);
    barrier(CLK_GLOBAL_MEM_FENCE);
}

/* Divides in[0] by DIVISOR, without calling pow. */
__kernel void div_kernel3(__global double *in) {
    in[0] = in[0] / DIVISOR;
    barrier(CLK_GLOBAL_MEM_FENCE);
}



And these are the results after 1000 executions of each kernel:

OpenCL on CPU:

div_kernel = 101002.351960

div_kernel2 = 9544.561027

div_kernel3 = 79063.419905

OpenCL on NVIDIA:

div_kernel = 101002.351960

div_kernel2 = 9544.561027

div_kernel3 = 79063.419905

OpenCL on AMD:

div_kernel = 101007.072489

div_kernel2 = 86.949658

div_kernel3 = 79063.419905

As you can see, the plain division (div_kernel3) looks fine, but the pow function seems to have a rounding error on AMD, and when pow is combined with repeated division the error is amplified.
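To make the amplification concrete, here is a minimal host-side sketch in plain C (it is not part of the attached code). It assumes in[0] starts at 100000.0, which is inferred from the sums above rather than read from the attachment, and the helper names are made up for illustration. It backs out the value each pow call must have contributed on AMD from the div_kernel sum, then replays 1000 divisions with that value and with the host's own pow.

```c
#include <math.h>
#include <stdio.h>

#define DIVISOR 1.0002349474678
#define RUNS 1000

/* ASSUMPTION: start value of in[0], inferred from the printed sums,
   not taken from the attached host code. */
#define START 100000.0

/* Replays div_kernel on the host: add the same value p RUNS times. */
double replay_add(double start, double p) {
    double x = start;
    for (int i = 0; i < RUNS; ++i) x += p;
    return x;
}

/* Replays div_kernel2 on the host: divide by the same value p RUNS times. */
double replay_div(double start, double p) {
    double x = start;
    for (int i = 0; i < RUNS; ++i) x /= p;
    return x;
}

/* Backs out the per-call value that a div_kernel result implies. */
double implied_pow(double start, double sum_result) {
    return (sum_result - start) / RUNS;
}

/* Prints the host pow value next to the value implied by the AMD sum,
   and what 1000 divisions by each of them produce. */
void print_report(void) {
    double host = pow(DIVISOR, 10.0);
    double amd = implied_pow(START, 101007.072489); /* AMD div_kernel result */

    printf("pow per call: host %.9f, implied on AMD %.9f\n", host, amd);
    printf("div_kernel  replay: host %.6f\n", replay_add(START, host));
    printf("div_kernel2 replay: host %.6f, AMD-implied %.6f\n",
           replay_div(START, host), replay_div(START, amd));
}
```

Call print_report() from a main and link with -lm. Because the AMD value is reconstructed from a sum printed with six decimals, only its leading digits are meaningful, but that is enough to see how a small per-call difference in pow turns into a very large difference after 1000 divisions.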

The code is attached in case anyone wants to try these kernels themselves (tested on Linux and Mac).


My question is: why are the results on AMD different from those on NVIDIA?

Thanks to all in advance

EDIT:

If anyone has an AMD card and would like to help me find a solution, you can simply compile and run the code and report the results.

Thanks!

PS. I'm using Ubuntu 14.04 x86_64 with Catalyst-14.9.

11 Replies