Iska91
Journeyman III

OpenCL inaccurate

Hey everyone,

I finally got OpenCL up and running, so I started my first tests. I calculated Pi using the Taylor series of arctan(1) (= Pi/4), but the value calculated by the GPU differs from the one calculated by the CPU, even though both use the same code!
int main:


cout << setprecision(10) << setiosflags(ios::fixed) << setiosflags(ios::showpoint);

cout << "Calculating Pi using GPU (1024 serie)\n\n";

// Initialize Host application
if(initializeHost()==1)
return 1;

// Initialize OpenCL resources
if(initializeCL()==1)
return 1;

int iTime = 0;

iTime = runCLKernels();
// Run the CL program
if(iTime==1)
return 1;

float fPi = 0.0f;
for(cl_uint i = 0; i < width; i++)
{
 fPi += output[ i ];
}

cout << "Pi is :" << fPi*4.0f << "\n";
cout << "Time total :" << iTime << "\n\n\n";
system("pause");

// Releases OpenCL resources
if(cleanupCL()==1)
return 1;

// Release host resources
cleanupHost();

cout << "Calculating Pi using CPU (1024 serie)\n\n";

iStart = GetTickCount();

fPi = 0.0f;
int iCount = 0;
int iTotal = 256 * 10000;
while (iCount < iTotal)
{
fPi += pow(-1.0, iCount) / (2 * iCount + 1);

iCount++;
}

iEnd = GetTickCount();

cout << "Pi is :" << fPi*4.0f << "\n";
cout << "Time total :" << iEnd - iStart << "\n";

system("pause");

return 0;


Kernel

 __kernel void MainKernel(__global float * output)
{
uint xid = get_global_id(0);

uint iCount = xid * 10000;
float fTemp = 0.0f;
while (iCount < (xid * 10000 + 10000))
{
fTemp += pow((-1), iCount) / (2 * iCount + 1);

iCount++;
}
output[xid] = fTemp;
}


Output:

Calculating Pi using GPU (1024 serie)

Pi is: 3.1415958405
Time total: 62

 

Calculating Pi using CPU (1024 serie)

Pi is: 3.1415963173
Time total: 406

Does anyone know what causes this error? It grows when I make the series longer.

 

Regards,
Iska



10 Replies
hazeman
Adept II

The FPU on the x86 architecture internally uses 80-bit arithmetic. The GPU, on the other hand, uses 32 bits throughout. That is most probably the cause of your problem.

To improve accuracy, use double or more advanced techniques such as quad-float or double-double arithmetic (google QD, the C++/Fortran-90 double-double and quad-double package).
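The gap between 32-bit and wider arithmetic can be reproduced on the host alone. The following is a minimal sketch, not code from this thread: leibnizPi is a hypothetical helper that sums the same arctan(1) series with an accumulator of a chosen type, assuming IEEE 754 float and double.

```cpp
// Sum the first `terms` terms of the series 1 - 1/3 + 1/5 - ... using an
// accumulator of type T, then scale by 4 to approximate Pi.
// leibnizPi is a hypothetical name introduced for this illustration.
template <typename T>
T leibnizPi(unsigned terms) {
    T sum = 0;
    T sign = 1;
    for (unsigned k = 0; k < terms; ++k) {
        // 2 * k + 1 stays exactly representable in float for the term
        // counts used in this thread (well below 2^24).
        sum += sign / static_cast<T>(2 * k + 1);
        sign = -sign;
    }
    return 4 * sum;
}
```

Printing leibnizPi<float>(2560000) next to leibnizPi<double>(2560000) with setprecision(10) will typically show the two values parting ways somewhere around the sixth or seventh decimal place, although the exact digits depend on the compiler and hardware.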


I found that a float has a pretty low accuracy, but OpenCL (on ATI 4850) fails to build when I'm using doubles. Is there a way to fix this?
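One float-only option, when doubles are unavailable, is compensated (Kahan) summation. Nobody in this thread proposed it; it is offered here as a related technique in the spirit of the double-double suggestion. The sketch below is host-side C++ with hypothetical function names, and it assumes the compiler does not reorder floating-point operations (no fast-math style flags), since such reordering can remove the compensation.

```cpp
#include <vector>

// Plain left-to-right summation with a single float accumulator.
float naiveSum(const std::vector<float>& values) {
    float sum = 0.0f;
    for (float v : values) sum += v;
    return sum;
}

// Kahan summation: `compensation` tracks the low-order bits that were
// rounded away by the previous addition and feeds them back in.
float kahanSum(const std::vector<float>& values) {
    float sum = 0.0f;
    float compensation = 0.0f;
    for (float v : values) {
        float y = v - compensation;    // apply the pending correction
        float t = sum + y;             // low-order bits of y may be lost here
        compensation = (t - sum) - y;  // recover what was lost
        sum = t;
    }
    return sum;
}
```

For example, adding eight copies of 2^-25 to 1.0f leaves naiveSum stuck at 1.0f, because each term is below half a unit in the last place, while kahanSum accumulates the small terms and moves past 1.0f.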


Double is not currently supported.


I noticed.

But does that mean I'm screwed, along with a whole lot of scientific applications that require high accuracy?


The GPU and CPU results may differ for a reason other than precision (and, BTW, the CPU may not be using the 80-bit intermediate results at all): the ugly fact that, with floats or doubles, additions and multiplications are not associative. Every operation rounds, so a + b + c may differ from c + b + a. The two computations being compared do not use quite the same algorithm: on the CPU you sum from first to last, while on the GPU you sum in blocks and then sum the block results. With floats, this little difference is enough to yield different results.
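The ordering effect described above can be shown with a few lines of host code. This is a minimal sketch with hypothetical function names (sumSequential, sumBlocked), assuming IEEE 754 single precision without extended intermediate precision; it is not the code that produced the numbers in this thread.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Sum left to right with one float accumulator, like the CPU loop.
float sumSequential(const std::vector<float>& values) {
    float total = 0.0f;
    for (float v : values) total += v;
    return total;
}

// Sum consecutive blocks of `blockSize` values, then add the block totals,
// mimicking one partial sum per work-item. Assumes blockSize > 0.
float sumBlocked(const std::vector<float>& values, std::size_t blockSize) {
    float total = 0.0f;
    for (std::size_t start = 0; start < values.size(); start += blockSize) {
        std::size_t end = std::min(start + blockSize, values.size());
        float partial = 0.0f;
        for (std::size_t i = start; i < end; ++i) partial += values[i];
        total += partial;
    }
    return total;
}
```

With the input {1e8f, 1.0f, -1e8f, 1.0f}, whose exact sum is 2, the sequential version returns 1 and the version with blocks of two returns 0: the same numbers, added in a different grouping, round differently.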

 

Also, if you are trying to compute a lot of digits of pi, double precision will not be enough either, and you will probably not like Taylor series anymore.
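The remark about Taylor series can be made quantitative with the standard alternating-series bound (a textbook result, not something stated in this thread). With S_N denoting the sum of the first N terms:

```latex
\frac{\pi}{4} = \sum_{k=0}^{\infty} \frac{(-1)^k}{2k+1},
\qquad
S_N = \sum_{k=0}^{N-1} \frac{(-1)^k}{2k+1},
\qquad
\left| \pi - 4 S_N \right| \le \frac{4}{2N+1}
```

For the N = 256 * 10000 = 2,560,000 terms used in the CPU loop, the bound is about 7.8e-7, so even with exact arithmetic only about six decimal places of Pi are guaranteed, and each additional digit costs roughly ten times as many terms.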

 


The 58xx series has beta double support for the simple +, -, *, / operations, although I have not managed to use it successfully.


If you specify the cl_khr_fp64 extension, you will get experimental support for the basic math operations on doubles, along with I/O.

Thank you for your reply, MicahVillmow.

My kernel code currently looks like this:

#pragma OPENCL EXTENSION cl_khr_fp64 : enable

__kernel void MainKernel(__global double * output)
{
   uint xid = get_global_id(0);

   uint iCount = xid * 10000;
   double dTemp = 0.0;
   while (iCount < (xid * 10000 + 10000))
   {
     dTemp += pow((-1.0), iCount) / (2 * iCount + 1);

     iCount++;
   }
   output[xid] = dTemp;
}

But I got the following error:
C:\Users\Jasper\AppData\Local\Temp\OCL7D6A.tml.obj:fake: (.text+0xa4): undefined reference to '__pow_f64' 
C:\Users\Jasper\AppData\Local\Temp\OCL7D6A.tml.obj:fake: (.text+0x184): undefined reference to '__pow_f64' 
Error: Building Program (clBuildProgram)

Do you know what causes this error?

Thanks in advance


Iska,

Double is only available as a preview feature. Only the addition, subtraction and multiplication operators are currently available.

You can expect full support in a future release. 

Please refer to the KB article at http://developer.amd.com/support/KnowledgeBase/Lists/KnowledgeBase/DispForm.aspx?ID=88 for detailed info.


Okay, got it

Looking forward to the next release

 

Edit:
It works

#pragma OPENCL EXTENSION cl_khr_fp64 : enable

__kernel void MainKernel(__global double * output)
{
uint xid = get_global_id(0);

uint iCount = xid * 10000;
double dTemp = 0.0;
bool bFlag = false;
while (iCount < (xid * 10000 + 10000))
{
if (bFlag)
{
dTemp += (double) -1 / (2 * iCount + 1);
bFlag = false;
}
else
{
dTemp += (double) 1 / (2 * iCount + 1);
bFlag = true;
}

iCount++;
}
output[xid] = dTemp;
}
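The working kernel above avoids pow() by toggling a flag, which gives the right signs because every work-item starts at an even index (xid * 10000). A slightly more general variant derives the sign from the parity of the term index, so it does not depend on where a block starts. The host-side sketch below compares the two ideas; partialSumPow and partialSumParity are hypothetical names, and this is an illustration rather than the poster's code.

```cpp
#include <cmath>

// Partial sum of the series over terms [first, first + count),
// using pow() for the sign as in the original kernel.
double partialSumPow(unsigned first, unsigned count) {
    double sum = 0.0;
    for (unsigned k = first; k < first + count; ++k)
        sum += std::pow(-1.0, static_cast<double>(k)) / (2.0 * k + 1.0);
    return sum;
}

// Same partial sum without pow(): even terms are added, odd terms
// subtracted, regardless of the index at which the block begins.
double partialSumParity(unsigned first, unsigned count) {
    double sum = 0.0;
    for (unsigned k = first; k < first + count; ++k) {
        double sign = (k % 2 == 0) ? 1.0 : -1.0;
        sum += sign / (2.0 * k + 1.0);
    }
    return sum;
}
```

Both functions should agree to within rounding for any block, and 4.0 * partialSumParity(0, 2560000) approximates Pi to about six decimal places.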


Thank you very much, guys! You saved me.
