Archives Discussions

Iska91 · ‎01-12-2010

Hey everyone,

I finally got OpenCL up and running, so I started my first tests. I calculated Pi using the Taylor series of arctan(1) (= 1/4 Pi), but the value calculated by the GPU differs from the one calculated by the CPU even though both use the same code!
int main:


 cout << setprecision(10) << setiosflags(ios::fixed) << setiosflags(ios::showpoint);

 cout << "Calculating Pi using GPU (1024 serie)\n\n";

 // Initialize Host application
 if(initializeHost()==1)
 return 1;

 // Initialize OpenCL resources
 if(initializeCL()==1)
 return 1;

 int iTime = 0;

 iTime = runCLKernels();
 // Run the CL program
 if(iTime==1)
 return 1;

 float fPi = 0.0f;
 for(cl_uint i = 0; i < width; i++)
 {
 fPi += output[ i ];
 }

 cout << "Pi is :" << fPi*4.0f << "\n";
 cout << "Time total :" << iTime << "\n\n\n";
 system("pause");

 // Releases OpenCL resources
 if(cleanupCL()==1)
 return 1;

 // Release host resources
 cleanupHost();

 cout << "Calculating Pi using CPU (1024 serie)\n\n";

 iStart = GetTickCount();

 fPi = 0.0f;
 int iCount = 0;
 int iTotal = 256 * 10000;
 while (iCount < iTotal)
 {
 fPi += pow(-1.0, iCount) / (2 * iCount + 1);

 iCount++;
 }

 iEnd = GetTickCount();

 cout << "Pi is :" << fPi*4.0f << "\n";
 cout << "Time total :" << iEnd - iStart << "\n";

 system("pause");

 return 0;

Kernel


 __kernel void MainKernel(__global float * output)
{
 uint xid = get_global_id(0);

 uint iCount = xid * 10000;
 float fTemp = 0.0f;
 while (iCount < (xid * 10000 + 10000))
 {
 fTemp += pow((-1), iCount) / (2 * iCount + 1);

 iCount++;
 }
 output[xid] = fTemp;
}

Output:


Calculating Pi using GPU (1024 serie)

Pi is: 3.1415958405
Time total: 62

Calculating Pi using CPU (1024 serie)

Pi is: 3.1415963173
Time total: 406

Does anyone know what causes this error? It becomes bigger when I make the series longer.

Regards,
Iska

hazeman · ‎01-12-2010

The FPU on the x86 architecture internally uses 80-bit arithmetic. The GPU, on the other hand, uses 32-bit floats throughout. That is most probably the cause of your problem.

To improve accuracy, use double or more advanced techniques like quad float or double-double (google QD, the C++/Fortran-90 double-double and quad-double package).
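As a rough illustration of the "more advanced techniques" idea, here is a minimal C++ sketch of Kahan (compensated) summation. Note that this is a simpler technique than the double-double arithmetic named above, swapped in because it fits in a few lines; it is not taken from the QD package. All function names are hypothetical, and it assumes the compiler is not run with aggressive floating-point options such as -ffast-math, which can optimize the compensation away.

```cpp
#include <cstdint>

// Term k of the series for pi/4: (-1)^k / (2k + 1).
// The sign comes from the parity of k, so no pow() call is needed.
double leibniz_term(std::uint64_t k) {
    const double sign = (k % 2 == 0) ? 1.0 : -1.0;
    return sign / (2.0 * static_cast<double>(k) + 1.0);
}

// Plain float accumulation: every addition rounds the running sum to 32 bits.
float pi_naive_float(std::uint64_t terms) {
    float sum = 0.0f;
    for (std::uint64_t k = 0; k < terms; ++k) {
        sum += static_cast<float>(leibniz_term(k));
    }
    return 4.0f * sum;
}

// Kahan summation: still float storage only, but a second float carries
// an estimate of the low-order bits lost in each addition.
float pi_kahan_float(std::uint64_t terms) {
    float sum = 0.0f;
    float compensation = 0.0f;
    for (std::uint64_t k = 0; k < terms; ++k) {
        const float y = static_cast<float>(leibniz_term(k)) - compensation;
        const float t = sum + y;          // low-order bits of y are lost here
        compensation = (t - sum) - y;     // recover (the negative of) what was lost
        sum = t;
    }
    return 4.0f * sum;
}

// Double accumulation of the same terms, used as a reference value.
double pi_double(std::uint64_t terms) {
    double sum = 0.0;
    for (std::uint64_t k = 0; k < terms; ++k) {
        sum += leibniz_term(k);
    }
    return 4.0 * sum;
}
```

Calling all three functions with the same term count (for example 2560000, as in the CPU loop of the first post) and printing the results side by side should show how far each float variant drifts from the double reference.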

Iska91 · ‎01-12-2010

I found that a float has a pretty low accuracy, but OpenCL (on ATI 4850) fails to build when I'm using doubles. Is there a way to fix this?

omkaranathan · ‎01-12-2010

Double is not supported currently.

Iska91 · ‎01-12-2010

I noticed

But does that mean I'm screwed, along with a whole lot of scientific applications that require high accuracy?

eduardoschardong · ‎01-12-2010

The reason the GPU and CPU do not get the same results may not be precision (and BTW, the CPU may not be using 80-bit precision for intermediate results) but the ugly fact that, when using floats or doubles, additions and multiplications are not associative. I mean, a + b + c may differ from c + b + a. When you compare the results, the algorithm is a bit different: on the CPU you sum from first to last, while on the GPU you sum in blocks and then sum the results of the blocks. With floats, this little difference is enough to yield different results.

Also, if you are trying to compute a lot of pi digits, double precision will not be enough either, and you will probably not like Taylor series anymore.
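To make the point about summation order concrete, here is a small C++ sketch. It first shows three floats whose sum depends on grouping, then sums the same series once in a single running total (CPU style) and once as per-block partial sums that are added afterwards (GPU style). The function names and the block layout parameters are assumptions for illustration, and the sketch assumes float expressions are evaluated in 32-bit precision (as on typical x86-64 builds).

```cpp
#include <cstdint>

// (a + b) + c, with the intermediate result stored as a float.
float sum_left_first(float a, float b, float c) {
    const float ab = a + b;
    return ab + c;
}

// a + (b + c), with the intermediate result stored as a float.
float sum_right_first(float a, float b, float c) {
    const float bc = b + c;
    return a + bc;
}

// (-1)^k / (2k + 1), evaluated in float.
float leibniz_term_f(std::uint32_t k) {
    const float sign = (k % 2 == 0) ? 1.0f : -1.0f;
    return sign / (2.0f * static_cast<float>(k) + 1.0f);
}

// CPU-style order: one running total over all terms.
float pi_sequential(std::uint32_t blocks, std::uint32_t terms_per_block) {
    float sum = 0.0f;
    for (std::uint32_t k = 0; k < blocks * terms_per_block; ++k) {
        sum += leibniz_term_f(k);
    }
    return 4.0f * sum;
}

// GPU-style order: each block gets its own partial sum,
// and the partial sums are added together at the end.
float pi_blocked(std::uint32_t blocks, std::uint32_t terms_per_block) {
    float total = 0.0f;
    for (std::uint32_t b = 0; b < blocks; ++b) {
        float partial = 0.0f;
        const std::uint32_t first = b * terms_per_block;
        for (std::uint32_t k = first; k < first + terms_per_block; ++k) {
            partial += leibniz_term_f(k);
        }
        total += partial;
    }
    return 4.0f * total;
}
```

With a = 1e8f, b = -1e8f and c = 1.0f, sum_left_first returns 1 while sum_right_first returns 0, because -1e8f + 1.0f rounds back to -1e8f. Comparing pi_sequential(256, 10000) with pi_blocked(256, 10000) applies the same effect to the series from the first post; the two results are not guaranteed to match in the last digits.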

nou · ‎01-13-2010

The 58xx has beta double support for the simple +, -, *, / operations, although I have not managed to use it successfully myself.

MicahVillmow · ‎01-13-2010

If you specify the cl_khr_fp64 extension, you will get experimental support for the basic math operations for doubles along with I/O.
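Before relying on the extension, a host program can check whether the device reports it. In OpenCL the device's extensions come back as one space-separated string from clGetDeviceInfo with CL_DEVICE_EXTENSIONS. The sketch below covers only the string check and leaves the OpenCL call out so that it compiles without the SDK; has_extension is a hypothetical helper name, not part of any API.

```cpp
#include <sstream>
#include <string>

// Returns true if `name` appears as a whole token in the space-separated
// extension list. Comparing whole tokens avoids a false match when one
// extension name is a prefix of another.
bool has_extension(const std::string& extension_list, const std::string& name) {
    std::istringstream tokens(extension_list);
    std::string token;
    while (tokens >> token) {
        if (token == name) {
            return true;
        }
    }
    return false;
}
```

A caller would pass the string obtained from the device and the name "cl_khr_fp64", and fall back to a float kernel when the check returns false.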

Iska91 · ‎01-14-2010

Thank you for your reply, MicahVillmow.

My code is like this ATM (kernel):

#pragma OPENCL EXTENSION cl_khr_fp64 : enable

__kernel void MainKernel(__global double * output)
{
uint xid = get_global_id(0);

uint iCount = xid * 10000;
double dTemp = 0.0;
while (iCount < (xid * 10000 + 10000))
{
dTemp += pow((-1.0), iCount) / (2 * iCount + 1);

iCount++;
}
output[xid] = dTemp;
}

But I get the following error:
C:\Users\Jasper\AppData\Local\Temp\OCL7D6A.tml.obj:fake: (.text+0xa4): undefined reference to '__pow_f64'
C:\Users\Jasper\AppData\Local\Temp\OCL7D6A.tml.obj:fake: (.text+0x184): undefined reference to '__pow_f64'
Error: Building Program (clBuildProgram)

Do you know what causes this error?

Thanks in advance

omkaranathan · ‎01-14-2010

Iska,

Double is only available as a preview feature. Only addition, subtraction and multiplication operators are available for use presently.

You can expect full support in a future release.

Please refer to KB article, http://developer.amd.com/support/KnowledgeBase/Lists/KnowledgeBase/DispForm.aspx?ID=88 for detailed info.

Iska91 · ‎01-14-2010

Okay, got it

Looking forward to the next release

Edit:
It works

#pragma OPENCL EXTENSION cl_khr_fp64 : enable

__kernel void MainKernel(__global double * output)
{
uint xid = get_global_id(0);

uint iCount = xid * 10000;
double dTemp = 0.0;
bool bFlag = false;
while (iCount < (xid * 10000 + 10000))
{
if (bFlag)
{
dTemp += (double) -1 / (2 * iCount + 1);
bFlag = false;
}
else
{
dTemp += (double) 1 / (2 * iCount + 1);
bFlag = true;
}

iCount++;
}
output[xid] = dTemp;
}
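For checking the kernel output on the host, a double-precision reference with the same block layout can help. The sketch below is an assumption about how such a check might look, not code from this thread; the function names are hypothetical. It takes the sign from the parity of the term index rather than from a toggled flag. The flag in the kernel starts each work-item on a positive term, which is correct only because every block starts at an even index (10000 terms per work-item); the parity test does not depend on that.

```cpp
#include <cstdint>
#include <vector>

// Partial sum that one work-item would produce:
// terms [block_id * terms_per_block, (block_id + 1) * terms_per_block).
double block_partial_sum(std::uint32_t block_id, std::uint32_t terms_per_block) {
    const std::uint64_t first =
        static_cast<std::uint64_t>(block_id) * terms_per_block;
    double sum = 0.0;
    for (std::uint64_t k = first; k < first + terms_per_block; ++k) {
        const double sign = (k % 2 == 0) ? 1.0 : -1.0;  // (-1)^k without pow()
        sum += sign / static_cast<double>(2 * k + 1);
    }
    return sum;
}

// Host-side stand-in for the kernel's output buffer.
std::vector<double> reference_output(std::uint32_t work_items,
                                     std::uint32_t terms_per_block) {
    std::vector<double> output(work_items);
    for (std::uint32_t id = 0; id < work_items; ++id) {
        output[id] = block_partial_sum(id, terms_per_block);
    }
    return output;
}

// Final reduction, as done in main() in the first post.
double pi_from_partials(const std::vector<double>& partials) {
    double sum = 0.0;
    for (double p : partials) {
        sum += p;
    }
    return 4.0 * sum;
}
```

Comparing reference_output(width, 10000) element by element with the buffer read back from the GPU would show whether any remaining difference comes from individual work-items or from the final reduction.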

Thanks very much, guys! You saved me.

OpenCL inaccurate