
bayoumi
Journeyman III

Full Precision on Transcendental Math

Hi,
I tried a program that uses math operations such as exp, pow, log, etc.
They seem to work only with float types.
brcc accepts double input and output streams for these operations, but at run time I get erroneous results.
Thanks
Amr
0 Likes
14 Replies

Hi Amr,

Currently, the transcendental functions such as sin/cos are handled natively by the hardware. The Radeon 3870 and FireStream 9170 have native single-precision transcendentals. What Brook+ is doing is casting doubles down to floats for the transcendentals and then casting them back to doubles. There may be some problem with that path and we are having the team take a look.

In a future release, we'll be introducing an emulation path for double precision transcendentals.

Michael.
0 Likes
bayoumi
Journeyman III

Hi Michael,
I tried to use something like:

y = exp((float)(x)); where y<> and x<> are double out/in 1D streams, and also y = (double)(exp(x)); or y = (double)(exp((float)x));
x was in the range -1.0 to 1.0.
None of these worked. I also tried sqrt(abs(x)), pow(x,a), and log(abs(x)). I believe the same applies to all of them.
I also noticed that there is no log10 function. Some applications in EE use it.
Thanks
Amr
0 Likes

Hi Amr,

Can you try using a separate variable to convert from double to float, do the operation on a float to a float, and then use a separate variable to convert from a float back to a double? I'm just trying to isolate this and see if Brook+ is just misbehaving when doing the casting all at once.

i.e.

double a;
float a_float;
float b_float;
double b;

a_float = (float)a;
b_float = exp(a_float);
b = (double)b_float;


Thanks!

Michael.
0 Likes
bayoumi
Journeyman III

Dear Michael
I tried several tests:
1- The test you requested works fine with the exp() function using intermediate variables, and I can assign the result b, after type casting, to an "out double b_out<>" without problems.

2- To test type casting on streams, I used an input stream "float a_in<>" and assigned it to a double output stream with a type cast:
kernel void (float a_in<>, out double b_out<>){
b_out = (double)a_in;
}
This works fine as well

3- I use
kernel void (double a<>, out double b_out<>){
b_out = exp((float)a);
/* or b_out = (double)exp((float)a);*/
}

and this does NOT work.
It seems the problem is with type casting around the math function (probably the function prototype or similar), NOT with the math function itself and NOT with type casting on streams.

0 Likes
bayoumi
Journeyman III

To test type casting from double to float, I also tried :

kernel void (double a<>, out double b_out<>){
b_out = (double)((float)a);
}

and this works fine as well
Amr
0 Likes
bayoumi
Journeyman III

Hi Michael,
I was able to track the problem down further. It is the type cast on the output of the exp() function, not on its input.
If you use an intermediate "dummy" float stream for the output and then do a separate type cast, everything works.

kernel void (double a_double<>, float tmp_float<>, float out double b_double_out<>){
tmp_float = exp((float)a_double);
tmp_float = exp((float)a_double);
}

everything works OK.
If you try to use:
tmp_float = exp(a_double);
or
b_double_out = exp((float)a_double);
or
b_double_out = (double) exp((float)a_double);

then NOTHING works.
So for now, using a dummy float stream and type casting the input stream from double to float at the function input
is the only workaround I was able to find.
Thanks
Amr
0 Likes

Another alternative is to brew your own double precision transcendentals. If you really need the precision, then this seems like the only option at the moment. I was able to port the log2(double) function from the Cephes library to IL with good results. Of course, integrating this with Brook+ is another matter altogether. That's one of the reasons I moved to CAL.

Lukasz.

0 Likes

Hi Lukasz,

I have let the engineering team know that we need double precision transcendental functions for Brook+. I'm trying to get that project into the schedule.

Michael.
0 Likes

Hello Michael,

how far along is the work on the double precision transcendentals?

They are really important for scientific problems.

Having only multiplication for doubles is helpful, but it is not the aim of Stream.

Scientific problems need double precision, and the hardware can do it.

But for now software cannot use exp(), log(), sin(), cos(), etc. in double precision!

When will we be able to? 🙂

PS: hopefully with SDK 2.

Best wishes

Marek

0 Likes

Yes I'm waiting for that too.

And if possible, more features for Brook+ (although that won't be included in SDK 2, sigh).

0 Likes

Any news from that side ??

0 Likes

Hi,

after reading the release notes for SDK 2.0 beta3, I don't expect any double-precision exp() or sin() functionality for the GPU in 2009.

The FAQ says that double precision is optional in OpenCL (I think AMD/ATI, as an OpenCL partner, pushed for that because there seems to be a bigger problem in implementing this option).

So my hope that this will be possible in the near future has died. Maybe it needs a hardware fix (which I guess has not been done even in RV8xx).

That's really sad. GPUs have such potential for scientific tasks, but we cannot optimize our programs because there is no support so far.

Best wishes

Marek

0 Likes

Hello,

because there have been no comments about future support for, e.g., DP exp(), I googled a little to find hints about where the problem could be.

I found something about an HPC 64-bit exponential function implementation on FPGA (especially for scientific demands):

http://www.springerlink.com/content/n553027524j05066/

and other articles about lookup tables.

So my guess is that no GPU (up to RV8xx) contains a DP lookup table for exp(), sin(), etc.

So only an expensive software-side solution can be used on current GPUs, probably with worse performance.

I would be very happy about any reply from an ATI developer on this topic.

Best wishes

Marek

0 Likes

Originally posted by: NurEinMensch
Now I guess that no GPU (up to RV8xx) contains a DP lookup table for exp(), sin(), etc. So only an expensive software-side solution can be used on current GPUs, probably with worse performance.

I'm not an ATI developer, but I have at least implemented my own exp() in IL for my code. It works more or less the same way as on a CPU. Those lookup tables are not stored in an on-chip ROM or the like either; they are simply provided by the software. exp() is quite expensive on CPUs, too.

Furthermore, one doesn't need the lookup tables at all (but one could use the constant buffer for it, if one wants). There are other implementations using the quotient of two power series of quite low order (3, if I remember it right). So one just needs an argument reduction, a few constants for the power series, the division (which is also done in software btw.) and the ldexp instruction (which the GPU hardware is capable of). There are several different implementations out there using this scheme and it works also on GPUs. Maybe it is not the fastest possible algorithm, but it isn't that slow either if you compare it with the CPU.

0 Likes