Does the ALU of the HD 5870 support native 64-bit calculations? Do the following instructions take the same number of cycles?
long int x, y, z;
z = x + y;
int a, b, c;
c = a + b;
Thank you in advance!
Does the ALU of HD 5870 support native 64 bit calculations?
It is not natively 64-bit, but it can do 64-bit calculations.
Do the following instructions take the same number of cycles?
AFAIK double-precision operations need 5 times as many cycles as single-precision operations.
--
Srdja
I believe a 64-bit integer add/subtract is emulated using 32-bit carry-add/borrow-subtract, so those long operations should take twice as many cycles as those int operations. In case you're wondering, 64-bit integer multiplication should take around four times as many cycles as 32-bit integer multiplication, and division would be even more expensive, all assuming emulation using basic 32-bit operations.
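To make the cycle estimate concrete, here is a minimal sketch in plain C of how such an emulation could look. It only illustrates the idea and is not what the compiler actually generates; the names `u64_pair`, `add64`, `mulhi32` and `mul64` are made up for this example, and `mulhi32` is built from 16-bit pieces on the assumption that only 32-bit arithmetic is available.

```c
#include <assert.h>
#include <stdint.h>

/* A 64-bit value held as two 32-bit halves (hypothetical type for illustration). */
typedef struct { uint32_t lo, hi; } u64_pair;

u64_pair split(uint64_t v) { u64_pair p = { (uint32_t)v, (uint32_t)(v >> 32) }; return p; }
uint64_t join(u64_pair p)  { return ((uint64_t)p.hi << 32) | p.lo; }

/* 64-bit add from two 32-bit adds: add the low words, then add the
   high words plus the carry out of the low add. */
u64_pair add64(u64_pair a, u64_pair b)
{
    u64_pair r;
    r.lo = a.lo + b.lo;                 /* wraps modulo 2^32 */
    uint32_t carry = (r.lo < a.lo);     /* 1 if the low add wrapped */
    r.hi = a.hi + b.hi + carry;
    return r;
}

/* High 32 bits of a 32x32 product, using 16-bit pieces so that no
   intermediate value needs more than 32 bits. */
uint32_t mulhi32(uint32_t a, uint32_t b)
{
    uint32_t al = a & 0xFFFFu, ah = a >> 16;
    uint32_t bl = b & 0xFFFFu, bh = b >> 16;
    uint32_t ll = al * bl, lh = al * bh, hl = ah * bl, hh = ah * bh;
    uint32_t mid = (ll >> 16) + (lh & 0xFFFFu) + (hl & 0xFFFFu);
    return hh + (lh >> 16) + (hl >> 16) + (mid >> 16);
}

/* Low 64 bits of a 64x64 product: four 32-bit multiplies in total
   (low and high half of lo*lo, plus the two cross terms). */
u64_pair mul64(u64_pair a, u64_pair b)
{
    u64_pair r;
    r.lo = a.lo * b.lo;
    r.hi = mulhi32(a.lo, b.lo) + a.lo * b.hi + a.hi * b.lo;
    return r;
}

/* Usage check: compare the emulation against the compiler's native uint64_t. */
void check_emulation(uint64_t x, uint64_t y)
{
    assert(join(add64(split(x), split(y))) == x + y);
    assert(join(mul64(split(x), split(y))) == x * y);
}
```

The add needs two 32-bit adds plus a carry, and the multiply needs four 32-bit multiplies, which is where the "twice" and "around four times" figures come from.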
Although add/subtract run at the full rate, multiply isn't so lucky.
The programming guide, section 4.13.1, has a bit about data type performance. Note that these GPUs aren't really even '32-bit' integer devices: a 32-bit multiply runs at 1/5 the rate of a 24-bit or float multiply.
Presumably all long ops are implemented using 32-bit ones.
Just because an instruction takes longer, or runs at a different rate, doesn't mean it's not a native instruction; even x86 CPU instructions have variable execution times. 32-bit integer multiply is a native instruction on the GPU. The difference, compared to 24-bit multiply, is that not all instruction slots can execute it, so its throughput is lower.
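For anyone wondering what a 24-bit multiply buys you in practice, here is a rough C model of an unsigned 24-bit multiply along the lines of OpenCL's `mul24` built-in. The function names are made up for this sketch, and the masking of out-of-range inputs is an assumption: as far as I know, the OpenCL spec leaves the result implementation-defined when the operands do not fit in 24 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Rough model of an unsigned 24-bit multiply: only the low 24 bits of
   each operand take part, and the result is truncated to 32 bits.
   Masking out-of-range inputs is an assumption made for this sketch. */
uint32_t mul24_model(uint32_t x, uint32_t y)
{
    uint32_t a = x & 0x00FFFFFFu;
    uint32_t b = y & 0x00FFFFFFu;
    return a * b;   /* low 32 bits of the (up to 48-bit) product */
}

/* Nonzero when both operands fit in 24 bits. */
int fits_in_24_bits(uint32_t x, uint32_t y)
{
    return (x >> 24) == 0 && (y >> 24) == 0;
}

/* Usage check: for in-range operands the 24-bit model agrees with an
   ordinary 32-bit multiply. */
void check_mul24(uint32_t x, uint32_t y)
{
    if (fits_in_24_bits(x, y))
        assert(mul24_model(x, y) == x * y);
}
```

So when the values are known to be small (array indices, for example), the 24-bit form gives the same answer as the 32-bit multiply.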
Thank you smatovic for your helpful answer!
Thank you settle! Very helpful answer!
I have to use a simple 64-bit calculation. I think an on-chip emulation is more efficient than a home-made emulation in C.
Thank you notzed! I did not expect 24-bit multiplication to be 5 times faster than 32-bit! I even wonder why OpenCL supports the strange 24-bit format.
Thank you Jeff. I agree. The boundary between hardware and software has become blurred. In my mind, however, a real 64-bit processor should complete a 64-bit addition in one step.