Hi,
I have a Sapphire HD 6790 card which, according to the AMD specs, does have double-precision capabilities:
http://www.amd.com/us/products/desktop/graphics/amd-radeon-hd-6000/hd-6790/Pages/amd-radeon-hd-6790-overview.aspx#3
I installed acmlgpu1.1.2 and Catalyst 11.6. When I execute "GPGPUexamples/Info.exe", I get this in the GPU part:
gpu0:
Type: CALtarget(17) (unknown type)
Revision: 20
Maximum resource 1D width: 16384
Maximum resource 2D width: 16384
Maximum resource 2D height: 16384
Local GPU RAM: 1024 megabytes
Uncached remote GPU memory: 1788 megabytes
Cached remote GPU memory: 508 megabytes
GPU device clock rate: 840 megahertz
GPU memory clock rate: 1050 megahertz
Wavefront size: 64
Number of SIMDs: 10
Number of shader engines: 2
double precision: Not supported
local data share: Supported
global data share: Supported
global GPR: Supported
compute shader: Supported
memexport: Supported
calResCreate pitch alignment: 256 data elements
calResCreate address alignment: 256 bytes
Unaligned Access Views (UAVs): 12
3D program grid: Supported
Note the "not supported" for "double precision". Running "/gfortran/examples/performance/time_dgemm.exe" I get:
"Warning: no suitable GPUs found for double-precision GPGPU operations."
What's going on here?
Best regards,
Jonas
I suspect that is a serious copy-paste mistake. Only the high-end GPUs support double precision (DP). When a card does support it, compute power is always listed separately for SP and DP. The HD 6790 does not support DP, even though it is listed on the website. A dire mistake indeed.
HD5850, HD5870, HD5970, HD6950, HD6970, HD6990 are the only graphics cards that support DP operations.
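To check what the runtime itself reports rather than relying on a spec page, the Info.exe output quoted above can be scanned for its "double precision" line. The following is only a minimal Python sketch: it assumes the output keeps the "gpuN:" header and "key: value" layout pasted in this thread, and the function names parse_info_output and supports_double are made up for illustration, not part of ACML-GPU.

```python
import re


def parse_info_output(text):
    """Parse Info-style output into {device_name: {field: value}}.

    Assumes the layout pasted in this thread: a "gpuN:" header line
    followed by "key: value" lines. Field names are lower-cased.
    """
    devices = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        header = re.fullmatch(r"(gpu\d+):", line)
        if header:
            # Start collecting fields for a new device.
            current = devices.setdefault(header.group(1), {})
            continue
        if current is not None and ":" in line:
            # Split on the first colon only, so values may contain colons.
            key, _, value = line.partition(":")
            current[key.strip().lower()] = value.strip()
    return devices


def supports_double(fields):
    """True only if the device reports double precision as 'Supported'."""
    return fields.get("double precision", "").lower() == "supported"


# Shortened sample taken from the output posted above.
sample = """gpu0:
Type: CALtarget(17) (unknown type)
Number of SIMDs: 10
double precision: Not supported
local data share: Supported
"""

for name, fields in parse_info_output(sample).items():
    print(name, "double precision:", supports_double(fields))
```

For the sample above this prints "gpu0 double precision: False", which matches the warning from time_dgemm.exe.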
First of all, thank you for the quick reply!
That is very sad. I chose the card because I wanted to make some numerical experiments with the GPU. Guess I'm stuck with my CPU now ;-(
Regards,
Jonas
While it is very frustrating to choose something in hope of a feature that turns out not to be present, using OpenCL for CPU computing might still be worthwhile, as the AMD compiler is very adept at vectorizing code with SSE instructions.
In my experience, OpenCL kernels outperform serial code compiled with gcc by about 10 times on an Intel Core i7.
jrauch,
Sorry for the inconvenience caused. I will ask someone to remove the double-precision capability from that page.