Multi 6990s with ACML-GPU 1.1.2 failed

Discussion created by rollyng on May 26, 2011
only 3~4 out of 8 GPUs reported working

Hi all,

I am starting this thread because I would like to compare DGEMM performance on 4 x HD6990 cards (8 GPUs in total) using caldgemm and acmlgpu.

The original discussion can be found at:

http://forums.amd.com/devforum/messageview.cfm?catid=390&threadid=150586&enterthread=y

However, I found that acmlgpu can use at most 4 of the 8 GPUs. Can anyone confirm this finding? Is this a current limitation? Thank you!

 

rolly@rolly-X8DTG-QF:/opt/acmlgpu1.1.2/GPGPUexamples$ ./dgemm_c_example.exe
ACML-GPU example: DGEMM call
--------------------------------------------------------------
Matrix A (3000 x 3000):
  1.6416 1.7286 1.1754 1.4190
  1.5516 1.1218 1.9234 1.8641
  1.2852 1.0358 1.9557 1.2583
  1.3752 1.7974 1.8278 1.4440
Matrix B (3000 x 3000):
  1.0000 0.0000 0.0000 0.0000
  0.0000 1.0000 0.0000 0.0000
  0.0000 0.0000 1.0000 0.0000
  0.0000 0.0000 0.0000 1.0000

ERROR: gpu4 - unable to allocate minimum cached system (GART) memory
gpu4           Total     Available   Last Request
Local:         2048 MB   196 MB      1845493760 (1760 MB)  ok
Remote (NC):   1787 MB   1720 MB     0 ( 0 MB)             FAILED
Remote (C):    508 MB    463 MB      5242880 ( 5 MB)       FAILED

ERROR: gpu5 - unable to allocate minimum cached system (GART) memory
gpu5           Total     Available   Last Request
Local:         2048 MB   196 MB      1845493760 (1760 MB)  ok
Remote (NC):   1787 MB   1720 MB     0 ( 0 MB)             FAILED
Remote (C):    508 MB    463 MB      5242880 ( 5 MB)       FAILED

ERROR: gpu6 - unable to allocate minimum cached system (GART) memory
gpu6           Total     Available   Last Request
Local:         2048 MB   196 MB      1845493760 (1760 MB)  ok
Remote (NC):   1787 MB   1720 MB     0 ( 0 MB)             FAILED
Remote (C):    508 MB    463 MB      5242880 ( 5 MB)       FAILED

ERROR: gpu7 - unable to allocate minimum cached system (GART) memory
gpu7           Total     Available   Last Request
Local:         2048 MB   196 MB      1845493760 (1760 MB)  ok
Remote (NC):   1787 MB   1728 MB     0 ( 0 MB)             FAILED
Remote (C):    508 MB    472 MB      5242880 ( 5 MB)       FAILED

WARNING: 4 out of 8 GPUs failed to initialize; proceeding with other(s).

Matrix C:
  5.4629 5.7522 3.9114 4.7221
  5.1634 3.7330 6.4005 6.2033
  4.2767 3.4469 6.5079 4.1874
  4.5763 5.9813 6.0825 4.8052

Time: 4 calls in 1.47 seconds, 146590 MFlops
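
As a sanity check on the MFlops figure in the output above, here is a minimal sketch of how such a rate is usually derived. It assumes the conventional 2*M*N*K operation count for one DGEMM call and that all 4 calls multiplied 3000 x 3000 matrices; the helper name dgemm_mflops is only illustrative and is not part of ACML-GPU.

```python
# Sanity check of the MFlops figure printed by the example run.
# Assumption: one DGEMM call (C = alpha*A*B + beta*C) costs 2*M*N*K
# floating-point operations, which is the conventional count.

def dgemm_mflops(m, n, k, calls, seconds):
    """Return the aggregate rate in MFlops for `calls` DGEMM calls."""
    flops = 2.0 * m * n * k * calls
    return flops / seconds / 1.0e6

# Values read from the output: 3000 x 3000 matrices, 4 calls, 1.47 s.
rate = dgemm_mflops(3000, 3000, 3000, calls=4, seconds=1.47)
print(f"{rate:.0f} MFlops")  # prints "146939 MFlops"
```

Under these assumptions the result is within about 0.3% of the reported 146590 MFlops, a gap consistent with the elapsed time being rounded to 1.47 seconds. So the roughly 147 GFlops shown here is the aggregate over the 4 GPUs that initialized, not all 8.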
