I want to run two MPI processes on a node with an HD 4870 X2. Each MPI process calls dgemm. I tried setting the ACML-GPU environment variable to 0x1 in one MPI process and 0x2 in the other to select which GPU each process uses. However, this does not work:
noderank = 1
gpumask = 1
noderank = 2
gpumask = 2
ERROR: unable to allocate minimum GPU memory
              Total     Available  Last Request
Local:        1024 MB   1001 MB    16777216 ( 16 MB)  FAILED
Remote (NC):  1233 MB   1211 MB    0 ( 0 MB)          FAILED
Remote (C):   489 MB    489 MB     0 ( 0 MB)          FAILED
ERROR: unable to allocate minimum GPU memory
              Total     Available  Last Request
Local:        1024 MB   1001 MB    16777216 ( 16 MB)  FAILED
Remote (NC):  1233 MB   1212 MB    0 ( 0 MB)          FAILED
Remote (C):   489 MB    489 MB     0 ( 0 MB)          FAILED
ERROR: Failed to initialize GPUs
ERROR: Failed to initialize GPU(s) for DGEMM
Warning: libCALBLAS is not reentrant!
Multi-threaded use of this library is not supported.
When GPU-accellerated ACML routines are called on multiple threads
concurrently, the requests will be executed serially, even if multiple
GPUs are present.
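
For clarity, the rank-to-mask mapping I am attempting can be sketched as follows. This is only an illustration in Python, not my actual program. The variable name "ACML-GPU" is written as in the description above and may not be the exact spelling the library reads, the helper names gpu_mask and select_gpu are made up for this sketch, and it assumes noderank is 1-based, as in the output above. It shows how the masks are chosen and does not address the allocation failure itself.

```python
import os

# Assumption: name written as in the post; check the library documentation
# for the exact environment variable it actually reads.
GPU_ENV_VAR = "ACML-GPU"


def gpu_mask(noderank):
    """Return a one-bit hex mask for a 1-based rank on the node."""
    if noderank < 1:
        raise ValueError("noderank is 1-based")
    # noderank 1 -> bit 0 (0x1), noderank 2 -> bit 1 (0x2)
    return hex(1 << (noderank - 1))


def select_gpu(noderank, env=os.environ):
    """Store the mask in the given environment mapping and return it.

    Assumption: the value has to be in place before the library
    initializes the GPUs, so this would run before the first dgemm call.
    """
    env[GPU_ENV_VAR] = gpu_mask(noderank)
    return env[GPU_ENV_VAR]
```

With noderank = 1 and noderank = 2 this gives the masks 0x1 and 0x2, matching the gpumask values printed above.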