I administer a Linux cluster with about 100 GPUs, and I need a way to monitor their utilization: for example, the fraction of time each GPU is actually busy and how much of its memory is in use.
Is there a utility similar to nvidia-smi for Radeon graphics cards that would show GPU usage statistics?
Does OpenCL provide an interface that could be used to monitor GPU utilization? For example, can I poll CL_DEVICE_AVAILABLE provided by clGetDeviceInfo to estimate how frequently a GPU is busy?
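To make the polling idea concrete, here is a rough sketch in Python of what I have in mind. The function name and the is_busy probe are hypothetical placeholders. I do not know whether CL_DEVICE_AVAILABLE, or anything else in OpenCL, can actually serve as such a probe, which is exactly what I am asking.

```python
import time
from typing import Callable


def estimate_busy_fraction(
    is_busy: Callable[[], bool],
    samples: int = 100,
    interval_s: float = 0.1,
) -> float:
    """Poll is_busy() `samples` times and return the fraction of polls
    that reported the device as busy.

    is_busy is a hypothetical probe; what it should call on a real
    Radeon/OpenCL system is the open question.
    """
    if samples <= 0:
        raise ValueError("samples must be positive")
    busy_count = 0
    for i in range(samples):
        if is_busy():
            busy_count += 1
        # Sleep between polls, but not after the final one.
        if i + 1 < samples:
            time.sleep(interval_s)
    return busy_count / samples
```

A sampled estimate like this is only as good as the probe and the polling rate, so short bursts of activity between polls would be missed.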