GPU Developer Tools

How to detect crashed GPUs and reset them

I am looking for a solution for a datacenter with AMD GPUs that can detect when cards are frozen or crashed and reset them.

Sometimes I think the card is still functioning but it doesn't output a display, and sometimes it is fully dead, so I would like to be able to detect either case.

Is there something similar to NVIDIA's DCGM?

Or is there a good way to write this manually, for both Linux and Windows?