First things first: I know this used to be a common problem. I spent a ton of time researching different cases, but mine seems different.
I got a Limited Edition Vega 64 card with a used PC I bought several months ago. It behaved really well at the beginning, but then the system would occasionally get stuck at 100% GPU load when playing Witcher 3. Exiting the game or restarting the drivers wouldn't help. Even if nothing else is running and Task Manager shows 0% GPU usage, the card will keep running at 100% and will not stop until the PC is restarted. The problem is that it would also develop color artifacts on the screen when this happens. Then for a little while the problem went away, until it resurfaced a few weeks ago.
For the last 3 weeks or so the card has been extremely unstable. It would hit 100% and develop artifacts almost every time I run CSGO. Then it started crashing and resetting the WattMan settings. Now if you see artifacts, it means the card will probably reset itself in a few seconds and both monitors will turn off for a couple of seconds before turning back on.
The biggest problem for me is that recently it started crashing not only in games, but also in Solidworks and Keyshot, which I have to use for work, and that has become completely unacceptable. Sometimes it would suddenly lock to 100% and start the artifact disco when the only open applications are Chrome or Netflix.
I tried different WattMan settings, including increasing the power limit and using power save. I tried every driver version between August and December. It is not a mining trojan, and it does not seem to be a ReLive issue like in some cases I've seen online. The PSU is pretty new and is definitely using two cables to power the GPU, so I would rule it out too.
I am trying to figure out if there is anything I can do aside from spending a bunch more money on a new, inferior GPU (I got a really good deal on this PC as a whole, but I would definitely not be able to buy another high-end card). It seems like the card is deteriorating, but what could be the issue? Failing HBM? Why does it still lock to 100%? Can the card be repaired? Is it worth anything as is?
Specs:
- Ryzen 7 2700
- Vega 64 LE
- 16GB DDR4 RAM
- 850W Gold EVGA PSU