RX5700/XT Instability Issues related to over temp and no Global Wattman sensor data

Discussion created by osirismelb on Jul 22, 2019
Latest reply on Jul 30, 2019 by pcp_1989

One thing I have noticed about my instability issues is that they are directly related to the temp of the GPU: Global Wattman stops running and doesn't manage the fan speed when the GPU is put under load.

Every now and then Global Wattman will stop working, i.e. it is completely blank, or the fan curve is flat and doesn't go over 20% regardless of the GPU temp.

As a result, when the GPU goes under load the temp rises but isn't managed by increasing fan speed. Eventually the GPU overheats and crashes, resulting in either a black screen or the entire PC shutting down or rebooting.

Last night I noticed again that the overlays for both Global Wattman and MSI Afterburner were showing 0 for temp and GPU load, meaning the sensor data was broken. I also noticed that this often happens when loading a DirectX 12 enabled game. DirectX 11 games/apps such as Unigine Heaven are usually OK, but anything DX12 will stop sensor data from working, resulting in ZERO temp management.

I quickly stopped playing F1 2019 before the GPU overheated and loaded up Afterburner to confirm that sensor data was indeed broken. I then manually bumped the fan speed up to around 80%, which was noisy but allowed me to continue playing for a good hour or so with no crashes because temps were kept at a reasonable level.

I have also noticed that the temp sensor data will sometimes start working again once the offending DX12 app/game is stopped. You can clearly see Wattman stop reading temp and GPU load data and then start again once the app is closed down. Afterburner shows the same in its monitoring, so this is definitely a driver issue. I am pretty sure the instability issues (at least the ones I am having) are directly related to there being no fan management, due to the lack of temp sensor data, and the GPU overheating as a result.

NOTE: I have tested this with just Global Wattman running, and also with MSI Afterburner running alongside it. I've been using the latest Afterburner beta, but the latest public release also shows the same temp data issue.
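
To illustrate the failure mode described above, here is a minimal Python sketch of a temperature-driven fan curve. It is not how Wattman or the AMD driver actually works internally; the curve points, function names and fallback value are all assumptions made up for illustration (the 80% fallback simply mirrors the manual setting mentioned above). It only shows why a sensor reading of 0 would pin a curve like this at its minimum, and how treating 0 as "sensor broken" would avoid that.

```python
from bisect import bisect_right

# Hypothetical fan curve: (temperature in degrees C, fan speed in %).
# These points are invented for the example, not taken from any driver.
FAN_CURVE = [(40, 20), (60, 40), (75, 70), (90, 100)]

# Hypothetical fail-safe speed used when the sensor reading is not plausible.
FAILSAFE_FAN_PERCENT = 80


def fan_percent(temp_c, curve=FAN_CURVE):
    """Linearly interpolate the fan speed for a temperature along the curve."""
    temps = [t for t, _ in curve]
    if temp_c <= temps[0]:
        # At or below the first point the fan stays at the minimum speed.
        return curve[0][1]
    if temp_c >= temps[-1]:
        # At or above the last point the fan stays at the maximum speed.
        return curve[-1][1]
    i = bisect_right(temps, temp_c)
    (t0, f0), (t1, f1) = curve[i - 1], curve[i]
    return f0 + (f1 - f0) * (temp_c - t0) / (t1 - t0)


def fan_percent_with_failsafe(temp_c, curve=FAN_CURVE):
    """Same curve, but a reading of 0 or below is treated as a broken sensor."""
    if temp_c <= 0:
        return FAILSAFE_FAN_PERCENT
    return fan_percent(temp_c, curve)


if __name__ == "__main__":
    for reading in (0, 50, 67.5, 85):
        print(reading, fan_percent(reading), fan_percent_with_failsafe(reading))
```

With a reading of 0 the plain curve stays at its 20% floor no matter how hot the card really is, which matches the flat fan curve seen when the sensor data drops out. The fail-safe variant instead jumps to a fixed high speed whenever the reading is not plausible.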
