RX480: Poorly implemented throttling leads to GPU malfunction

Question asked by empty_knapsack on Nov 8, 2016
I've been developing GPGPU software for the last 8 years (and posted plenty on the old devforums, by the way) and have used or tested dozens of GPUs in that time, but the new RX480 is the only one that has been de facto unusable from the very beginning. With one specific GPU kernel, my RX480 (the Sapphire version with AMD's reference cooler) starts producing incorrect results just 20-30 seconds after the start. The GPU temperature climbs from ~35C to 80C and then fluctuates around 80C. I guess the driver or hardware tries to hold the GPU core at 80C but fails, so there are spikes up to 90C. After 10-20 minutes of such "work" the whole system shuts down. I'm not sure whether the GPU hardware decides to turn everything off or the motherboard itself does.
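
A minimal sketch of how this kind of failure can be detected: run a deterministic workload repeatedly, compare every result against a trusted reference, and record how long it takes until the first mismatch. This is an illustration only, not the code in the executable mentioned in this post. The names `first_mismatch_time`, `run_kernel` and `cpu_reference` are hypothetical, and `run_kernel` is assumed to wrap whatever actually launches the GPU kernel and reads the results back.

```python
import time
from typing import Callable, Optional, Sequence


def first_mismatch_time(
    run_kernel: Callable[[], Sequence[int]],
    reference: Sequence[int],
    max_runs: int,
    clock: Callable[[], float] = time.monotonic,
) -> Optional[float]:
    """Return seconds elapsed until the first wrong result, or None if every run matched."""
    start = clock()
    for _ in range(max_runs):
        result = run_kernel()  # hypothetical: launches the kernel and reads results back
        if list(result) != list(reference):
            return clock() - start
    return None


def cpu_reference() -> list:
    # Stand-in deterministic workload; a real test would compute the
    # same values the GPU kernel is expected to produce.
    return [(i * i) % 97 for i in range(16)]


if __name__ == "__main__":
    reference = cpu_reference()
    # Here the "kernel" is just the CPU reference, so no mismatch is expected.
    elapsed = first_mismatch_time(cpu_reference, reference, max_runs=100)
    if elapsed is None:
        print("no mismatch")
    else:
        print(f"first mismatch after {elapsed:.1f}s")
```

With a real GPU-backed `run_kernel`, logging the elapsed time next to the reported core temperature would show whether the first wrong result coincides with the card reaching its temperature limit.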


I've created a separate executable containing all the kernels needed to reproduce this situation (kernel #2 causes the problems; #1 and #3 are more or less stable). It is available here:

Screenshot as example:


So, my questions are:

1. Is this issue specific to my RX480, or is it common to all Polaris-family GPUs?

2. Is there any way to downclock the GPU core with the current drivers? Back in the 6990 days it was possible to tune the GPU frequency and even the power usage; now I see nothing useful in the ATI control panel.

3. What is the general recommendation for our users here? Should they avoid buying Polaris GPUs?