I'm turning to these message-boards because I've already contacted AMD-support and the response made it clear that they didn't bother to read my detailed message and concerns, as evidenced by their contradictory questions and instructions. - So I'll post my concerns and questions here instead. It's going to be rather long and detailed, but that's to prevent most of the standard follow-up responses and questions that usually come when posting a single out-of-context statement. I'll explain the key points of the scenario instead:
So I've been using a stock/reference AMD Radeon RX 6950 XT for a couple of months now. - I have been using it in stock configuration. That is, no tweaking or overclocking with any software, not even Radeon Settings, because I'm using it on Linux (Pop!_OS) with the drivers that it uses automatically.
It has been working fairly well, although I've been questioning some of the performance, but I wasn't sure if that was just an effect of expectations. - However, in the last few weeks I've been using it to run 'Starfield', and as you've probably heard (or experienced), it's rather demanding. - But what it also did is break some new temperature-records on both this card and the CPU I run with it (the Ryzen 5 5600, also stock, with a good 120mm AIO).
Usually, the GPU would struggle to reach 110 degrees Celsius with the more demanding games, and the CPU would barely break 75 degrees even under full load. - But now, I've had the GPU reach 112 degrees, and the CPU as much as 86 degrees. (I'll include a screenshot, and you can find the GPU-temperatures under "amdgpu" of course, but the CPU is under "k10temp" between the "nvme" data.)
I know the CPU is probably fine, as it probably has about 10 more degrees of headroom. But as it says in the monitoring-software for "junction" (of the GPU), "crit" is 110 and "emergency" is 115. - Now, I'm not sure what that means and whether that's accurate, but I'm assuming that "crit" might mean it starts to throttle and "emergency" might mean it would shut down. - Am I correct in thinking that?
If it DOES mean that, considering my GPU can reach above 110 degrees, does that mean that the GPU is actually throttling at that point? Or does it not do that?
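For anyone who wants to compare the same numbers without a screenshot, here is a minimal Python sketch that reads the "junction" sensor and its "crit" and "emergency" limits from the hwmon files that the monitoring-software shows. It assumes the amdgpu driver exposes them under /sys/class/hwmon as temp*_label, temp*_input, temp*_crit and temp*_emergency, with values in millidegrees Celsius. The function names are made up for illustration. It only labels where a reading sits relative to the limits; it cannot tell whether the card is actually throttling.

```python
from pathlib import Path

def read_millideg(path):
    """Return a hwmon value converted from millidegrees to degrees C, or None if unreadable."""
    try:
        return int(Path(path).read_text().strip()) / 1000.0
    except (OSError, ValueError):
        return None

def classify(temp_c, crit_c, emergency_c):
    """Label a reading relative to the two limits (this does not prove throttling)."""
    if emergency_c is not None and temp_c >= emergency_c:
        return "at or above emergency"
    if crit_c is not None and temp_c >= crit_c:
        return "at or above crit"
    return "below crit"

def junction_report(hwmon_root="/sys/class/hwmon"):
    """Find the amdgpu hwmon directory and report its 'junction' sensor, if present."""
    for hw in sorted(Path(hwmon_root).glob("hwmon*")):
        name_file = hw / "name"
        # Assumption: the GPU's hwmon device is named "amdgpu", as in the sensors output
        if not name_file.is_file() or name_file.read_text().strip() != "amdgpu":
            continue
        for label_file in sorted(hw.glob("temp*_label")):
            if label_file.read_text().strip() != "junction":
                continue
            prefix = label_file.name[: -len("_label")]  # e.g. "temp2"
            temp = read_millideg(hw / f"{prefix}_input")
            crit = read_millideg(hw / f"{prefix}_crit")
            emergency = read_millideg(hw / f"{prefix}_emergency")
            if temp is None:
                continue
            return temp, crit, emergency, classify(temp, crit, emergency)
    return None
```

Calling `junction_report()` returns a tuple of (temperature, crit, emergency, label), or `None` if no matching sensor is found. With the numbers from this post, `classify(112.0, 110.0, 115.0)` gives "at or above crit".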
If that IS the case, should the card be cooled better and what could I do about that?
As for my case-configuration, it's already pretty much ideal. - I have two 140mm intake-fans blowing directly at the videocard (which sits upright), so it gets fresh air constantly. Then there are two similar exhaust-fans in the top, plus one 120mm fan in the back that blows through the CPU's AIO-radiator. - The front, side and top are all perforated, as well as part of the bottom. So it's as airy as can be and there's no CPU-heat that stays in the case, as it goes out through the radiator immediately.
The motherboard is a B450 Mortar Max, now with the latest BIOS, for more context.
- - - - - - - - - -
Then I have some additional concerns and questions when it comes to the GPU:
In terms of power, all the information I have for what my particular card draws at most is 286 Watts with 1.212 Volts, as you can see in the screenshot of the monitoring-software. - I'm simply wondering: Does that look right?
Because I read on AMD's official specification-sheet that the "typical board power" is 335W, though that might not mean the same thing.
Also, they recommend an 850W PSU and I'm running a 750W PSU (ToughPower GF1), though I've read in other places, like TechPowerUp and VideoCardz, that even 700W is enough, with some actual consumers reporting that it works with their 750W PSU as well, so I didn't bother changing it. - I had already looked into this thoroughly multiple times and concluded that it should work. It does run and never shuts down, but I don't know if that means it's enough. - I've also used two separate cables instead of a daisy-chain, just to be sure (and more to prevent overloading anything).
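To put rough numbers on the PSU question, here is a small back-of-the-envelope sketch in Python. The 750W and 335W figures are the ones mentioned above. The CPU and "rest of the system" figures are placeholder assumptions, not measurements, so swap in your own. It only adds up sustained draw and says nothing about short power spikes, which a simple sum like this cannot capture.

```python
def psu_headroom(psu_watts, gpu_watts, cpu_watts, rest_watts):
    """Return (spare watts, fraction of PSU capacity used) for a steady-state estimate."""
    load = gpu_watts + cpu_watts + rest_watts
    return psu_watts - load, load / psu_watts

# 750 W PSU and 335 W "typical board power" come from the post.
# 90 W for the CPU and 75 W for fans, drives, board, etc. are assumed placeholders.
margin, fraction = psu_headroom(750, 335, 90, 75)
print(f"{margin} W spare, {fraction:.0%} of PSU capacity in use")
```

The point of keeping the estimate as a function is that the assumed values are easy to replace with readings from a wall-meter or from the monitoring-software.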
One thing I must note is that, with 'Starfield' now, I've seen "glitches" that resemble artefacting. They are those brightly colored (white, green, magenta, yellow, cyan?) striped squares that occasionally(!) flash on the screen in very demanding areas of the game. - I worry that it's actually "overloading" the GPU and not working properly with it. - Even though I've only used this GPU for a few months now, I've not seen that happen before, and I'm guessing it's due to the heavy load and especially the heat.
Of course this might be very title-specific, as there have been instances of singular titles doing something very specific to hardware. But still, how is anything pushing this card to become critically hot in an ideal environment and its stock configuration (that includes the fan-speed)?
One more peculiar thing I want to mention is about the clock-speeds: I can only find on spec-sheets that it will boost to about 2300MHz. But I've seen my card boost as high as 2800MHz, and consistently at that. - Again, this is not overclocked but stock. - Is it just a good chip and because of that running hotter than expected? Or does it perhaps get the wrong instructions or something like that? - Again, I'm running Linux (Pop!_OS) and whatever drivers it installs automatically. I have not changed anything about the GPU and CPU.
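For checking which core clock the driver reports, a minimal Python sketch follows. It assumes the amdgpu driver lists its core clock levels in a file such as /sys/class/drm/card0/device/pp_dpm_sclk, one level per line, with an asterisk marking the active one. The card index (card0) and the sample text in the usage note are assumptions for illustration, not output from my system.

```python
import re

def parse_dpm_levels(text):
    """Parse lines like '1: 2664Mhz *' into (level, mhz, is_active) tuples."""
    levels = []
    for line in text.splitlines():
        m = re.match(r"\s*(\d+):\s*(\d+)\s*mhz\s*(\*?)", line, re.IGNORECASE)
        if m:
            levels.append((int(m.group(1)), int(m.group(2)), m.group(3) == "*"))
    return levels

def active_clock_mhz(text):
    """Return the clock of the level marked with '*', or None if none is marked."""
    for _, mhz, active in parse_dpm_levels(text):
        if active:
            return mhz
    return None
```

As a usage example, `active_clock_mhz(open("/sys/class/drm/card0/device/pp_dpm_sclk").read())` would return the currently active level on a system where that file exists. On a made-up input such as `"0: 500Mhz\n1: 2800Mhz *\n"` it returns 2800.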
- - - - - - - - - -
There are two things I can think of to do myself:
Perhaps I could try and undervolt it (?) with a program called CoreCtrl (which is an alternative to Radeon Software/Settings for Linux), but I'm having trouble installing it, so I haven't been able to use it so far. - But then I don't know if that would only make it less stable.
I could try to re-paste the 6950XT, which I've done with the RX5700 I used previously and made it run cooler, but I worry about making it worse. - Also, when idle, it's only at about 40 degrees, so that seems OK.
I know AMD doesn't or can't officially recommend these methods, but if there's nothing else, I might have to.
Please let me know what looks OK and what seems off about all of these details, as well as anything I can do.
Hey man, same thing here. I don't want to re-paste it because of the warranty sticker, and honestly I'm not sure that an RMA is gonna solve the issue, since it looks like getting this temp is common. What are you gonna do?
I just water-cooled my reference RX 6950 XT and it had a thermal sheet (NOT PASTE) on the GPU. The warranty stickers are technically illegal in the US, if you live here, but it's up to you. I'd at least try to put some MX-6 or something similar on it, IMO. I bought mine like 3 months ago, so I don't know if maybe things are/were done with paste, but I can send a pic if you'd like. (Oh, and it was a bit off center, to where you could see some of the bare copper that should have been in contact with the die.) It went from 75ish GPU and 110ish hot spot to 50ish GPU and 75ish hot spot after lengthy 1440p ultra/high settings gaming with a 360 radiator custom water loop while pulling 340W.