Over the holiday period, AMD has been embroiled in a bit of controversy. Its all-new flagship GPU, the Radeon RX 7900 XTX, is overheating for some users, which has spurred the company to investigate. With most of the tech media on holiday, though, there hasn’t been much new information. However, overclocking wizard der8auer has dug into the issue using four GPU samples, and thinks he’s uncovered the culprit: the card’s vapor chamber.
As a quick recap, it’s been documented that some 7900 XTX owners are seeing GPU hotspot temperatures of 110C. This is the maximum temperature the card can reach, and it’s resulting in thermal throttling and excessive fan noise. That’s not how it’s supposed to work, as even AMD has acknowledged. In most scenarios, the hotspot should peak at around 90C, and usually it should sit closer to 80C. Clearly, something is wrong. It was originally posited that perhaps there wasn’t sufficient mounting pressure. This could be a factor because the GPU has chiplets and uses direct-die cooling. That doesn’t seem to be the case, however, as der8auer’s testing shows.
His latest video is a follow-up to his original testing, where he couldn’t reproduce the issue. For that, he used a single GPU he had on hand. To dig deeper, he purchased four cards from followers who were experiencing the issue. All four GPUs had been used differently, such as horizontal versus vertical mount, in a case versus open-air test bench, etc. The first big discovery came when he tested the cards vertically mounted. None of them were able to reach the 110C hotspot that’s been in the news. However, when he switched to a horizontal mount, two cards hit 110C in just minutes. He also noted the cards’ fans were spinning up to 1,000rpm faster in the horizontal mount.
So what is it about the horizontal mount that’s part of the problem? The first theory was that, since the cooler is being pulled down by gravity in this orientation, a minuscule gap opens between the cooler and the GPU. To test this, he laser cut an acrylic stand that would hold the cooler in place during testing. This eliminates any sag the GPU might have been experiencing. Interestingly, the acrylic stand had almost no effect on temperatures. The same two GPUs still hit 110C, so sag was ruled out as a factor. He also ruled out the plate that sits between the PCB and the cooler. Finally, he used longer screws to increase mounting pressure, but to no avail.
Finally, he decided to flip the GPU from vertical to horizontal while it was running. Lo and behold, it went from behaving normally to 110C hotspot in just three minutes. What’s even more interesting is he then flipped it back to vertical and the temperatures remained high.
This leaves just one possible thing to blame: the vapor chamber. This is essentially a heat pipe, where liquid vaporizes when it gets hot, then condenses back into liquid when it’s further away from the heat source. This cycle repeats endlessly when the card is being used.
He theorizes that the pressure inside the chamber could be incorrect, or that it could contain an incorrect volume of liquid. Either that or something inside prevents the liquid from traveling back to the heat source. Either way, in der8auer’s eyes, it is the vapor chamber, as every other variable has been eliminated. This means it was manufactured incorrectly, so there’s no way to fix it with a firmware or driver update.
Since it has now been acknowledged publicly here: https://youtu.be/X87OzJ3bU7o?t=165 , is there a tool to check whether a given serial number (SN) is affected?
If not, what is the recommended route to get support through AMD's website?