All,
I did some testing based on other reports. Roughly a 10-minute sample using AMD Adrenalin and Heaven on a single monitor.
Computer Type: Desktop |
GPU: AMD 7900xtx |
CPU: Ryzen 7900x 12 core 24 threads with 360 AIO |
Motherboard: ASUS Tuf Gaming X670E-Plus Wifi |
BIOS Version: 0821 |
RAM: G.SKILL Trident Z5 Neo RGB Series AMD EXPO 32GB (2 x 16GB) 288-Pin PC RAM DDR5 6000 (PC5 48000) Desktop Memory Model F5-6000J3038F16GX2-TZ5NR |
PSU: Super Flower Leadex Platinum SE 1000W 80+ Platinum (2 8-pin cables running to GPU from separate connectors) |
Case: O11 EVO Dynamic 6 120mm Intake, 3 120mm exhaust (AIO), 1 Rear 120mm exhaust. |
Operating System & Version: Windows 11 Pro 22H2 |
GPU Drivers: Up to date 22.12.1 Released 12/8/2022 |
Chipset Drivers: amd_chipset_software_4.11.15.342 |
Background Applications: DISCORD, CHROME |
Benchmark Applications: Heaven |
Testing Monitor: 1080P 60HZ 24 inch Samsung connected over HDMI |
Samples taken at 2-second intervals using AMD Adrenalin software for ~10 minutes. First 5 samples and samples over 300 discarded.
Data and graphs: https://docs.google.com/spreadsheets/d/1XqOova-91rBiPvIFQx4e9pqCrW2hts0rdpXAd4Qj_ig/edit?usp=sharing
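For anyone who wants to repeat the trimming described above on their own log, here is a minimal Python sketch. It only assumes you already have the readings as a plain list of numbers in sample order; the function names and the `summarize` fields are made up for illustration and do not reflect the Adrenalin log format.

```python
from statistics import mean

# Sampling interval stated in the post above.
SAMPLE_INTERVAL_S = 2


def trim_samples(samples, skip_first=5, last_kept=300):
    """Drop the first `skip_first` readings and everything after sample `last_kept`.

    Samples are counted from 1, so the defaults keep samples 6 through 300.
    """
    return samples[skip_first:last_kept]


def summarize(temps_c):
    """Basic statistics for a list of temperature readings in °C (hypothetical helper)."""
    return {
        "count": len(temps_c),
        "duration_s": len(temps_c) * SAMPLE_INTERVAL_S,
        "min": min(temps_c),
        "max": max(temps_c),
        "mean": mean(temps_c),
    }
```

With the defaults, a log of 300 or more readings is cut down to 295, which at one reading every 2 seconds covers just under 10 minutes.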
We have an existing discussion on this topic; please use that to track future updates.
Thanks for this, that last picture should be first, so people understand what they're looking at when they're looking at it.
Good info though
Done, and re-organized.
Thanks for making this. What a disaster of a situation.
It's called the bleeding edge for a very good reason.
Sometimes you get cut.
The alternative is worse: not getting one at all until months later, like those of us who decided to hold out and see how things played out on other hardware like the 4090 and ended up with nothing at all.
I just can’t get why this is happening? Something wrong with vapor chamber or cooler?
Well, the odd mounting position fixing the issue implies the weight of the card's shroud is what's changing the heat profile.
Funnily enough, the way the best temps are recorded are the same layout you'd see for a testbench, mobo flat.
Which *could* easily lead someone to conclude these were not adequately tested in normal pc layouts
One way to find out, if someone were brave enough to do so, would be to *lightly* pinch the back side of the GPU near the GPU chip near the PCIe slot with your fingers while it's running a benchmark.
If the mere weight of the shroud were to be enough to change the heat profile, then the pressure of you *lightly* (again can't stress lightly enough) pinching the area where the weight would be going should be more than enough to prove or disprove that
My thought was that turning it upside down would have a similar impact.
I have it mounted at 90 degrees, which would be like putting your case on its front panel with the motherboard IO facing the ceiling, and I see the same issue that way. The only way I've seen that avoids the issue while the PC can still be used is with the mobo horizontal in a testbench position like this
So does this mean the card will respond well to a custom water block?
My theory is that it's most definitely a vapor chamber issue.
The vapor chamber is running into "dry-out", where fluid isn't reaching the hottest points of the GPU, and that causes the increase in junction temperature and in the delta between it and the average. Evidence for this is the consistent heat soak, and the hotspot keeps rising over too consistent a time period. You can see this in the first graph: once it reaches a certain point the dry-out happens, and the vapor chamber no longer works effectively to dissipate heat from the hotter points on the GPU die.
This issue looks to be very common (it's all over Reddit), and repasting and refitting don't seem to help. As for why the orientation helps, it probably eases the work of the vapor chamber.
Makes sense, gonna see about an RMA then.
If you feel like testing it, you can try an orientation like this.
Where the front part of the case is lifted by something, so the GPU ends up at around a 30-degree incline with the CPU core being lower. It's more of a personal curiosity. No idea if it will amount to different results, but if it does it could point to say gravity not being the issue. Maybe the vapor chamber would work way better too.
That's quite some extensive testing, thanks for taking the time to share the results. Fortunately, my reference XTX seems to be working as intended, and junction temperature as well as fan speed are at the level they should be for a "normal" card.
That is a good data point. Is there a serial number or manufacture date? Perhaps we can track it down to a production run.
That's extremely interesting.
If you could, try running something in UHD using the scaling features in Adrenalin and see if it does it then. It only takes about 30 seconds running Civilization VI for it to reach 110°C for me, so you'll know right away.
If you have one that actually doesn't have the issue that's pretty much a confirmed manufacturing issue imo
I have been playing a couple of games since I got the card, I run 3440x1440, pretty much most games maxed out. Just for the sake of it I even ran Cyberpunk 2077 with full RT, while switching FSR to off, even though the framerate is often just below 30 fps, the temperature of my card typically is around 60°C GPU and ~77°C junction. Fans are spinning at around 1750 rpm.
I also did a FurMark test run just a few minutes ago: 3440x1440, 4xMSAA. After over 15 minutes of runtime, the temperatures settled at 61°C/71°C, fans spinning at over 1800 rpm.
Temperatures and fan speeds on my card are fine, at least for what I would expect of the reference card. The only issue I have is that it doesn't like any sort of undervolting; even 1140mV is not stable and causes the driver to crash sooner or later. Really disappointing.
Edit: By the way, my system is built in a Corsair 680X case, with the front glass panel taken off for slightly better ventilation, and all fan slots are equipped with fans. So I am running a case with fairly good airflow, especially for GPUs.
Personally I don't think it is a general issue with the cooler per se, otherwise virtually every card would be affected. But that's not the case. I am also active in a German hardware forum, and there are a few other reference XTX owners there who report no obvious temperature issues with their cards. There have been two reports there so far about the 110°C hot spot temps; one user was able to dramatically improve the temps of his card by tightening the screws of the cooler.
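To put the numbers above next to the 110°C hot spot reports, the gap between GPU and junction temperature is a handy figure, since the dry-out theory earlier in the thread is about exactly that delta. A rough Python sketch, using only readings quoted in this post (the Cyberpunk junction value was given as approximate); the function and constant names are just illustrative:

```python
# Readings quoted above for one reference card: (label, GPU temp, junction temp) in °C.
READINGS = [
    ("Cyberpunk 2077, full RT", 60, 77),  # junction reported as ~77°C
    ("FurMark, 15+ minutes", 61, 71),
]

# Hot spot temperature reported for affected cards in this thread.
REPORTED_HOTSPOT_C = 110


def junction_delta(gpu_c, junction_c):
    """Gap between junction (hot spot) and GPU temperature, in °C."""
    return junction_c - gpu_c


def headroom(junction_c, limit_c=REPORTED_HOTSPOT_C):
    """How far a junction reading sits below the reported hot spot figure, in °C."""
    return limit_c - junction_c


for label, gpu_c, junction_c in READINGS:
    print(
        f"{label}: delta {junction_delta(gpu_c, junction_c)}°C, "
        f"{headroom(junction_c)}°C below {REPORTED_HOTSPOT_C}°C"
    )
```

That works out to deltas of 17°C and 10°C on this card, with junction sitting 33°C and 39°C below the 110°C figure.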
I'm putting this anywhere I see discussion on the 7900xtx overheating issues:
This person did have this post under the "RMA Refused by AMD 7900xtx junction temps" reddit post, but it appears he either retracted or removed it, or someone else did (you know how reddit mods can be)
https://www.reddit.com/r/PowerColor/comments/zrzcfc/7900xt_7900xtx_junction_temperatures/?utm_source...
One caveat: I have no idea of the legitimacy of this person or their relation to the company(s) they claim relation to, but several people I have known to be usually correct have recommended that I pursue this information.
I contacted him and gave him the info. At least somebody seems to be trying to reach out.
Reddit and mods deleted my post.
I have no idea why, I’ve asked for guidance on what I could do to re-post because I think it was valuable content.
It's sad that mods on Reddit are deleting such information. According to some hardware news services, AMD has already started investigating the issue, and the shared screenshot above from PowerColorSteven seems to further support this by stating AMD is asking their partners for lists of defective cards.
Of course it is always hard to get a public idea of how widespread the issue really is, since people are more inclined to post and comment about a card running too hot than to say their card works just fine. But my impression is that the issues are indeed more widespread than just a tiny number of unfortunate cases. So the more data is gathered, even on AMD fan subreddits like /r/AMD or /r/Radeon, the more helpful it is going to be, and I don't know why mods would delete it.
For AMD it is important not only to figure out what exactly causes the issue, but also to provide a good RMA solution, especially for those who have to deal with AMD's own shop, which in Europe is handled by Digital River, who pretty much refuse to see these as warranty cases and hardware defects. Once larger media outlets and YouTubers start reporting this on a bigger scale, it will be a PR problem for AMD, especially after they joked about the burning adapter issues from another GPU company...