I have paired a new Ryzen 9 5950X with an Asus Crosshair VIII Hero, but I am suffering really high temperatures. At idle the processor is running at 75C. Under very light load it goes up to 80C.
I have tried a lot of things, and just for clarity I am very experienced in building machines (over 25 years).
I used a Corsair 100i Platinum cooler, and after several attempts to cool it (4 reseats using 3 different CPU thermal pastes) I initially came to the conclusion that the cooler was at fault.
Wanting to use the machine, I bit the bullet and bought a replacement cooler, a Kraken X73. I installed it and the net result was 1C cooler, i.e. 74C. So not the cooler but something else!
I did some searching online and discovered that others have been having a similar issue across multiple motherboards. I found some suggestions to help. The first was to switch my board into an eco mode (which did nothing); the next was to disable boosting (which kind of cripples the chip). This I did, and I now see temperatures in the region of 45C. I don't know where the problem lies exactly, but if boards from several manufacturers are seeing it, then it kind of points to an AMD issue with the chip or with something they have supplied to the board manufacturers. After days of building and then re-applying the coolers I do feel a little cheated. I tried raising it with ASUS, but because I registered my motherboard for cash back it says the serial number is already registered, so I cannot raise a support case. I am coming on here hoping that someone can give some advice and that maybe someone from AMD can help.
"If any of you are familiar with DC circuits you should know it's the current that introduces heat and there is no way to control the current (A) under a light load. Only max all core limits."
By setting the TDC you set the absolute limit of sustained amperage that the VRMs will supply the CPU. It doesn't matter if it is a lightly threaded or heavily threaded workload, the VRMs won't go above that. So if your cooling solution and VRMs can handle that amperage you are fine.
I suspect the issue arises with PBO simply turned on to motherboard limits. Some of the X570 motherboards have crazy limits that probably allow some pretty big spikes in amperage when a load is introduced, before rapidly scaling back. It may be worthwhile to set the PPT/TDC/EDC systematically, step by step, until you reach a point where the errors/instability return, or you reach a temperature/voltage you like.
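One way to picture the incremental approach is as a simple loop. This is only a sketch under assumptions: `Limits`, `raise_until_unacceptable` and the `is_acceptable` callback are made-up names, the callback stands in for whatever manual check you do after each BIOS change (stress test plus a look at temps), and nothing here talks to real hardware.

```python
from typing import Callable, NamedTuple, Optional


class Limits(NamedTuple):
    ppt_w: int  # package power limit in watts
    tdc_a: int  # sustained current limit in amps
    edc_a: int  # peak current limit in amps


def raise_until_unacceptable(
    start: Limits,
    step: Limits,
    ceiling: Limits,
    is_acceptable: Callable[[Limits], bool],
) -> Optional[Limits]:
    """Raise all three limits step by step and return the last set that passed.

    `is_acceptable` is a hypothetical stand-in for a manual test run:
    it should return False once errors, instability or temps you
    dislike show up. Returns None if even the starting point fails.
    """
    best = None
    current = start
    # Stop as soon as any limit would exceed the ceiling you chose.
    while all(c <= m for c, m in zip(current, ceiling)):
        if not is_acceptable(current):
            break
        best = current
        current = Limits(*(c + s for c, s in zip(current, step)))
    return best


# Placeholder numbers only; the lambda pretends temps get too high above 215 W.
chosen = raise_until_unacceptable(
    start=Limits(200, 125, 145),
    step=Limits(5, 5, 5),
    ceiling=Limits(230, 160, 180),
    is_acceptable=lambda lim: lim.ppt_w <= 215,
)
print(chosen)
```

In practice each step is a reboot into the BIOS and a stress run, so the loop is really just a way to keep notes on which combination was the last good one.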
5950x is very different compared to 3700x in that regard. Even with my 5800x O.C. I was never going over 76°C!
With 5950x stock, I throttle at 90°C in games...
This is true, although it doesn't explain why many of us here, including myself, are experiencing issues on stock settings with PBO disabled, or why many people, including myself, are not able to reach the rated boost speeds and are somehow operating at a much higher temperature than those seen in the reviews. The current limits are only there to limit the entire package, or the entire package minus SOC.
This is not a case of people leaving PBO on and at motherboard limits, as I have verified the exact same behaviour with the CPU at stock settings (confirmed by Ryzen Master showing "OC Mode" as "Default" on launch without any further configuration required, meaning the BIOS settings are forcing "stock" behaviour).
There is something wrong with the boosting behaviour of this CPU. You can confirm this easily yourself by going for a manual OC. If I can get the same frequency at a much lower temperature it will clearly show that there is something wrong with the boosting algorithm right?
Well.... the proof is right here:
Notice how my system is pulling far less current here? This goes back to what I mentioned earlier.
Also just to re-confirm. I am running a custom EK loop with 2x 360mm Rads. Yup... custom watercooling and I have issues too.
Here is proof I don't get the rated boost speeds on stock: https://imgur.com/8J9Yrqa
Notice how my single core load temps are extremely high as well?
Here is how my system sits after just 5 mins of gaming: https://imgur.com/L8e1ytL
(Still haven't hit anywhere close to 4.9GHz boost on effective clocks) Click here for Full size img: https://i.imgur.com/L8e1ytL.jpg
Remember my PBO results from earlier? well compare them against this video: https://youtu.be/EtrCJn6Fr3o?t=693
(Should be linked to the R20 run he does, settings are at the beginning of the video) and my Bios settings are very similar to his, he also only has 1x 360mm rad keep in mind.
And there we are.. all the proof we need that something is not right with these CPUs on some motherboards. Keep in mind that some people have swapped mobos and seen a dramatic improvement. There was also a user here who swapped CPUs with a friend: his 5950x with "problems" was fine on another mobo, and his friend's 5950x, which was "fine", experienced the same issues we are all having here on his mobo.
I understand people will instantly say, oh you have bios settings wrong and I would normally agree. However after spending hours upon hours trying to solve this myself and confirming I still experience this on stock behaviour with PBO disabled and don't even get the rated speeds I can confirm this is not a case of incorrect settings but a case of incorrect boosting behaviour experienced in certain conditions. The common factor here seems to be ASUS motherboards.
Again we need to somehow alert AMD / ASUS of this.
" If I can get the same frequency at a much lower temperature it will clearly show that there is something wrong with the boosting algorithm right?"
Not true. You can get clock stretching at a fixed voltage that isn't actually indicative of higher performance. The boosting algorithm will try to supply the voltage/amperage necessary to do any workload at the rated clock speed (AVX included). Can you complete OCCT small batch/extreme/constant with the manual setting? What clock speeds do you get?
But yes, if the processor runs hot at stock settings, it is likely either an issue with the installation of the block or the IHS of the CPU is damaged and not pulling heat from the cores efficiently. In the latter case, only an RMA would solve that.
There is no issue with the installation. The IHS I can't confirm, but surely damage is highly unlikely given that so many of us have raised this issue. The boost behaviours are at fault and the common link is ASUS motherboards. You can find countless threads all over the internet, mostly all with ASUS motherboards. There are people in this very thread, @Ero_Sennin for example, who have mostly resolved their issues by moving away from an ASUS motherboard. On manual I have clearly demonstrated there is a massive 30% change in current draw from manual to boost settings.
Why does the CPU not pull 200A in manual?
(1.2v at 140A results in a stable 4.4GHz all core; by this I mean I can run day to day and not run into any issues. Why even with a negative curve offset does the CPU pull 200A at 1.34v when using PBO? That is 168W vs 268W; a whole 100W of extra power is drawn by the boosting algorithm to achieve the same clock speeds. This is massive.) In my eyes this obviously has nothing to do with the mount or the IHS.
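For anyone who wants to check those figures, here is a minimal sketch of the arithmetic. It assumes power is simply reported core voltage times reported current (P = V × I), which is an approximation of what the monitoring tools show; the function name is made up for illustration.

```python
def power_w(voltage_v: float, current_a: float) -> float:
    """Approximate power in watts as voltage times current."""
    return voltage_v * current_a


manual_w = power_w(1.20, 140)  # manual OC: 1.2 V at 140 A
pbo_w = power_w(1.34, 200)     # PBO: 1.34 V at 200 A
extra_w = pbo_w - manual_w

# Drop in current when going from PBO (200 A) to manual (140 A).
current_drop = 1 - 140 / 200

print(f"manual: {manual_w:.0f} W")
print(f"PBO:    {pbo_w:.0f} W")
print(f"extra:  {extra_w:.0f} W")
print(f"current drop: {current_drop:.0%}")
```

So the quoted 168W, 268W and 100W figures are consistent with the voltages and currents given, and 140A is 30% less than 200A.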
Why am I able to find people quite easily getting great results on every mobo other than ASUS?
Surely it can't be this much of a coincidence. There are 4 threads on this forum I've found so far, at least 1 on ASUS' forum, and pages upon pages of people raising their concerns on various forums if you type "5950x Temperature" into Google. Wherever the motherboard is stated, it's usually ASUS.
According to this I have a mid-entry level cooler, not a full custom loop from EK. If this is the case I could have saved myself quite a lot of money.
What Mobo do you have and what temps and frequencies are you experiencing under a single / multi core load on R20?
Disregard this, I can see you have specified this while I was writing the reply
It seems that all the motherboards were, in the first place, too early to properly handle the new Ryzen 5000 series CPUs. I was also among the first users to get the 5950X after its release; I got it in November 2020. But yes, the motherboard definitely had a problem. And now that I mention it, I want to say that after almost 3 months Asus accepted that the motherboard was defective.
And after more than a month of waiting, the shop I bought the Asus motherboard from will refund me in full, because I refused a replacement since I had already bought the MSI motherboard. So maybe some of you who faced issues like mine could do the same.
Now I am very positive that if I had used a custom loop, much better than my 160€ CoolerMaster AIO, I could push the CPU more and get greater results. But for the mid tower case and the AIO I use, what I found stable are my current BIOS settings, which give me a max temperature of 80C under stress and about 4600-4625MHz in multi core boosts. Single thread, though, reached 5GHz to 5.1GHz.
I can share my settings with anyone who needs them. As I mentioned before, I used DRAM Calculator to reach the best timings and voltages for my 3600MHz CL16 RAM, and then manual PBO limits and negative per-core Curve Optimizer values.
If I use stock settings with PBO enabled, the CPU will give a higher score in PassMark, but it will also reach 90C and then throttle for safety. That may lead to spikes while gaming etc., so I preferred to avoid it and use the settings I mentioned, after a lot of testing and reading articles.
I have had mine since Nov 2020 as well; I purchased on release day.
I have also found something else in the past. This is what happens if I apply a -0.1v offset to the core:
Why is one of the CCDs acting differently from the other? This is just a whole-CPU voltage offset. This further confirms possible BIOS issues in my eyes.
I can't say anything 100% accurate regarding this because I don't use Ryzen Master. I did once but I didn't really like it. If you ask me, the best thing is to run manual tests yourself, with patience. Yeah, I know it takes time and it's a pain in the... but it seems this is what really worked for me. Ryzen Tuner couldn't help me either.
So I first had to find the best settings for my RAM and make it stable, then find the best cores of my CPU and try playing with Curve Optimizer per core, which I did. After that I ran the OCCT test 2-3 times for both CPU and RAM, and the 1 hour test passed.
Until now I don't have any issues so I have to see in time if it continues like this.
I only use Ryzen Master for quick changes before committing to changes in the BIOS, and for monitoring, as the temps and clocks are the most accurate without needing many different apps open. I don't use it for tuning (for obvious reasons).
Yes, I understand. But since you don't have to change any voltages, if you change the curve by an offset of just 5 at every step, you don't have to worry about damaging the CPU.
I know I don't have to. I am just demonstrating there are clear issues with how ASUS boards handle voltages and current. As I've said from the beginning, we need to get in touch with ASUS because it is simply not correct and I have shown it clearly... also, many people experiencing these issues have ASUS boards.
"I have also found something else in the past. This is what happens if I apply a -0.1v offset to the core:"
What is interesting is the amount of TDC/PPT your processor is actually using: about 195.5W and 142A in this image. That is close to where I set my processor, 215W/140A/160A. Beyond that, I found the temperature went up rapidly with really limited gains. My 5950X is at 70C at 215W/140A/160A, not the over-80C that you are seeing here, but that is likely due to the cooler difference. I have a full custom EK monoblock that also cools the VRMs. I actually had over 90% of the performance at 200W/125A/145A and my temps were in the low 60s.
I don't think there is any reason to let the CPU go over 200A on TDC or EDC. The cooling required just isn't there unless going sub ambient.
I just finished another build with a 5950X - nearly the same temps as my 5900X.
First thing I would check is that PBO is set to Auto (which means disabled).
And that every fancy Mobo crap like Game Boost / Core Boost or F*myTempsUpBoost setting (whatever that is called on ASUS Boards) is off.
Then check the temps with the real stock settings... an EDC of 200A like in one of those screenshots is definitely not default/stock.
"why even with a negative curve offset does the cpu pull 200A at 1.34v when using PBO? That is 168W vs 268W a whole 100W of power extra is drawn to achieve the same clock speeds by the boosting algorithm, this is massive."
Again, clock speeds by themselves aren't really indicative of anything. AVX workloads require vastly more amperage to execute at identical clocks vs other instruction sets. How does that clock speed hold up in manual in OCCT?
But as I recommended, you could take an incremental approach with the PPT/TDC/EDC and set your PBO values that way. That is what I did with my ASUS motherboard and I have had excellent results with the Ryzen 9 5950X.
I am 100% aware of this; however, I am testing the same workload here, so it is completely needless for the boosting algorithm to pull an additional 100W for the same work in this case.
"to pull an additional 100W for the same work in this case."
I think that is the way it has always worked? At least since Matisse it has, at any rate. The algorithm doesn't factor in instruction sets, only threading level. So if you get a 16 thread workload, the processor will pull enough current assuming the most challenging instruction set. That can make it seem as though the processor is pulling too much vs manual for no gain in performance. But run something like OCCT small data set or another heavy AVX workload and the manual settings just don't have the juice and fall away in performance.
Well, you could be on to something there. Not sure why ASUS would set the EDC limit lower than the TDC; that doesn't really make any sense. The EDC is the max short term boost amperage the system will allow, and TDC is the max sustained. The board is telling the AMD algorithm that spikes of 200A are all it can handle, but that it can do a sustained load at 255A?
I would make sure that the EDC is always higher than or equal to the TDC. Since the board has locked EDC at 200, bring TDC down to the same amount or below. I would just try 215/140/160 as I have set and see how your boost does with that.
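The rule of thumb above (peak limit should never sit below the sustained limit) can be written down as a tiny sanity check. This is just a sketch with a made-up function name; it only compares two numbers you type in and does not read anything from the board.

```python
def check_current_limits(tdc_a: int, edc_a: int) -> list:
    """Return a list of warnings for a TDC/EDC pair (empty if it looks sane).

    TDC is the sustained current limit and EDC the short-term peak limit,
    so a peak limit below the sustained one is contradictory.
    """
    warnings = []
    if edc_a < tdc_a:
        warnings.append(
            f"EDC ({edc_a} A) is lower than TDC ({tdc_a} A): "
            "lower TDC to EDC or below"
        )
    return warnings


# The board limits discussed above: sustained 255 A, peak locked at 200 A.
print(check_current_limits(tdc_a=255, edc_a=200))
# The manual values suggested above: 140 A sustained, 160 A peak.
print(check_current_limits(tdc_a=140, edc_a=160))
```

The first call flags the 255A/200A combination; the second returns an empty list for 140A/160A.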
I have been running 215/140/160 since you mentioned it yesterday to see what the results are. Unfortunately the temps are similarly high, although ever so slightly lower than before, and performance has taken a slight hit:
Single core I still don't reach 4.9GHz and temps are still ridiculous considering only a fraction of the cpu is being utilised and it still manages to reach the same temp as an all core load?
Thanks for the images.
I would set the scalar to 1X if it isn't already.
Also, under Curve Optimizer I would try to set Core 1 and Core 7 on CCD0 to -5, and all other cores on CCD0 to -10 (the numbering is often different in UEFI). Then set CCD1 to a flat -20. See if that helps at all.
I wonder if these ASUS boards are damaging themselves. Since they set the EDC to 200A, that would imply that only a short term spike at 200A is acceptable for the VRM design, but a sustained load up to 255A is fine? One of those limits clearly isn't correct. I may have avoided that issue on my ASUS board as I set the limits manually. I am also running X470, and these strange limits may be specific to 500 series boards. Additionally, my CPU block also cools the VRMs, so it is hard to say how universal the issue is.
Pretty sure the best cores on both CCXs are -12, the 2nd best are -20 and the rest are -25. Can't confirm quite yet but it's thereabouts. The scalar will be set to auto at the moment though; I have tried 1x in the past with seemingly no benefit.
Will confirm values later, try the 1x scalar again and get back to you.
Ok, so previous values were -12 on the best core per CCD, -20 on the next best and -30 on all other cores.
Now I have -5 on Cores 1 and 7 on CCD0 / -10 on Cores 2 to 6 and 8 on CCD0 / -20 on all cores on CCD1 (9 to 16)
and the scalar on 1x as you suggested.
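Written out as a per-core table, the new Curve Optimizer layout looks like this. It is only a note-keeping sketch: the function name is made up, cores are numbered 1 to 16 as in the post above, and the numbering in your UEFI may differ.

```python
def curve_offsets() -> dict:
    """Map core number (1-16) to its negative Curve Optimizer offset."""
    offsets = {}
    for core in range(1, 17):
        if core in (1, 7):
            offsets[core] = -5    # cores 1 and 7 on CCD0
        elif core <= 8:
            offsets[core] = -10   # remaining CCD0 cores (2 to 6 and 8)
        else:
            offsets[core] = -20   # all CCD1 cores (9 to 16)
    return offsets


for core, offset in curve_offsets().items():
    ccd = 0 if core <= 8 else 1
    print(f"CCD{ccd} core {core:2d}: {offset}")
```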
Still unfortunately seems roughly the same as the previous numbers.
Single core (somewhat better temps but still no 4.9GHz):
I've spotted something else odd.....
If the EDC is peak current and TDC is sustained current how is the EDC at 100% and TDC at 93-94% given the lower limit of 140 on TDC?
Damned! I was playing fine, never going over 85°C, totally stock, and all of a sudden: crash and reboot...
Error 19 Bus/Interconnect Error and 18 WHEA: Cache Hierarchy Error
Any idea on what can cause that?
Why did I move to AMD, why?????
No change on that side, still running RAM at 3800 (in place of 4400) 18 18 18 45 - 1.4V with FCLK at 1900. I have not touched the curve yet, since after a few hours in use all cores in CCD1 reach 5025MHz vs 4825 for CCD2.
Note that when gaming (BF5), the frequency stays at 4600MHz for all cores with the temp now always lower than 80°C (only a very brief jump to 85°C when in menus between levels...)
I think you need to find stable settings for your RAM with FCLK 1800. Maybe this is happening because of the FCLK. Maybe you can try the 4000MHz Ryzen DRAM Calculator settings but with FCLK 1800. The cache error you mentioned makes me think of that.
My suggestion is you guys just start using CTR 2.0/2.1 and let it do its magic.
It really worked wonders for me. My idle temps went down to the high 20s / low 30s from 50+.
In game my temps are between 50 and 65 depending on the game.
And if you think it will cut your performance, just try it. The only thing you have to invest is a little time.
I really recommend it.
You can try the timings I posted at 3600/1800 1.35V. Honestly 3600/1800 CL16 will give similar performance to 3800/1900 CL18 and be easier on the controller.
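As a rough check on that comparison, here is a small sketch of the CAS latency arithmetic. It uses the standard first-word formula (CL × 2000 / data rate in MT/s) and the 1:1 FCLK being half the data rate; it ignores every other timing, so treat it as a ballpark only. The function names are made up.

```python
def cas_latency_ns(data_rate_mts: int, cl: int) -> float:
    """First-word CAS latency in nanoseconds for DDR memory."""
    return cl * 2000 / data_rate_mts


def matched_fclk_mhz(data_rate_mts: int) -> int:
    """FCLK that runs 1:1 with the memory clock (half the data rate)."""
    return data_rate_mts // 2


for rate, cl in ((3600, 16), (3800, 18)):
    print(
        f"DDR4-{rate} CL{cl}: {cas_latency_ns(rate, cl):.2f} ns, "
        f"FCLK {matched_fclk_mhz(rate)} MHz"
    )
```

By this measure 3600 CL16 actually has slightly lower CAS latency than 3800 CL18, while 3800 has the higher fabric clock, which fits the "similar performance" estimate.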
Thanks! That will be my next try if it crashes again. So far, since I changed the PSU idle setting to typical, no more problems, but it might be too recent to jump to a conclusion...
It just means the EDC is occasionally maxing out. The EDC shouldn't bottleneck performance during a sustained workload however. It is strange that TDC and PPT are not maxed, but your temps and voltage are within tolerance as well. Seems like the boosting is being cut off prematurely.
I added my run in below.
You can see my results are similar. My clock speeds are a bit higher, likely due to the fact that I have +100 to the core under auto overclocking in UEFI. My temps are also a bit lower despite using the full 140A, but they aren't really that much different. Try turning on +100 under the precision boost overdrive settings with the 1X scalar. I am curious if your system will continue to limit itself in some way.
Now, your single core results are harder to explain. Again, I would enable the +100 and see if it will boost higher. Below are my results.
You observe much less power draw and current, with a higher voltage, which is exactly what I see. What is strange is that your temps on single threaded are pretty high. You can see I am only 4C cooler than your setup in multicore, but 9C cooler in single with identical settings. Not sure what to make of that. Your cooler seems to be doing well enough under multicore; which specific block do you have again?
I've been trying to work out this strange behaviour for months, exact block is the:
"EK-Quantum Velocity D-RGB - AMD Nickel + Acetal"
Full specs and images here:
I also think you maybe didn't understand the odd behaviours I pointed out.
EDC is peak current
TDC is Sustained current
My benchmarks on all core will run with a constant 100% EDC and constant 93-94% TDC
So 100% of 160A = 160A
and 93% of 140A = 130A
So it's telling me the CPU is simultaneously pulling both 130A and 160A. There are no fluctuations.
Similar to yours: it is telling you that you are pulling both 140A and 160A at the same time. This makes no sense.
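The conversion behind those numbers, as a minimal sketch. It assumes the percentages shown by the monitoring tool are simply a share of the configured limit; the function name is made up, and this only reproduces the arithmetic, it does not explain why the two readings disagree.

```python
def amps_from_percent(limit_a: float, percent: float) -> float:
    """Convert a 'percent of limit' reading into amps."""
    return limit_a * percent / 100


edc_reading_a = amps_from_percent(160, 100)  # EDC limit 160 A at 100%
tdc_reading_a = amps_from_percent(140, 93)   # TDC limit 140 A at 93%

print(f"EDC reading: {edc_reading_a:.0f} A")
print(f"TDC reading: {tdc_reading_a:.0f} A")
print(f"gap: {edc_reading_a - tdc_reading_a:.0f} A")
```

That leaves a gap of roughly 30A between the two readings during the same constant load, which is the oddity being pointed out.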
Also with your +100MHz recommendation: