

NoxAstrum
Journeyman III

Voltage and temperature spikes with ASUS X670E and 7950X3D

This post is intended both for AMD employees to see (if that's possible) and to warn end users who may not be aware.

Ever since the ASUS/AMD SOC voltage issue that was causing CPUs to explode came to light, I've been monitoring my voltages like a hawk with HWiNFO. I have set alerts for SOC and VDD voltages, as well as various temperatures.

The problem is that I sometimes see voltage and temperature spikes. The three values involved are VDD and SOC voltages and CPU Die (Average). Here's what happens and some observations I've made:

  • All three values register a short-lived spike at the exact same time (perhaps a single sample? Certainly a second or less in duration).
  • The SOC voltage can reach anywhere from 1.5 V to over 2 V!
  • The average die temp has read as high as 110 degrees Celsius, but no other temperature spikes at the same time. Not a single core or CCD does; they all remain normal (around 40 degrees at idle). I wonder if there's a single sensor somewhere outside the die that's contributing to the high average.
  • It seems that a lot of the time, these spikes occur when I click something after a period of inactivity. For example, I get up to go to the bathroom, and when I come back and move the mouse or click on something, it can happen. This isn't guaranteed, though; sometimes it happens while I'm using the machine continuously.
  • This is fairly infrequent (once a day or less), or at least has been until recently.
  • I have tried turning off DOCP, PBO and CPB in the BIOS. At first, I thought it stopped the problem, as it didn't happen for a couple of days. I didn't intend to keep them turned off (they're features I paid good money for, after all), but I thought it would help prevent damage while ASUS found a solution.
  • I updated my chipset drivers from AMD today (after turning off PBO etc. and not seeing the spike for a few days) because I noticed I didn't have the latest version installed. Since then, it's happened four or five times in only a few hours, dramatically more often than normal. I can only assume this is causal and not just correlation.
  • I have a support ticket open with ASUS, but I have little faith they'll spend the resources to investigate properly, let alone come up with a solution.
  • This doesn't seem to have anything to do with CPU load. Playing a game or running benchmarks (Cinebench R23 for example) doesn't seem to make a difference.
  • One ASUS tech I chatted with swears that this reading is from the 'power plan' on the motherboard, before the CPU socket, and the CPU cannot be exposed to these voltages. I'm not convinced. Since I don't know the details of sensor locations and circuit construction, I have no idea if this is true or not.
  • I updated my BIOS to the latest (1416) as soon as it was released, which is stated by ASUS to limit SOC to 1.3 V. My memory and CPU are both on the QVL for my board (ROG STRIX X670E-E GAMING). I have noticed that the most common SOC voltage spikes are 1.563 and 1.565 V. Each has happened several times.
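
The pattern described in the list above (all three values spiking in the same sample, for one sample only) can be checked against a logged capture instead of by eye. This is only a rough sketch: the series names are made up for illustration, the numbers are placeholder values and not taken from my log, and the 1.2x-median threshold is an arbitrary assumption.

```python
from statistics import median

def find_spikes(samples, factor=1.2):
    """Return the indices where a value exceeds factor x the series median."""
    base = median(samples)
    return {i for i, v in enumerate(samples) if v > factor * base}

def coincident_single_sample_spikes(series, factor=1.2):
    """Indices where every series spikes at once and the spike lasts one sample."""
    per_series = [find_spikes(s, factor) for s in series.values()]
    common = set.intersection(*per_series)
    # Keep only isolated spikes: neither neighbouring sample is also a spike.
    return sorted(i for i in common if i - 1 not in common and i + 1 not in common)

# Placeholder values for illustration only (hypothetical series names).
log = {
    "soc_v": [1.25, 1.25, 1.56, 1.25, 1.25],
    "vdd_v": [1.10, 1.12, 1.90, 1.11, 1.10],
    "die_avg_c": [40.0, 41.0, 110.0, 40.0, 41.0],
}
print(coincident_single_sample_spikes(log))
```

If the same index keeps coming back for all three series, that supports the "single sample" observation; it says nothing by itself about whether the reading is real or a readout error.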

I have a few theories as to where the problem might lie. They fall into two basic categories: a readout error, where the data is incorrect and the values aren't actually spiking, or a control error, where they are spiking because the voltage protection circuitry/algorithms aren't functioning properly. These are just guesses based on my limited experience (I have a background in electrical and networking technology, but I'm a mechanical technician by trade, so my knowledge is limited).

  • I wonder if the issue is simply the way HWiNFO sees or interprets the data and there is no actual spike (I consider this less likely than the other options).
  • Could it be that the hardware is reporting the data incorrectly, perhaps a register error or a flaw in the way the circuit works?
  • Could the chipset drivers or BIOS/AGESA be involved with reporting incorrect data?
  • Alternatively, I wonder if these spikes are real values. Perhaps whatever algorithm decides on the voltage is requesting the wrong values, or the hardware/drivers/BIOS is failing to regulate it properly.
  • Maybe the VRM circuits themselves are producing higher voltages than what's requested, or some component can't respond fast enough to prevent them.
  • Considering the issue with exploding CPUs, my first suspicion is that the AMD AGESA libraries, the ASUS BIOS, or the chipset drivers are the culprit, and that this is not normal or safe.
  • I also wonder if this could be normal behaviour for these sorts of circuits. Maybe the voltage provided by the motherboard spikes occasionally as part of changing demand, and the spikes are quickly damped without stressing any components.

I've been exploring online. There's a Reddit thread where one person claims that these high voltages would instantly destroy the CPU, so it must be an error. Others believe this is a 'double' reading; apparently they've been seeing the values as exactly double the baseline. I am not the only one having this problem.
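
The 'double reading' idea is easy to test against a log: compare each spike with the value logged just before it. A minimal sketch, assuming a 2% tolerance (my own choice) and using placeholder baselines, since I don't know what baseline the Reddit posters compared against:

```python
def looks_doubled(spike, baseline, rel_tol=0.02):
    """True if spike is within rel_tol of exactly 2 x baseline."""
    return abs(spike - 2 * baseline) <= rel_tol * 2 * baseline

# Baselines here are placeholders; use the value logged just before each spike.
print(looks_doubled(2.2, 1.1))     # a reading at twice its baseline
print(looks_doubled(1.563, 1.25))  # 1.563 V against an assumed 1.25 V baseline
```

If most spikes pass this check, a doubled readout becomes a more attractive explanation than a real overvoltage; if they don't, the doubling theory doesn't fit that log.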


In one exchange, ASUS asked me what the problem was, since SOC was limited to 1.3 V. They also told me to turn off PBO and use a liquid cooler. They obviously hadn't read the details I had already provided twice, or the myriad screenshots of my graphs. There are several people involved, all asking for the same details, and it seems internal communication is less than ideal. I have a custom liquid loop, and even under max load my hottest temp rarely exceeds 70 degrees.

I'm hoping that AMD will see this and investigate, and that the solution is as straightforward as some updated firmware or drivers. This really needs the expertise of an electrical engineer I suspect, and one who's familiar with the product. Barring that, I hope this will at least raise awareness and prompt others to start monitoring their voltages. Here's a screenshot of one of the spikes.

06022023_1226AM.PNG

johnnyenglish
Grandmaster

Even with the update, some users are still reporting occasional spikes.

I would set SOC and VDDIO to 1.2 V or less manually.

I also advise using a negative vCore offset plus Curve Optimizer. The latter also bumps performance.

Manual SoC voltage video

The Englishman
Koyote7667
Challenger

Here is a thing. Your SOC current max is at 21.9 A and your CPU core current is at 22.8 A, lol. Your CPU EDC is at 47 amps, dude (edit: AMPS), and your CPU package power is 41 watts max, which is fine (in other words, you CAN'T be pulling **bleep** near 50 amps while using 41 watts). If you turned more "red things" on, your entire screen would be lit up. Either you clicked on some silly stuff in your BIOS (hell, I doubt they even let you do that; I have the same board and can't), or you're running the wrong BIOS for the board, or you have let it do that. It's that, or it's bugged. No idea what you have going on, nor what you have done. I turned on PBO, use Curve Optimizer, and this rig runs like a **bleep** top.

Your vddmisc is even DOUBLE: 2.1.

 

You, sir, are bugged, or you did something the BIOS doesn't let you do unless you said "I want to". Look at all the other "measurements" and be more worried about those, lol. I highly doubt you're pulling those numbers but still seeing 28-ish per core. No way. Even as a spike. Run the thing, dude. Relax and run it. Play your games or whatever.

ASUS X670E-E with a 7800X3D here, running DDR5 at 6200 all day.
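
The arithmetic behind the amps-versus-watts point above is just P = V × I. A quick sketch using the two figures quoted in the post (41 W and 47 A); note the assumptions: the "max" columns in a monitoring tool are not necessarily captured at the same instant, and package power covers more than one rail, so this is a rough consistency check and not a measurement.

```python
def implied_voltage(power_w, current_a):
    """Voltage implied by P = V * I if both readings were simultaneous on one rail."""
    return power_w / current_a

# Figures quoted in the post: 41 W package power max, 47 A EDC.
v = implied_voltage(41.0, 47.0)
print(f"{v:.2f} V")
```

Comparing that implied voltage against the logged core voltage at the same timestamp gives a feel for whether the current and power readings can belong together.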

MarceloM
Journeyman III

Hi. 
I am having the exact same issue. 7950X with ASUS Strix X670E. No EXPO.

BIOS 1416.

There are other anomalies, like 0 °C readings in core temp and L3 temp, and a minimum CPU VDDCR_VDD voltage (SVI3 TFN) under 1 V.

SPYKE - Screenshot 2023-06-05 215010.png

2023 06JUNHO 05 3 15h24 ____ .png

Interesting. I'm not seeing any of the low readings. I would suggest two things: try to find a pattern. See if you can narrow down when it happens, and submit a support ticket to ASUS. The more people they get this issue from, the more likely they are to devote resources to it. 

Let's say it's not actually a voltage spike but some sort of data error: it could come from the hardware doing the reporting, the chipset drivers, Windows, HWiNFO, or any other link in the chain from voltage to screen.

I have a ticket with ASUS and they say they're trying to duplicate the issue. I figure they're only going to expend so much effort before they wash their hands of it. I'm going to try and figure out how to measure the actual voltage with a multimeter on the board, to see if it's really spiking. 

It seems that it happens when I go away from the PC for a few minutes (10-30) and then return. The sudden input seems to cause the voltage spikes. It's almost as if a device is being put to sleep, and when it wakes with activity, there's a problem keeping the voltage in check.

I've turned off all power saving features in Windows to see if that makes a difference. So far, it hasn't happened again. I'll wait a few days, and if I don't see it, I'll start turning them back on one at a time.

Even if it is the Windows power settings, it shouldn't be happening, and either AMD or ASUS needs to fix it.

One more thing: can you verify that your low temp anomalies are happening at the same time as the voltage spikes? That could be very important info.

Angeluk
Challenger

If the spikes happen at exactly the same time, I would say it has something to do with the PSU. Did you try another PSU?

Koyote7667
Challenger

The sensor hardware/software, and everything in between, is just bugged, guys. That's it. Nothing more. Obviously a zero temp is, well, not the gear but the reading, like all of your "info" coming in.


It will crash the PC, so I don't think it's the sensor.

BIZZnice
Journeyman III

Okay, I'm getting this too with HWiNFO, with the latest BIOS from the end of September. My best OC so far will give a spike to 110 °C on the die when no core was near that. It is the SSE benchmarks that trigger this, and it is a very quick spike. I have liquid cooling with two 360 mm radiators on my loop. Regardless of cooler, I think this spike will happen. I removed my OC setup, and a completely stock BIOS setup will spike past 95 °C, the thermal limit. With the PBO enhancement 90 °C setting it will spike to 95 °C. So I've been able to recreate exactly where it occurs: whenever I run the SSE test in PassMark or OCCT. No other test or benchmark spikes like this. It is always when the test begins as well. I have had soft crashes that reboot, but with my max OC setup and curve optimized it will straight up shut down my PC. It feels like a very high voltage spike that is almost at the top of the voltage curve, which goes beyond the thermal limit. It's very interesting, though, that no core actually shows the same max temp as the total die.
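
The last observation (a die reading higher than every core) can be turned into a simple plausibility check. This sketch assumes the die value is a plain mean of the per-core sensors taken in the same sample, which nobody in this thread has confirmed; the core temperatures below are placeholders for illustration.

```python
def average_is_plausible(average_c, core_temps_c):
    """A true mean of the listed sensors must lie between the coolest and hottest one."""
    return min(core_temps_c) <= average_c <= max(core_temps_c)

# Placeholder per-core readings for illustration only.
cores = [62.0, 65.0, 64.0, 61.0]
print(average_is_plausible(63.0, cores))
print(average_is_plausible(110.0, cores))
```

A die value that fails this check either includes a sensor that isn't in the per-core list or is a bad reading; the check alone can't tell which.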
