

usererror
Adept I

Virtualization Enabled in Bios causing random crashes Windows, Linux, Two separate machines

Enabling virtualization in the BIOS causes random crashes on Windows and Linux, on two separate machines owned and operated by two separate people in two separate parts of the world, on different chipsets.

Machine 1
3600X
Asus X370 (top tier board)


Machine 2

3600
MSI X570 (mid tier board)

Happens in Windows, Linux, Unix (BSD), etc.  Happens across multiple BIOS versions (it's been a standing issue for multiple years).

I am scrambling to put together a second machine just to run a single appliance because of this.
Effectively, Virtualization being bugged is costing me more than this CPU did.

It didn't happen on the former R5 1600 (that I noticed).

Both of us turned on SVM (virtualization) this week, independently, to try to get some things done, and both of us, unprompted, mentioned that it caused such severe instability that it just wasn't viable, even for a short-term task.

This is the kind of thing that makes me say: Either ya gotta replace the CPU with one that isn't bugged (this is a major manufacturing defect), or I gotta ditch AMD despite my frustrations with Intel.  This type of bug costs me way too much.

If anyone has a fix for it, let me know, because as of now I'm fed up and my only solution is to say to AMD what Linus Torvalds once said to nVidia.

Everyone I know is on AMD, because I told them it was better.  This makes me look bad.

I don't know if it's a microcode fix, a BIOS fix, or a hardware fix (replacement), but whatever it is, it needs to happen.

AMD, ya gotta make this one right, right away.  It simply can't stand when so many people have turned to you for affordable workstation machines (relative to what was available prior to first gen Ryzen).

0 Likes
8 Replies
MADZyren
Paragon

EDIT: Clicked a wrong thingy by mistake and made this new or something. Sorry.

I have a 3800X and an Asus X570 motherboard. Virtualization is enabled and I use it. I have not encountered issues.

Do you have the latest BIOS installed? When and how does this instability present itself? Everywhere, in some specific software? Doing something specific?

Pretty much everything in a computer affects everything else too. Depending on the situation, I would likely try other software, lower the memory speed, check temperatures, and make sure the PSU is powerful enough and not too old, as virtualization can stress the system more. Running a fixed all-core clock speed instead of PBO or something else might be a good idea. Make sure the base clock is 100 MHz.

I think the problem lies somewhere other than the CPU or BIOS (unless your BIOS is very old), and from what I've seen, virtualization usually works.

EDIT2: From what I found with Google, it works for most people, but someone mentioned that disabling IOMMU or using a fixed clock speed has helped. I find the latter a bit strange, but in general it can be more stable if there are any stability issues, I suppose. Also, there is a thread on this board from 2018 where someone mentioned you shouldn't have Ryzen Master installed if you have virtualization enabled, but this may be outdated. More interestingly, in that thread he apparently got his machine working by reinstalling the operating system while virtualization was enabled. EDIT3: And I forgot the link to that thread: https://community.amd.com/t5/processors/ryzen-master-and-virtualization-cause-bsod/td-p/96928 

So, this is running multiple operating systems.  Not a software problem.  I've spent maybe 100 hours chasing the crashes and lockups.
Disabling IOMMU is not a fix.  It creates another, equally serious problem: it chops off half of the functionality of virtualization.

I appreciate it though - I have looked at those and tried them, but they didn't fix it for me.

I am not running *the latest* BIOS because others have tried it: it didn't fix the problem, and it created some new problems for them (I am on X370, so the benefit of it is 5000 series support, which doesn't do anything for me at the moment).

The all core clockspeed thing, I may have to give that another try.

The power supply is overspec'd by 4x based on max actual system utilization, and of high quality.  And I don't let anyone I know get a cheap power supply.  Some are running their RAM at 2133, most at 3200 or 3600, all with the same problem.
And yeah, I agree with the base clock thing.  I never deviate on that - that's just begging for issues I don't want to chase.

I don't run Ryzen Master, or any specialty software for specific pieces of hardware, aside from a bit of stuff for nVidia GPUs and, when necessary, a USB3 driver (I don't think it's necessary to add that manually anymore, but it was at one point).

And yeah, the instability is just locking up or outright crashing, at random, with no pattern that I can discern.

0 Likes

Could it be related to idle-mode voltage or to transient response?
Some people have reported that their CPU became unstable at idle because of incorrect voltage regulation. Can you try a voltage offset in the BIOS of around +0.030 or +0.060 V, if that is still within a safe range for you under load (around 1.4-1.5 V max)?

 

Can you get some logs from a UNIX-based system (as they tend to be more open about logging) that relate to the issue? Like a kernel panic? Or incorrect command processing?
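For anyone collecting such logs, here is a minimal Python sketch of filtering a saved kernel log for lines worth posting. The keyword list is an assumption for illustration, not an exhaustive set of what a kernel can print, and the function name `suspicious_lines` is made up for this example.

```python
# Assumed keywords (illustrative only); adjust to what your system actually logs.
PATTERNS = ("kernel panic", "machine check", "mce:", "hardware error")


def suspicious_lines(log_text, patterns=PATTERNS):
    """Return (line_number, line) pairs whose text contains any keyword.

    Matching is case-insensitive and purely textual; it does not interpret
    the log, it only narrows down what to read by hand.
    """
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        lowered = line.lower()
        if any(pattern in lowered for pattern in patterns):
            hits.append((number, line.strip()))
    return hits
```

On a systemd-based Linux with a persistent journal, something like `journalctl -k -b -1 > prev_boot.log` should save the previous boot's kernel messages, which can then be read into `log_text`. Whether anything survives a hard lockup depends on the system.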

Have you tried stress testing the CPUs by themselves with something really precise, like y-cruncher or Prime95?

0 Likes

Yes, friends don't let friends buy cheap PSUs.

I would probably disable hyperthreading (SMT) for testing.

Another thing: when the 3000 series was released and compared to the 2000 series, it beat the 2000-series CPUs hands down in just about everything, except there was something... I can't quite recall what it was, but I think it had something to do with memory performance in specific tasks. It was assumed it would not matter to most consumers, but it could affect virtualization, unless I remember wrong. So something in the architecture changed. The reason I mention this: if you did not do a clean reinstall of the operating system after switching the CPU (or CPU + motherboard), but simply moved the system storage drive to the new machine, that could be it.

Also, have you tried giving the virtual machine radically fewer resources, like CPU cores, memory, and so on?

Also I would choose fixed drive size instead of dynamic when there are issues.

It has an L3 cache architectural issue.

Basically the 32 MB of L3 (or however much Zen 2 has) is split between the two halves of each CCD, as if each CCD were split in two (each half is a CCX), making a 3600 behave as 2x 3-core CCXs, a 3700 as 2x 4-core CCXs, and a 3950X as 4x 4-core CCXs.

That means if a task switches cores frequently, it can end up moving between the two L3 caches within the same CCD, incurring additional latency and the issues that follow (like frame lag in games).

It looks like this

1-----3-----5-----7

-------16 MB------

-------16 MB------

0-----2-----4-----6
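
To check which cores share an L3 on a given machine, here is a minimal Python sketch for Linux. It assumes the kernel exposes the cache topology under `/sys/devices/system/cpu/cpuN/cache/index3/shared_cpu_list` and that `index3` is the L3 (usually, but not guaranteed; the neighbouring `level` file says for sure). The function names are made up for this example.

```python
from collections import defaultdict
from pathlib import Path


def group_by_l3(shared_lists):
    """Group CPU ids that report the same L3 sharing string.

    shared_lists maps a CPU id to its shared_cpu_list string.
    Returns a sorted list of sorted CPU-id lists, one per L3 domain.
    """
    groups = defaultdict(list)
    for cpu, shared in shared_lists.items():
        groups[shared].append(cpu)
    return sorted(sorted(cpus) for cpus in groups.values())


def read_l3_sharing(root="/sys/devices/system/cpu"):
    """Read each CPU's L3 sharing string from sysfs (Linux only).

    Returns an empty dict when the files are absent, e.g. on other
    operating systems or inside some containers.
    """
    result = {}
    for path in Path(root).glob("cpu[0-9]*/cache/index3/shared_cpu_list"):
        cpu = int(path.parts[-4][3:])  # "cpu12" -> 12
        result[cpu] = path.read_text().strip()
    return result


if __name__ == "__main__":
    for domain in group_by_l3(read_l3_sharing()):
        print(domain)
```

If the output shows two or more groups, a VM's vCPUs could be pinned inside one group to see whether the crashes change. That is only a diagnostic idea, not a known fix.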

 

0 Likes

Ah, same stuff for the 5600/5800 and 7600/7800 CPUs.
The Infinity Fabric bus from CCD to IOD is a different width for read and write: the write bus is split in half for each CCD, while the read bus is full width for each CCD.
That makes write speed a bit more than half of read speed. For example, I get 55 GB/s read and 30.5 GB/s write on my 5600X. Not that it actually affects performance much, as there are almost no cases where you would need 55 GB/s write speeds without also needing a higher-tier CPU with 2 CCDs, which has both halves of the write bus in use.
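
As a quick check of the "a bit more than half" figure, here is the arithmetic on the two numbers quoted above, as a small Python sketch. The function name is made up for this example, and the inputs are just the quoted 5600X measurements.

```python
def write_to_read_ratio(read_gbs, write_gbs):
    """Return write bandwidth as a fraction of read bandwidth."""
    return write_gbs / read_gbs


# Figures quoted in the post: 55 GB/s read, 30.5 GB/s write.
ratio = write_to_read_ratio(55.0, 30.5)
print(f"{ratio:.2f}")  # prints 0.55
```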

0 Likes

I don't remember what it was, but some reviewer mentioned this affecting performance in some area, which I think had to do with virtualization or something like it. I think it was Level1Techs... Maybe.