Journeyman III

Ryzen 3900x nested virtualization issue (vmware, ubuntu, qemu)

I know it is a corner case, but the main reason for me to buy Ryzen platform was to support my (Cisco) networking studies. I need a lot of cores and a lot of RAM for this, let me explain my current situation:

Hardware: Ryzen 3900x, gigabyte x570 elite, 4x16G Gskill FlareX 2400MHz RAM (CL15). All stock, no overclocking, PBO disabled, latest BIOS installed (1.0.0.4)
Software: windows 10 home edition, every update installed, latest chipset drivers installed
Virtualization: every feature enabled in BIOS. Under win10, I use VMWare workstation player v15.

So I use a virtual appliance under VMW player, named EVE-NG (v2.0.3-105) community edition for network device virtualization (www.eve-ng.net). It is a free software built on Ubuntu.
Based on their guides, I successfully set up different Cisco network devices by installing their relevant images. Guides are here: https://www.eve-ng.net/index.php/documentation/howtos/

Now let's focus on the problematic router and OS version, namely Cisco xrv9000 v6.5.1. Official info about this virtual appliance:
https://www.cisco.com/c/en/us/td/docs/routers/virtual-routers/xrv9k-65x/general/release/notes/b-rele...


Inside EVE-NG, this image also runs virtualized, using QEMU v2.12, so this is a nested virtualization scenario.
The problem is that this appliance fails to boot in about 95% of attempts; only about 5% succeed.
EVE-NG info:

    root@eve-ng:~# uname -a
    Linux eve-ng 4.20.17-eve-ng-ukms+ #2 SMP Wed Jun 5 08:18:06 CEST 2019 x86_64 x86_64 x86_64 GNU/Linux
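
Because the inner QEMU relies on KVM for acceleration, it may be worth confirming that the EVE-NG VM actually sees the CPU's virtualization extensions. This is a minimal sketch, assuming a standard Linux `/proc/cpuinfo`; the function name `count_virt_cpus` is made up for illustration:

```shell
#!/bin/sh
# Count logical CPUs that advertise hardware virtualization
# (svm = AMD-V, vmx = Intel VT-x) in a cpuinfo-style file.
# The optional path argument is an illustrative convenience for testing.
count_virt_cpus() {
    cpuinfo="${1:-/proc/cpuinfo}"
    # If the file is missing (non-Linux system), report zero.
    [ -r "$cpuinfo" ] || { echo 0; return 0; }
    # grep -c prints the count but exits non-zero on 0 matches, hence "|| true".
    grep -E -c '^flags.* (svm|vmx)( |$)' "$cpuinfo" || true
}

n=$(count_virt_cpus)
if [ "$n" -gt 0 ]; then
    echo "virtualization extensions visible on $n logical CPU(s)"
else
    echo "no svm/vmx flag visible: nested guests would run without KVM acceleration"
fi
```

Run inside the EVE-NG VM, a count of zero would suggest that the outer hypervisor is not passing the extensions through to the guest.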


Boot process stops after a random time (after 5 seconds or 30 seconds or 3 minutes) with messages like this:

    ################################################################################
    #                                                                              #
    #                  Welcome to the Cisco IOS XRv9k platform                     #
    #                                                                              #
    #    Please wait for Cisco IOS XR to start.                                    #
    #                                                                              #
    #    Copyright (c) 2014-2017 by Cisco Systems, Inc.                            #
    #                                                                              #
    ################################################################################


    Cisco IOS XR console     will start on the 1st serial port
    Cisco IOS XR aux console will start on the 2nd serial port
    Cisco Calvados console   will start on the 3rd serial port
    Cisco Calvados aux       will start on the 4th serial port

Text above shows normal boot process, then this happens:

    [   10.304380] BUG: unable to handle kernel paging request at ffffffff860449b1
    [   10.304380] IP: [<ffffffff860449b1>] kvm_unlock_kick+0x81/0x90
    [   10.304380] PGD 6a0f067 PUD 6a10063 PMD 60001e1
    [   10.304380] Oops: 0003 [#1] SMP
    [   10.304380] Modules linked in: tun bridge ip6table_filter ip6_tables iptable_filter ip_tables 80d
    [   10.304380] CPU: 1 PID: 4734 Comm: tee Tainted: G           O 3.14.23-WR7.0.0.2_standard #1
    [   10.304380] Hardware name: cisco Cisco IOS XRv 9000, BIOS rel-1.11.1-0-g0551a4be2c-prebuilt.qemu4
    [   10.304380] task: ffff88031c2d8110 ti: ffff8800ba5c0000 task.ti: ffff8800ba5c0000
    [   10.304380] RIP: 0010:[<ffffffff860449b1>]  [<ffffffff860449b1>] kvm_unlock_kick+0x81/0x90
    [   10.304380] RSP: 0018:ffff8800ba5c3d18  EFLAGS: 00010046
    [   10.304380] RAX: 0000000000000005 RBX: 0000000000000000 RCX: 0000000000000000
    [   10.304380] RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffffffff86f85a00
    [   10.304380] RBP: ffff8800ba5c3d30 R08: ffffffff86da4b00 R09: 0000000000000286
    [   10.304380] R10: 0000000000000000 R11: 0000000000000246 R12: ffffffff86f85a00
    [   10.304380] R13: 000000000000080c R14: ffff88031bc4104e R15: ffff88031c4ac000
    [   10.304380] FS:  00007fa68c1c7700(0000) GS:ffff88032dc80000(0000) knlGS:0000000000000000
    [   10.304380] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
    [   10.304380] CR2: ffffffff860449b1 CR3: 0000000037a89000 CR4: 00000000001406e0
    [   10.304380] Stack:
    [   10.304380]  0000000000000286 ffff88031c1ff800 0000000000000286 ffff8800ba5c3d48
    [   10.304380]  ffffffff8658ee3a ffffffff86f85a00 ffff8800ba5c3d70 ffffffff863670ed
    [   10.304380]  000000000000004e 0000000000000000 0000000000000000 ffff8800ba5c3dc8
    [   10.304380] Call Trace:
    [   10.304380]  [<ffffffff8658ee3a>] _raw_spin_unlock_irqrestore+0x5a/0x70
    [   10.304380]  [<ffffffff863670ed>] uart_start+0x3d/0x50
    [   10.304380]  [<ffffffff86367b7b>] uart_write+0xeb/0x120
    [   10.304380]  [<ffffffff8634ccfd>] n_tty_write+0x1ed/0x540
    [   10.304380]  [<ffffffff8608c830>] ? wake_up_process+0x50/0x50
    [   10.304380]  [<ffffffff86349344>] tty_write+0x174/0x2c0
    [   10.304380]  [<ffffffff8634cb10>] ? process_echoes+0x70/0x70
    [   10.304380]  [<ffffffff86349525>] redirected_tty_write+0x95/0xa0
    [   10.304380]  [<ffffffff861a33ea>] vfs_write+0xba/0x1e0
    [   10.304380]  [<ffffffff861a3e06>] SyS_write+0x46/0xc0
    [   10.304380]  [<ffffffff86597e49>] system_call_fastpath+0x16/0x1b
    [   10.304380] Code: 37 b1 86 48 8d 04 0b 48 8b 38 4c 39 e7 75 cb 0f b7 40 08 66 44 39 e8 75 c1 48
    [   10.304380] RIP  [<ffffffff860449b1>] kvm_unlock_kick+0x81/0x90
    [   10.304380]  RSP <ffff8800ba5c3d18>
    [   10.304380] CR2: ffffffff860449b1
    [   10.304380] ---[ end trace d3e7f193a2285065 ]---

After this, either the boot process stops or a restart happens.
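
The `Oops: 0003` field in the trace is the x86 page-fault error code, and its low bits can be decoded by hand. The following is a small sketch that decodes only the three lowest bits; the function name `decode_pf_error` is hypothetical and not part of any tool:

```shell
#!/bin/sh
# Decode the low three bits of an x86 page-fault error code,
# given as the hex digits printed after "Oops:" in a kernel trace.
#   bit 0: 0 = page not present, 1 = protection violation
#   bit 1: 0 = read access,      1 = write access
#   bit 2: 0 = kernel mode,      1 = user mode
decode_pf_error() {
    code=$(( 0x$1 ))
    if [ $(( code & 1 )) -ne 0 ]; then p="protection violation (page present)"; else p="page not present"; fi
    if [ $(( code & 2 )) -ne 0 ]; then w="write"; else w="read"; fi
    if [ $(( code & 4 )) -ne 0 ]; then u="user mode"; else u="kernel mode"; fi
    echo "$p, $w, $u"
}

decode_pf_error 0003
```

For the trace above this prints `protection violation (page present), write, kernel mode`. CR2 (the faulting address) also equals RIP in the trace, so the faulting write appears to target the address of the executing instruction inside `kvm_unlock_kick` itself. That narrows down where the guest kernel tripped, but not why.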


I contacted the EVE-NG team about this (they are very helpful) and got a response that they do not support AMD CPUs at all, because the behavior of AMD CPUs is not predictable in nested virtualization, so they cannot guarantee anything.
Additionally, someone in their forums stated that the virtualization capabilities of Ryzen CPUs are not as good as Intel's VT-x and VT-d, and that this is why there are issues.
I don't know if that is true or not; I am not a virtualization expert.


I also asked about this issue in a Cisco forum, but there has been nothing but silence. Cisco also has its own non-free EVE-NG-like solution, named VIRL. I had the exact same issue under VIRL: xrv9k couldn't boot most of the time.
This is why I didn't extend my subscription last year.


So it seems this symptom is not an issue with the virtualization software (VMware/QEMU/VIRL/EVE-NG), but must be something closer to the hardware (CPU) level.


I googled some of the error messages, but couldn't find any solution, just references to old kernels or CPU defects.


I hope someone at AMD or in the AMD community can suggest a workaround or solution for this issue.

13 Replies
Adept II

Re: Ryzen 3900x nested virtualization issue (vmware, ubuntu, qemu)

I'm posting here, mostly, to show support. When I purchased my 3950X, I was aware that MS didn't support nested virtualization for AMD, but have been led to believe that the issue is that AMD has not worked with MS to enable the function/feature under the Windows environment. This issue was documented by MS back in 2016 (Nested Virtualization | Microsoft Docs ) and has not been remediated to date. I just tried to turn on nested virtualization under Hyper-V and this is still an issue under Windows 10 1909, build 18363.628.

I don't know, for sure, that this is the issue you're seeing with VMware Workstation and your VMs running there, but I suspect it is. There is an ongoing discussion here (https://community.amd.com/thread/247222 ) that is also related. The only answer, I believe, is ESXi, if you require nested virtualization from AMD.

Resolution requires AMD assistance to MS for the Windows environment...

Journeyman III

Re: Ryzen 3900x nested virtualization issue (vmware, ubuntu, qemu)

Thank you, sir, for your support. I will check your thread. I am afraid it is not MS's fault... I feel like it is something with the microcode / hardware itself. Maybe AMD-V is not as "feature rich" as Intel's VT-x / VT-d? I don't know, just guessing...

I will build my setup using Linux, so no MS will be involved. The problem is, it takes a ton of time... I don't have days to troubleshoot HW / SW issues... I want to USE my expensive system; that is why I spent a lot of money...

I had the same problem with my previous computer, which was an 1800x. Exactly the same...

Journeyman III

Re: Ryzen 3900x nested virtualization issue (vmware, ubuntu, qemu)

Thanks for sharing, bro. Your information is very important to me because I am also deciding between Intel and AMD.

I read the document; it says AMD is not officially supported.

But I also found the article below. This fella shares his experience running EVE-NG on a Ryzen 7. He hasn't mentioned which model; I believe it's a Ryzen 1700. However, I believe Ryzen 1700 and Ryzen 3 are the same microarchitecture.

Shoaib Merchant on Twitter: "DC build is ready and functional in EVE-NG! #eveng #virtualization #ryz... 

Have you tried ESXi?

I hope you can share the information with me.

Your advice is highly appreciated.

Big Boss

Re: Ryzen 3900x nested virtualization issue (vmware, ubuntu, qemu)

Hyper-V and other hypervisors were not designed for nested use. While Hyper-V will refuse Windows 95 and Windows 98 guests, VirtualBox has no restrictions.

I have a lot of experience with virtual machines in data centers, where vast numbers of servers handle virtual workloads by the thousands.

I use Hyper-V extensively in my shop as well for development purposes. I also use VirtualBox extensively.

Adept II

Re: Ryzen 3900x nested virtualization issue (vmware, ubuntu, qemu)

Nested Virtualization has been available since, at least, Windows Server 2016. It works just fine on Intel Core and above. I've even run ESXi nested on top of Server 2016+ using nested virtualization through a technique similar to the following article:

Installing ESXi on Hyper-V: Complete Walkthrough 

That said, the performance of ESXi was pretty miserable, though I never spent much time trying to tune the environment...

To the OP:

KVM may also be an option, though I haven't researched it thoroughly enough to recommend it...

Big Boss

Re: Ryzen 3900x nested virtualization issue (vmware, ubuntu, qemu)

Remote desktop is an option for Windows 98 and above. Windows 98 calls it terminal services but XP Pro and above call it remote desktop. Home versions do not support remote desktop.

KVM goes back to before remote desktop was mature enough for widespread use.

Journeyman III

Re: Ryzen 3900x nested virtualization issue (vmware, ubuntu, qemu)

I have another system, an 1800x with Ubuntu 18.04 LTS with all updates installed, and I use KVM.

Installed all software, and... same symptom.

It is not windows or linux or VMWare or KVM. It must be something with the Ryzens...
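
For anyone repeating this test on a Linux/KVM host with an AMD CPU, one thing to rule out first is whether nested virtualization is enabled in the `kvm_amd` module. A minimal sketch, assuming the usual sysfs parameter path; the function name `nested_enabled` is invented for illustration:

```shell
#!/bin/sh
# Report whether the kvm_amd module has nested virtualization enabled.
# The parameter file holds 1/0 on some kernels and Y/N on others.
# The optional path argument is an illustrative convenience for testing.
nested_enabled() {
    param="${1:-/sys/module/kvm_amd/parameters/nested}"
    [ -r "$param" ] || { echo "unknown (kvm_amd not loaded?)"; return 0; }
    case "$(cat "$param")" in
        1|Y|y) echo "enabled" ;;
        *)     echo "disabled" ;;
    esac
}

nested_enabled
```

If this reports "enabled" and the appliance still crashes, the module setting can at least be excluded as the cause.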

Elite

Re: Ryzen 3900x nested virtualization issue (vmware, ubuntu, qemu)

Hello,

Is your issue the same as the one described in this post?

AMD nested virtualization? · Issue #1276 · MicrosoftDocs/Virtualization-Documentation · GitHub 

Journeyman III

Re: Ryzen 3900x nested virtualization issue (vmware, ubuntu, qemu)

Hello,

Not entirely the same issue, but seems to be kind of similar.

In my case, no matter what the host OS is (Win10 or Ubuntu) and no matter what the hypervisor is (VMware Player or KVM), the symptom is the same. So no Hyper-V in my case. The root cause may be the same... the only common point is the CPU: Ryzen.
