I know it is a corner case, but the main reason I bought a Ryzen platform was to support my (Cisco) networking studies. I need a lot of cores and a lot of RAM for this. Let me explain my current situation:
Hardware: Ryzen 3900X, Gigabyte X570 Elite, 4x16G G.Skill FlareX 2400MHz RAM (CL15). All stock, no overclocking, PBO disabled, latest BIOS installed (126.96.36.199)
Software: Windows 10 Home edition, every update installed, latest chipset drivers installed
Virtualization: every feature enabled in BIOS. Under Win10, I use VMware Workstation Player v15.
Inside VMware Player I run a virtual appliance named EVE-NG (v2.0.3-105, community edition) for network device virtualization (www.eve-ng.net). It is free software built on Ubuntu.
Based on their guides, I successfully set up different Cisco network devices by installing their relevant images. Guides are here: https://www.eve-ng.net/index.php/documentation/howtos/
Now let's focus on the problematic router and OS version, namely Cisco xrv9000 v6.5.1. Official info about this virtual appliance:
Inside EVE-NG, this image also runs virtualized, using QEMU v2.12. So this is a nested virtualization scenario.
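For nested virtualization to work, the EVE-NG VM itself has to see the CPU's hardware virtualization extension, otherwise QEMU inside it cannot use KVM. On Linux this shows up as the svm flag (AMD-V) or the vmx flag (Intel VT-x) in /proc/cpuinfo. Below is a minimal Python sketch of that check, meant to be run inside the EVE-NG guest. The function name hw_virt_flags is made up for illustration.

```python
from pathlib import Path


def hw_virt_flags(cpuinfo_text):
    """Return the hardware virtualization flags (svm and/or vmx) found in cpuinfo text."""
    found = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        # Each logical CPU has one "flags" line listing its feature flags.
        if key.strip() == "flags":
            found |= {"svm", "vmx"} & set(value.split())
    return found


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")  # Linux only
    if cpuinfo.exists():
        flags = hw_virt_flags(cpuinfo.read_text())
        print("hardware virtualization flags visible:", sorted(flags) or "none")
```

If this prints "none" inside the EVE-NG VM, the outer hypervisor is not passing the extension through, and the problem lies one layer above the Cisco image.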
The problem is that this appliance fails to boot in about 95% of attempts; only about 5% are successful.
root@eve-ng:~# uname -a
Linux eve-ng 4.20.17-eve-ng-ukms+ #2 SMP Wed Jun 5 08:18:06 CEST 2019 x86_64 x86_64 x86_64 GNU/Linux
Boot process stops after a random time (after 5 seconds or 30 seconds or 3 minutes) with messages like this:
# Welcome to the Cisco IOS XRv9k platform #
# Please wait for Cisco IOS XR to start. #
# Copyright (c) 2014-2017 by Cisco Systems, Inc. #
Cisco IOS XR console will start on the 1st serial port
Cisco IOS XR aux console will start on the 2nd serial port
Cisco Calvados console will start on the 3rd serial port
Cisco Calvados aux will start on the 4th serial port
The text above shows the normal boot process. Then this happens:
[ 10.304380] BUG: unable to handle kernel paging request at ffffffff860449b1
[ 10.304380] IP: [<ffffffff860449b1>] kvm_unlock_kick+0x81/0x90
[ 10.304380] PGD 6a0f067 PUD 6a10063 PMD 60001e1
[ 10.304380] Oops: 0003 [#1] SMP
[ 10.304380] Modules linked in: tun bridge ip6table_filter ip6_tables iptable_filter ip_tables 80d
[ 10.304380] CPU: 1 PID: 4734 Comm: tee Tainted: G O 3.14.23-WR188.8.131.52_standard #1
[ 10.304380] Hardware name: cisco Cisco IOS XRv 9000, BIOS rel-1.11.1-0-g0551a4be2c-prebuilt.qemu4
[ 10.304380] task: ffff88031c2d8110 ti: ffff8800ba5c0000 task.ti: ffff8800ba5c0000
[ 10.304380] RIP: 0010:[<ffffffff860449b1>] [<ffffffff860449b1>] kvm_unlock_kick+0x81/0x90
[ 10.304380] RSP: 0018:ffff8800ba5c3d18 EFLAGS: 00010046
[ 10.304380] RAX: 0000000000000005 RBX: 0000000000000000 RCX: 0000000000000000
[ 10.304380] RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffffffff86f85a00
[ 10.304380] RBP: ffff8800ba5c3d30 R08: ffffffff86da4b00 R09: 0000000000000286
[ 10.304380] R10: 0000000000000000 R11: 0000000000000246 R12: ffffffff86f85a00
[ 10.304380] R13: 000000000000080c R14: ffff88031bc4104e R15: ffff88031c4ac000
[ 10.304380] FS: 00007fa68c1c7700(0000) GS:ffff88032dc80000(0000) knlGS:0000000000000000
[ 10.304380] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 10.304380] CR2: ffffffff860449b1 CR3: 0000000037a89000 CR4: 00000000001406e0
[ 10.304380] Stack:
[ 10.304380] 0000000000000286 ffff88031c1ff800 0000000000000286 ffff8800ba5c3d48
[ 10.304380] ffffffff8658ee3a ffffffff86f85a00 ffff8800ba5c3d70 ffffffff863670ed
[ 10.304380] 000000000000004e 0000000000000000 0000000000000000 ffff8800ba5c3dc8
[ 10.304380] Call Trace:
[ 10.304380] [<ffffffff8658ee3a>] _raw_spin_unlock_irqrestore+0x5a/0x70
[ 10.304380] [<ffffffff863670ed>] uart_start+0x3d/0x50
[ 10.304380] [<ffffffff86367b7b>] uart_write+0xeb/0x120
[ 10.304380] [<ffffffff8634ccfd>] n_tty_write+0x1ed/0x540
[ 10.304380] [<ffffffff8608c830>] ? wake_up_process+0x50/0x50
[ 10.304380] [<ffffffff86349344>] tty_write+0x174/0x2c0
[ 10.304380] [<ffffffff8634cb10>] ? process_echoes+0x70/0x70
[ 10.304380] [<ffffffff86349525>] redirected_tty_write+0x95/0xa0
[ 10.304380] [<ffffffff861a33ea>] vfs_write+0xba/0x1e0
[ 10.304380] [<ffffffff861a3e06>] SyS_write+0x46/0xc0
[ 10.304380] [<ffffffff86597e49>] system_call_fastpath+0x16/0x1b
[ 10.304380] Code: 37 b1 86 48 8d 04 0b 48 8b 38 4c 39 e7 75 cb 0f b7 40 08 66 44 39 e8 75 c1 48
[ 10.304380] RIP [<ffffffff860449b1>] kvm_unlock_kick+0x81/0x90
[ 10.304380] RSP <ffff8800ba5c3d18>
[ 10.304380] CR2: ffffffff860449b1
[ 10.304380] ---[ end trace d3e7f193a2285065 ]---
After this, either the boot process stops or a restart happens.
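When comparing several failed boots, it helps to pull just the faulting function and the call chain out of a trace like the one above. Here is a rough Python sketch, assuming the log format shown above. The names summarize_oops, FRAME and SAMPLE are illustrative and not part of any tool.

```python
import re

# Matches frames such as "[<ffffffff860449b1>] kvm_unlock_kick+0x81/0x90".
# An optional "? " marks stack entries the kernel considers unreliable.
FRAME = re.compile(r"\[<[0-9a-f]+>\]\s+(\?\s+)?([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+")

# A few lines copied from the trace above, used as sample input.
SAMPLE = """\
[ 10.304380] IP: [<ffffffff860449b1>] kvm_unlock_kick+0x81/0x90
[ 10.304380] Call Trace:
[ 10.304380] [<ffffffff8658ee3a>] _raw_spin_unlock_irqrestore+0x5a/0x70
[ 10.304380] [<ffffffff863670ed>] uart_start+0x3d/0x50
[ 10.304380] [<ffffffff8608c830>] ? wake_up_process+0x50/0x50
[ 10.304380] Code: 37 b1 86 48
[ 10.304380] RIP [<ffffffff860449b1>] kvm_unlock_kick+0x81/0x90
"""


def summarize_oops(log):
    """Return the faulting symbol and the reliable call-trace symbols of an oops log."""
    fault = None
    trace = []
    in_trace = False
    for line in log.splitlines():
        if "Call Trace:" in line:
            in_trace = True
            continue
        match = FRAME.search(line)
        if match is None:
            in_trace = False  # the trace ends at the first non-frame line
            continue
        unreliable, symbol = match.group(1), match.group(2)
        if fault is None and " IP:" in line:
            fault = symbol
        elif in_trace and not unreliable:
            trace.append(symbol)
    return {"fault": fault, "trace": trace}


if __name__ == "__main__":
    print(summarize_oops(SAMPLE))
```

For the trace above, the faulting function is kvm_unlock_kick, reached from _raw_spin_unlock_irqrestore on the serial console write path. If every failed boot reports the same function, that points at one specific code path in the guest kernel rather than at random corruption.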
I contacted the EVE-NG team about this (they are very helpful), and they responded that they do not support AMD CPUs at all, because the behavior of AMD CPUs is not predictable in nested virtualization, so they cannot guarantee anything.
Additionally, someone in their forums stated that the virtualization capabilities of Ryzen CPUs are not as good as Intel's VT-x and VT-d, and that this is why there are issues.
I don't know whether that is true or not; I am not a virtualization expert.
I also asked about this issue in a Cisco forum, but there has been nothing but silence. Cisco also has its own non-free EVE-NG-like solution, named VIRL. I had the exact same issue under VIRL: xrv9k couldn't boot most of the time.
This is why I didn't extend my subscription last year.
So it seems this symptom is not an issue with the virtualization software (VMware/QEMU/VIRL/EVE-NG), but must be something close to the hardware (CPU) level.
I googled some of the error messages, but couldn't find any solution, just references to old kernels or CPU defects.
I hope someone at AMD or in the AMD community can suggest a workaround or solution for this issue.
I'm posting here, mostly, to show support. When I purchased my 3950X, I was aware that MS didn't support nested virtualization for AMD, but I have been led to believe that the issue is that AMD has not worked with MS to enable the feature under the Windows environment. This issue was documented by MS back in 2016 (Nested Virtualization | Microsoft Docs) and has not been remediated to date. I just tried to turn on nested virtualization under Hyper-V and this is still an issue under Windows 10 1909, build 18363.628.
I don't know for sure that this is the issue you're seeing with VMware Workstation and your VMs running there, but I suspect it is. There is an ongoing discussion here (https://community.amd.com/thread/247222) that is also related. The only answer, I believe, is ESXi, if you require nested virtualization on AMD.
Resolution requires AMD assistance to MS for the Windows environment...
Thank you, sir, for your support. I will check your thread. I am afraid it is not MS's fault... I feel like it is something with the microcode / hardware itself. Maybe AMD-V is not as "feature rich" as Intel's VT-x / VT-d? I don't know, just guessing...
I will build my setup using Linux, so no MS will be involved. The problem is, it takes a ton of time... I don't have days to troubleshoot HW / SW issues... I would like to USE my expensive system, because this is why I spent a lot of money...
I had the same problem with my previous computer, which was an 1800X. Exactly the same...
Thanks for sharing, bro. Your information is very important to me because I am also deciding between Intel and AMD.
I read the document; it says AMD is not officially supported.
But I also found the article below. This fella shares his experience of running EVE-NG on a Ryzen 7. He hasn't mentioned which model; I believe it's a Ryzen 1700. However, I believe Ryzen 1700 and Ryzen 3 are the same microarchitecture.
Have you tried ESXi?
I hope you can share the information with me.
Your advice is highly appreciated.
Hyper-V and other hypervisors were not designed for nested use. While Hyper-V will refuse Windows 95 and Windows 98 guests, VirtualBox has no restrictions.
I have a great deal of experience with virtual machines from data centers, where vast numbers of servers handle virtual workloads by the thousands.
I use Hyper-V extensively in my shop as well for development purposes. I also use VirtualBox extensively.
Nested Virtualization has been available since, at least, Windows Server 2016. It works just fine on Intel Core and above. I've even run ESXi nested on top of Server 2016+ using nested virtualization through a technique similar to the following article:
That said, the performance of ESXi was pretty miserable, though I never spent much time trying to tune the environment...
To the OP:
KVM may also be an option, though I haven't researched it thoroughly enough to recommend it...
Remote desktop is an option for Windows 98 and above. Windows 98 calls it terminal services but XP Pro and above call it remote desktop. Home versions do not support remote desktop.
KVM goes back to before remote desktop was mature enough for widespread use.
I have another system, an 1800X running Ubuntu 18.04 LTS with all updates installed, and I use KVM on it.
Installed all software, and... same symptom.
It is not Windows or Linux or VMware or KVM. It must be something with the Ryzens...
Not entirely the same issue, but seems to be kind of similar.
In my case, no matter what the host OS is (Win10 or Ubuntu), and no matter what the hypervisor is (VMware Player or KVM), the symptom is the same. So no Hyper-V in my case. The root cause may be the same... the only common point is the CPU: Ryzen.
The EVE-NG site has now updated its system requirements page:
AMD Ryzen 3900 series has been added to the supported list. Can anyone try out the latest EVE-NG v2.0.3-110 and see whether the QEMU images run without any issue?
I have run into exactly the same issue.
Can't use nested virtualisation.
AMD Ryzen 9 3900X
Gigabyte X570 AORUS
Does anyone know how to resolve this?
Appreciate any advice.
Something in the Cisco image depends on Intel features when it is running as a nested VM.
The solution I found (I use KVM) is to add the following tag to the QEMU options
I use that with my EVE-NG lab on a 3990X, and it has been working great since then. Hopefully this fix is good for you all temporarily, until Cisco fixes their nested VM configuration so that it is not Intel dependent.
(FYI, it's not going to be as fast as hardware virtualization)
First, for the Cisco XRv router (version 6.1.3 or below), just make sure you name your image hda.qcow2. Otherwise EVE-NG will give you this error message: => Unable to access "/dev/hd0: (2)
And for the Cisco XRv 9000 router, make sure your image is named virtioa.qcow2. If you are not sure which version you have, try both names.
And of course, always fix the permissions afterwards => /opt/unetlab/wrappers/unl_wrapper -a fixpermissions
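The naming rule above is easy to get wrong when juggling several images, so here is a small Python sketch that checks a folder listing against it. The mapping only encodes what is stated in this post. The platform keys "xrv" and "xrv9k" and the function name check_disk_name are made up for illustration.

```python
# Disk file names EVE-NG expects, as stated in the post above.
EXPECTED_DISK = {
    "xrv": "hda.qcow2",        # Cisco XRv, version 6.1.3 or below
    "xrv9k": "virtioa.qcow2",  # Cisco XRv 9000
}


def check_disk_name(platform, filenames):
    """Return (ok, expected) for the file names found in one image folder."""
    expected = EXPECTED_DISK[platform]
    return expected in filenames, expected


if __name__ == "__main__":
    # Example: an XRv 9000 image folder that still uses the classic XRv name.
    ok, expected = check_disk_name("xrv9k", ["hda.qcow2"])
    print("ok" if ok else f"rename the disk image to {expected}")
```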
With an AMD Ryzen 7 3700X and an X570 Aorus motherboard, this works with IOS-XR (XRv) 5.3.2, running EVE-NG PRO on ESX 6.5. XRv versions 6.0.1 and above are not working for some reason.
I was not able to create a new "virtioa.qcow2" image myself from the XRv ISO downloaded from Cisco. It starts, but then after 6-7 minutes it says: ERROR : only one CPU detected, the system cannot boot on single CPU platform
For more testing, I bought an Intel CPU and was able to boot all the XRv qcow2 images without any issue, so Intel is best for sure. But performance was poor: Intel 7 has only 8 cores, versus 16 cores with Ryzen 7. So I switched back to AMD and, after some research, discovered that it works with version 5.3.2.
Try VirtualBox 5.2.44.
It should run from Ubuntu/Linux or Win10.