aalmutairi
Journeyman III

MPI detects half of the physical cores on Ryzen Threadripper 3990x

I am using an AMD Ryzen Threadripper 3990X in my PC running Windows 10 Pro (v.10.0.19041.329). I am interested in using some simulation software that relies on MPI, so I resorted to WSL2 with Ubuntu Linux (v.20.04). There I noticed the following when I ran the command lscpu:

Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
Address sizes: 48 bits physical, 48 bits virtual
CPU(s): 64
On-line CPU(s) list: 0-63
Thread(s) per core: 2
Core(s) per socket: 32
Socket(s): 1
Vendor ID: AuthenticAMD
CPU family: 23
Model: 49
Model name: AMD Ryzen Threadripper 3990X 64-Core Processor
.
.
.

As you can see, it only detects 32 physical cores (64 logical ones). I initially thought it was just a WSL2 issue, so I tried Hyper-V, but to no avail. I then assumed it was a virtualization issue, since Windows correctly detects 64 physical cores (128 logical ones). However, I also tried running MPI directly on Windows (knowing that it is not the best option), and the result was the same: even on native Windows, MPI (whether provided by Open MPI or MPICH) always detects 32 cores. I asked on the Microsoft WSL2 GitHub page, but no one seems to know how to fix it. I contacted AMD support and they recommended asking here. Has anyone experienced something similar? Are there any fixes I can test?
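
For anyone comparing their own output: the physical core count implied by lscpu is sockets times cores per socket, and the logical count is that times threads per core. Below is a minimal Python sketch of that arithmetic. It assumes the lscpu text has been captured as a string; the sample is copied from the output above, and the helper names (parse_lscpu, physical_cores, logical_cpus) are hypothetical, not part of lscpu or any MPI package.

```python
# Sketch: derive physical and logical core counts from lscpu-style output.
# The sample text is copied from the post above. The helper names are
# hypothetical and exist only for this illustration.

LSCPU_SAMPLE = """\
CPU(s): 64
On-line CPU(s) list: 0-63
Thread(s) per core: 2
Core(s) per socket: 32
Socket(s): 1
"""


def parse_lscpu(text):
    """Turn 'Key: value' lines into a dict of stripped strings."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines that have no colon
            fields[key.strip()] = value.strip()
    return fields


def physical_cores(fields):
    """Physical cores = sockets x cores per socket."""
    return int(fields["Socket(s)"]) * int(fields["Core(s) per socket"])


def logical_cpus(fields):
    """Logical CPUs = physical cores x threads per core."""
    return physical_cores(fields) * int(fields["Thread(s) per core"])


fields = parse_lscpu(LSCPU_SAMPLE)
print("physical cores:", physical_cores(fields))
print("logical CPUs:", logical_cpus(fields))
```

For the output posted above this gives 32 physical cores and 64 logical CPUs, which is half of what the 3990X has (64 cores, 128 threads).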

Thank you

0 Likes
51 Replies
misterj
Big Boss

aalmutairi, I know little about Linux but will see if I can help with W10 and Hyper-V. Please correct the formatting in your initial post that is causing the slide bar. Please post a screenshot of Ryzen Master (RM) - simply drag-n-drop the image into your reply. Also please post your HW parts. Here is my Hyper-V, showing max cores:

pastedImage_1.jpg

Thanks and enjoy, John.

0 Likes
misterj
Big Boss

aalmutairi, please post a screenshot of Ryzen Master (RM) - simply drag-n-drop the image into your reply. Also post a screenshot of the W10 Task Manager like this one:

pastedImage_1.jpg

This one is from an AnandTech review of the 3990X.  I have a 3970X which only shows one Socket - 32 cores.  Thanks and enjoy, John.

0 Likes
aalmutairi
Journeyman III

Dear misterj, unfortunately I wasn't able to get Ryzen Master to work even after VBS was disabled. However, here is an image of the Task Manager from when I started an MPI parallel job on my virtual machine.

pastedImage_1.png

It still says one socket.

Here I used 128 virtual cores.

0 Likes

Thanks much, aalmutairi.  Here is a screenshot of my BIOS for disabling SVM (Secure Virtual Machine):

pastedImage_1.bmp

Your screenshot says virtualization is Enabled - maybe this was for your MPI test. Are you still running W10 v.10.0.19041.329? I would like to see the actual error message you are getting from the RM installation. Also, please get SVM disabled, hopefully get RM installed, and run the R20 version of Cinebench on all cores. If you do get RM installed, post a screenshot of RM while running Cinebench (CB) - no screenshot is needed from CB, but please post another of the Task Manager as above. It does look like your system is running all cores, but with half at full blast and the other 64 barely loaded. I did a little research on MPI and found many user controls for managing cores/threads/etc. Are these all set properly? Does MPI have a forum that may help us? I've seen another screenshot like yours, so MS may have changed the number of sockets, since they know there is really just one. Your SS also shows the L3 cache size and the other does not. It may have been from some early release of W10. Thanks much and enjoy, John.

0 Likes

Yes, I am still using the version of W10 Pro mentioned above. I tried a few things to fix the RM issue, but it turned out I didn't need to worry about the BIOS. Disabling Hyper-V and WSL and their dependents did it. I ran CB and the results are as follows:

 pastedImage_1.png

pastedImage_2.png

As you can see, I just captured the minimal RM view because I was not sure exactly what you needed. Threaded jobs were never the issue; it is an MPI issue. Under virtualization, not all cores are shared with the VM, and this is what causes MPI to detect half of the cores. When I used the Microsoft version it failed too, but I am not sure why (then again, MPI was not designed for Windows).

0 Likes

DISCLAIMER: Doing this fixed RM but broke W10. Now Ethernet fails if I try to enable virtualization.

0 Likes

Thanks, aalmutairi. Sorry about your Ethernet. If you want me to help with it, I need to know about the system and the exact error message(s). Disabling SVM killed your Ethernet when enabled? I use Ethernet with and without SVM. No need now, but I would like to look at a complete screenshot of RM - lots of information to consider. Since CB works fine (please post your score), I suspect you have an application problem. Please let me know about a forum. Enjoy, John.

0 Likes

Disabling Hyper-V and WSL2 killed my Ethernet. I am not sure why.

0 Likes

aalmutairi, please tell me how you disabled Hyper-V and WSL2. Enjoy, John.

0 Likes

Through "Turn Windows features on or off". No need to worry about it; I am asking in the Microsoft community. But I am still interested in a fix for my MPI issue.

0 Likes

aalmutairi, I Enable Hyper-V during my W10 installation with DISM and never disable it.  I simply Enable/Disable SVM in BIOS as I need RM or VM.  Thanks and enjoy, John.

0 Likes

Going back to the main issue: you mentioned that you use Hyper-V. What operating system are you running on the VM? Could you run an MPI test there, if you don't mind?

0 Likes

aalmutairi,  have you thought about MS-MPI?  Enjoy, John.

EDIT:  Have you set the Hyper-V NUMA settings?

pastedImage_1.jpg

0 Likes

I can't use MS-MPI since the software I need only runs on Linux. I played with the NUMA settings but it changed nothing. What operating system are you running on Hyper-V?

0 Likes

aalmutairi, I need to look into NUMA more. I am starting to suspect NUMA mode is not supported under W10 - maybe only Server. I am running Windows 10 Professional x64 as a guest. Enjoy, John.

0 Likes

aalmutairi, Hyper-V requires Dynamic Memory disabled to use NUMA.  See here:

pastedImage_1.jpg

Notice 8 NUMA nodes.  May well not help, but worth a try.  I still would like to know your Cinebench score.  Thanks and enjoy, John.

0 Likes

Hi John, sorry for the late response. I am planning to reset my W10 today to fix the Ethernet problem (so far I have not found any solution for that issue). I guess your NUMA fix can be tested easily from your side, if you do not mind. If you use a single NUMA node and log in to the virtual machine, do you lose any cores? If so, then it is a NUMA problem; otherwise, we need to figure out the actual source of the issue.

0 Likes

Great idea, aalmutairi, I will give it a try and report back.  Enjoy, John.

0 Likes

aalmutairi, remember I only have a 32 core 3970X.  Results:

pastedImage_1.jpg

My CB score was 15000 on VM and 17000 on real system.  Please run this same test on your system in a W10 Guest.  Thanks and enjoy, John.

EDIT: I ran CB in UMA and got 14250 score.

EDIT: Please see this about Linux and Ryzen.

0 Likes

misterj, sorry for the delay. I couldn't bring myself to go through the reset and reinstall everything again. So I looked into the NUMA test, and I also followed the instructions on the Ryzen 3600 page you sent. Both failed. It is definitely a Windows issue, and they have been aware of it for a couple of months, but there is no solution so far.

0 Likes

aalmutairi, it has gotten to the point where I really do not know what you are doing. You have no need for CPU-Z or Paint 3D, so skip that. If you are running Hyper-V and WSL and Virtual Machine Platform, then that may be your problem. I just turned on WSL and Virtual Machine Platform, installed Ubuntu 20.04 and played for a minute - knowing nothing about Linux. Please turn Hyper-V OFF, turn Virtual Machine Platform ON along with WSL and try your test. Have you tried running CB under W10 under Hyper-V to see if it uses all 128 threads? If not, please do.

W10 is far, far from fragile!  This may be a WSL problem but probably not a W10 problem.  URL is where I got my answer about NUMA on Hyper-V, so open a thread in this same forum and see what Microsoft says.  I will post the URL in my next post to prevent a long delay.  Please post a screenshot of RM as I asked before.  Thanks and enjoy, John.

0 Likes

The reason I switched to Hyper-V is that WSL and VMP were giving me the same issue. I thought Hyper-V would be better, but it showed exactly the same issue. As for CB and the other test, the results will be the same, because I tested threaded jobs on the virtual machine (which is what CB does) and they still ran on only half of the logical cores. I am still not sure what RM brings to the table; the same info can be accessed through ASUS tools too.
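
One way to check what a process inside the guest can actually use, independent of MPI or Cinebench, is to compare the logical CPU count reported by the OS with the CPUs the process is allowed to run on. The following is a minimal sketch, assuming Python 3 is installed in the Ubuntu guest. os.sched_getaffinity is only available on Linux, so the code falls back to the reported count elsewhere. The function name visible_cpus is hypothetical.

```python
# Sketch: report how many logical CPUs this process can see and use.
# Assumes Python 3. visible_cpus() is a hypothetical helper for illustration.
import os


def visible_cpus():
    """Return (reported, usable) logical CPU counts for this process."""
    # Count of logical CPUs the OS reports (may be None on odd platforms).
    reported = os.cpu_count() or 1
    if hasattr(os, "sched_getaffinity"):
        # Linux only: CPUs this process is actually allowed to run on.
        usable = len(os.sched_getaffinity(0))
    else:
        # No affinity API on this platform; fall back to the reported count.
        usable = reported
    return reported, usable


reported, usable = visible_cpus()
print(f"logical CPUs reported by the OS: {reported}")
print(f"logical CPUs usable by this process: {usable}")
```

If both numbers are already half of what the host has, the cores are being lost before MPI is involved, which would match the lscpu output in the first post.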

0 Likes

CPU-Z can help me identify problems

0 Likes

aalmutairi, I will check lscpu when I have a moment.

I think I made it very clear that you should not run any ASUS applications.  They add "enhancements" which not only do not help, but can cause problems.  CPU-Z is fine when obtained from the CPU-Z web site but I would never run it from ASUS!

I cannot help you without your cooperation!  I need you to run the Hyper-V and CB test and I need a complete listing of ALL your parts.  I would like to see your CB scores.

I strongly recommend you do a Clear CMOS after removing all ASUS applications and before doing any more testing. Thanks and enjoy, John.

pastedImage_1.jpg

0 Likes

By the way, since you are testing WSL with Ubuntu, could you please check your CPU info with the lscpu command?

0 Likes

Here:

pastedImage_1.jpg

EDIT: Please post your complete equivalent information.  Comparing mine to yours in your first post says the problem is right there.  No use running any test except lscpu.  Fix that and probably fix all.

EDIT: I just DLed and installed Debian and ran it in WSL.  Please go here and try a few more Linux versions - just need to do lscpu.

0 Likes

aalmutairi, please try a few more versions of Linux.  Search for "Manually download Windows Subsystem for Linux distro packages".  There are 10 to choose from.  Thanks and enjoy, John.

0 Likes

misterj, I am working on the W10 VM test. Meanwhile, can you post your lscpu output in the following discussion:

WSL 2 uses half the number of cores on AMD Threadripper 3990X · Issue #5423 · microsoft/WSL · GitHub  

0 Likes

pastedImage_1.png

The CB result on W10 is 23k, while on the virtual W10 it is 14k. What do you think?
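
To put the scores in proportion, here is a small Python sketch that computes the guest score as a fraction of the native score, both for the 3990X numbers above (about 23k native, 14k in the guest) and for the 3970X numbers John posted earlier (17000 native, 15000 in the VM). The 23k and 14k figures are rounded as posted, so the result is only approximate, and the function name guest_fraction is hypothetical.

```python
# Sketch: compare Cinebench scores in a Hyper-V guest against bare metal.
# Scores are taken from the posts in this thread; 23k and 14k are rounded.


def guest_fraction(native_score, guest_score):
    """Return the guest score as a fraction of the native score."""
    return guest_score / native_score


# (native score, guest score) per CPU, as reported in the thread
scores = {
    "3990X": (23000, 14000),
    "3970X": (17000, 15000),
}

for cpu, (native, guest) in scores.items():
    print(f"{cpu}: guest reaches {guest_fraction(native, guest):.0%} of native")
```

The 3970X guest keeps roughly 88% of the native score, while the 3990X guest keeps only about 61%. That gap is consistent with the 3990X guest not getting all of its cores, though it is not proof on its own.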

0 Likes

The previous Task Manager image is from when I ran CB on the virtual W10.

0 Likes

aalmutairi wrote:

previous task manager image is when I ran CB on virtual W10. 

It appears that Windows 10 recognizes your processor properly.

https://www.amd.com/en/products/cpu/amd-ryzen-threadripper-3990x 

0 Likes

I meant that it is the Task Manager on my host machine while CB was running on the virtual one. The Task Manager inside the virtual machine only detects half of the cores.

0 Likes

Try running Windows 10 on the machine with no hypervisor, i.e. bare metal.

0 Likes

What do you mean? Just run W10 normally?

0 Likes

aalmutairi wrote:

What do you mean? just run W10 normally?

yes

0 Likes

Yes, this is my issue. Everything checks out until I use WSL or Hyper-V, and then it breaks. This is what I am super confused about.

0 Likes

aalmutairi wrote:

misterj, I am working on the W10 VM test, meanwhile can you post your lscpu in the following discussion:

WSL 2 uses half the number of cores on AMD Threadripper 3990X · Issue #5423 · microsoft/WSL · GitHub  

try using CentOS or Ubuntu

0 Likes

aalmutairi, I will read the forum you linked to when I have some free time, but I do not intend to post on yet another forum. It does look like you are posting in the correct place, because this is certainly not a Threadripper problem. It is probably a WSL/W10/Linux problem - probably the special Linux that runs under WSL. If you want a copy of the lscpu image I posted, simply right-click the image above and then click Copy Image. This will place the image on your Clipboard and you can paste it into the GitHub forum. Enjoy, John.

EDIT: Post a link to this forum thread in the GitHub forum.

0 Likes

For everyone who has issues: check for a BIOS update for the motherboard, which can help the UEFI report the CPU capabilities correctly. BIOS bugs are abundant, so I make it a shop policy to update them all.

0 Likes