Hi,
I'd like to upgrade my workstation's GPU to a Radeon Pro W6800 but I haven't been able to find any official compatibility information for this card. The most I could find was that it's "supported on Linux" and that there's an entry for the card in the source code of the amdgpu driver.
Thus my question: Does the open-source AMD Linux driver stack support the Radeon Pro W6800? I know that I'll most likely need bleeding-edge software but I'm not afraid to compile the required packages myself.
System specifications:
CPU: Intel Xeon W-3265
Mainboard: Supermicro X11SPA-T
Chassis: Supermicro SC743TQ-1200B-SQ
OS: Debian 11 (Bullseye) 64-bit
Kernel: Linux 5.13.12
Graphics driver: Mesa 21.2.1, DRM 3.41.0, LLVM 12.0.1
OpenCL driver: AMD ROCm 4.3.0 (OpenCL-only installation, without HIP)
If you need any more info, let me know.
Thanks!
Jonathan
I am running W6800 on Ubuntu 20.04.2 (latest kernel) with no issues
Not sure what exactly you are looking for. We do offer Linux drivers for the W6800 at:
https://www.amd.com/de/support/professional-graphics/amd-radeon-pro/amd-radeon-pro-w6000-series/amd-...
But if you are looking for W6800 ROCm support, then this will be supported in ROCm version 4.5 (to be released in October)
Thanks for the super quick reply! ROCm / OpenCL isn't that important for me right now, I can live without it for a few weeks / months.
The most important thing for me is OpenGL and Vulkan support via the open-source (Mesa radeonsi) drivers. Does the W6800 work with upstream amdgpu + Mesa at the moment? Or will it only work via the drivers that you linked?
Yes, using the latest kernel, it should. Which Linux distro?
Debian 11 (Bullseye), although I've manually compiled the latest upstream kernel (5.13.12), as well as the latest development version of Mesa (21.2.1), given that Debian isn't really known for always shipping the latest versions of their software. (I know that I'll most likely also need the latest firmware blobs for the W6800, which shouldn't be a problem either.)
Also, sorry if I'm a little slow; the forum's flood protection forces me to wait between posts.
Should these drivers work for the Pro V620 (same hardware)?
The amdgpu kernel driver supports the V620 (as can be seen in its PCI ID list), so Mesa will be able to work with this card too as it supports the Navi21 chip. However, it might be difficult to get OpenGL running with the V620 since the card doesn't have any video outputs, which means that the OpenGL window system integration (WSI) might act up. Headless Vulkan with RADV should work fine though.
ROCm is also officially supported on the V620 (and W6800) now, as documented here: https://docs.amd.com/bundle/AMD-Radeon-PRO-V620-and-W6800-Support-Guide-v5.3/page/Introduction_to_AM...
So if you just want compute support (HIP/OpenCL), that should not be an issue at all. As always, try to run the latest kernel and Mesa/ROCm.
How did you get your hands on a V620, though? As far as I can tell, AMD doesn't sell them.
Edit: I just saw your other post about the V620. There's some weirdness going on with its video BIOS related to SR-IOV, as outlined in the ROCm documentation I linked above. The card is likely in a "virtualization mode" and has a virtualization-only VBIOS on it. I'm not sure if it's possible to obtain a "graphics VBIOS" for this card. The W6800's VBIOS might potentially work, but it might also just brick the card instead.
Okay, I figured out some more useful things. The ROCm install guide mentions that you have to flash VBIOS version D603GLXE-077 onto the Radeon Pro W6800 in order to use it for virtualization with SR-IOV. The guide also lists an example system configuration with a Pro V620 that also uses VBIOS version D603GLXE-077. If I interpret this correctly, this means that the V620 comes with the virtualization-only VBIOS by default, but it also means that the V620 and W6800 can use the same VBIOS images since one step in the installation guide is to flash the V620's VBIOS onto a W6800 to enable virtualization using the workstation card. So in theory, it should be possible to flash the W6800's VBIOS onto a V620 (since it works the other way around)... At least if the documentation is correct. It's also certainly possible that the graphics VBIOS for the W6800 is different than that for the V620 due to the difference in CU count.
It looks like the W6800's VBIOS is already available on techpowerup: https://www.techpowerup.com/vgabios/236046/amd-radeonprow6800-32768-210422
Before you do this, you should back up your V620 VBIOS (it's not available anywhere on the internet yet), and you should also keep in mind that flashing the W6800 VBIOS might very well brick your card.
Edit: It might also be a good idea to ask the seller if they can provide you with the proper VBIOS.
Edit2: Full dmesg output would also help.
Thanks for trying to help. I tried to get info on the card under Linux using amdvbflash_linux_4.71 from techpowerup. It didn't report finding the card, while lspci clearly shows it is there. So if I can't back up the BIOS and can't flash the recommended BIOS, I guess that either they sent me a bad card or it is unsupported.
The seller says: "Ask AMD support".
amdvbflash only works if the amdgpu driver is unloaded. Have you done that? If not, try blacklisting it in /etc/modprobe.d, then update your initramfs and reboot. Then check which driver is loaded with lspci -nnk (it should not list "Kernel driver in use: amdgpu" for the card anymore). You should also not see amdgpu trying to start in dmesg.
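To make the lspci -nnk check above less error-prone, here is a minimal Python sketch that scans lspci -nnk output and reports which kernel driver, if any, is bound to each AMD device. The function name drivers_in_use and the SAMPLE text are made up for illustration; the sample is hand-written in the usual lspci -nnk layout, not captured from a real W6800 or V620, and its device IDs are placeholders. The only fact relied on is that 1002 is the AMD/ATI PCI vendor ID.

```python
import re

# Hand-written sample in the layout of `lspci -nnk`; device IDs are placeholders.
SAMPLE = (
    "00:1f.3 Audio device [0403]: Intel Corporation Device [8086:ffff]\n"
    "\tKernel driver in use: snd_hda_intel\n"
    "03:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Device [1002:ffff]\n"
    "\tSubsystem: Advanced Micro Devices, Inc. [AMD/ATI] Device [1002:ffff]\n"
    "\tKernel driver in use: amdgpu\n"
    "\tKernel modules: amdgpu\n"
)

# Matches the [vendor:device] pair on a device header line.
ID_PAIR = re.compile(r"\[([0-9a-f]{4}):[0-9a-f]{4}\]", re.IGNORECASE)


def drivers_in_use(lspci_text, vendor_id="1002"):
    """Map PCI slot -> bound kernel driver (None if unbound) for one vendor."""
    result = {}
    slot = None  # slot of the device whose detail lines we are reading
    for line in lspci_text.splitlines():
        if not line.strip():
            continue
        if not line[0].isspace():
            # Unindented line: start of a new device entry.
            match = ID_PAIR.search(line)
            if match and match.group(1).lower() == vendor_id:
                slot = line.split()[0]
                result[slot] = None
            else:
                slot = None
        elif slot is not None:
            # Indented detail line belonging to a device we care about.
            detail = line.strip()
            if detail.startswith("Kernel driver in use:"):
                result[slot] = detail.split(":", 1)[1].strip()
    return result


print(drivers_in_use(SAMPLE))  # prints {'03:00.0': 'amdgpu'}
```

On a real system you would pass in the text produced by lspci -nnk instead of SAMPLE. A value of None for the card's slot means no kernel driver is bound, which is the state you want before running amdvbflash.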
Alternatively, you could try passing amdgpu.runpm=0 on the kernel command-line to prevent amdgpu from putting the card into a low-power state (in which the flash controller is disabled).
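After rebooting, it is worth confirming that the parameter actually reached the kernel. The following is a small Python sketch under simple assumptions: the helper name kernel_param is made up, it splits the command line on whitespace only (so quoted values containing spaces are not handled), and the example command line is invented.

```python
def kernel_param(cmdline, name):
    """Return the value of name=value on a kernel command line, or None if absent.

    A bare flag without '=' yields an empty string. If the parameter
    appears more than once, the last occurrence is returned.
    """
    value = None
    for token in cmdline.split():
        key, sep, val = token.partition("=")
        if key == name:
            value = val if sep else ""
    return value


# Invented example command line, for illustration only.
example = "BOOT_IMAGE=/vmlinuz root=/dev/sda1 ro quiet amdgpu.runpm=0"
print(kernel_param(example, "amdgpu.runpm"))  # prints 0
```

On the actual machine, read the text from /proc/cmdline and pass it in place of the example string. A result of None means the option was not picked up by the bootloader configuration.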
You've made some interesting observations; I hadn't noticed that the BIOS image is the same.
I ran amdvbflash with amdgpu unloaded, but I unloaded it manually with modprobe -r. So in theory, amdgpu could have put the card into low-power mode before that, but it always failed to load with some message, so more likely it never got that far. I also didn't observe any low-power state with this card: it gets pretty hot if not properly cooled. By rough estimate, the card draws around 60-70 W at idle, so clearly the driver couldn't control it.
If it could, it wouldn't fail to load but would create a /dev/dri entry (I guess), which would allow me to load OpenCL or another compute library on top of it, and I wouldn't be asking for help.
It looks like somebody before me has bricked the card and the seller re-sold it to me. I don't want to enter into attempts to repair it, so I am returning it.
In theory, if I had a working one, I could try to copy the BIOS from the working card to this one, voiding the warranty of both in the process. Or I could learn how to convert the ROM file to a bin file and flash it with an EEPROM programmer.
It could have been a very good deal if it had worked (with the advertised FP32 performance), because I wanted to use the brand-new PyTorch support that some AMD cards recently acquired.