

c_zagarskas
Adept II

amdgpu-pro 20.45 amdgpu-dkms fails/error - SOLUTION: for Ubuntu 20.04.2 LTS + instructions

After several unsatisfactory attempts to install amdgpu-pro on Ubuntu 20 LTS, I found no clear instructions, no patches, and (IMHO) no reliable solution.

 

Ergo, here is the solution:

SUMMARY: Roll back the Ubuntu kernel and headers, then install an older version of amdgpu-pro.

  • linux-headers-5.4.0-58-generic headers work
  • amdgpu-pro-20.40-1147286-ubuntu-20.04 works with those headers

DETAILED INSTRUCTIONS:

 

 

# first I will roll back and uninstall everything
# (the two uninstall scripts only exist if an earlier amdgpu / amdgpu-pro install put them there)
amdgpu-uninstall -y
amdgpu-pro-uninstall -y
sudo apt autoremove
sudo apt clean

 

 

Now, remove the old crash log left over from failed installs.

 

 

# remove the old, locked crash log so a fresh one can be written if a new crash happens
# (-f keeps rm quiet if the file does not exist)
sudo rm -f /var/crash/amdgpu-dkms.0.crash

 

 

Prepare to downgrade. I strongly suggest you have physical access to the box, as you will need it to select the proper kernel from the advanced GRUB boot menu.

 

 

# first, I will check which kernel versions are installed and write them down
dpkg --list | grep linux-image | grep ^ii

 

 

In my case I had kernel version 5.8.0-45 (the kernel packages are what I call "headers" throughout this post).

Ergo, I needed to prepare the following statements:

  • sudo apt remove linux-modules-extra-5.8.0-45-generic
  • sudo apt remove linux-modules-5.8.0-45-generic
  • sudo apt remove linux-image-5.8.0-45-generic

Do not run those statements yet; save them, as they will be used later. Replace my version with YOUR version, though I suspect you are on 5.8.0-45 if you have landed on this post...
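For a different version, the three statements can be generated rather than typed by hand. This is a minimal sketch: the function name `kernel_removal_cmds` is made up for illustration, and it only prints the commands (in the modules-extra, modules, image order used in this post) without running anything.

```shell
# Print the three removal commands for one kernel version, in the order
# used in this post: modules-extra, then modules, then image.
# Nothing is executed; the commands are only printed so they can be saved.
kernel_removal_cmds() {
    version="$1"   # e.g. 5.8.0-45 (without the -generic suffix)
    for part in modules-extra modules image; do
        printf 'sudo apt remove linux-%s-%s-generic\n' "$part" "$version"
    done
}
```

For example, `kernel_removal_cmds 5.8.0-45 > remove-old-kernel.txt` would save the statements to a file (the file name is arbitrary) for use after the reboot.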

Now, I will install the OLD headers that I know for sure work, using the following commands, in this order:

 

 

# let's get the headers that I know work with amdgpu-pro 20.40
sudo apt-get install linux-image-5.4.0-58-generic
sudo apt-get install linux-modules-5.4.0-58-generic
sudo apt-get install linux-modules-extra-5.4.0-58-generic

 

 

Simply running "sudo apt install linux-generic" is not enough. Installing the linux-generic headers without the modules and the extras will leave various peripherals, such as keyboards and wifi, not working on the next boot. Ergo, notice how I have selected the image, its modules and its modules-extra for 5.4.0-58. You may need to do some googling here, as I found the packages are not always named the same way...
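Before rebooting, it may be worth confirming that all three packages really are installed for the target version. The sketch below assumes the package naming shown above (linux-image-, linux-modules-, linux-modules-extra-, each ending in -generic); the function name `missing_kernel_pkgs` is hypothetical. It reads installed package names on stdin and prints whichever of the three are absent.

```shell
# Read installed package names on stdin (one per line) and print any of the
# three packages for the given kernel version that are NOT in that list.
# Prints nothing when image, modules and modules-extra are all present.
missing_kernel_pkgs() {
    version="$1"               # e.g. 5.4.0-58
    installed="$(cat)"
    for part in image modules modules-extra; do
        pkg="linux-${part}-${version}-generic"
        # -x matches whole lines only, -F treats the name as a literal string
        printf '%s\n' "$installed" | grep -qxF -- "$pkg" || printf '%s\n' "$pkg"
    done
}
```

One way to feed it is `dpkg --list | awk '/^ii/ {print $2}' | missing_kernel_pkgs 5.4.0-58`; empty output means nothing is missing.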

Next, I did a reboot to try out those headers and make sure all is well.

Note: we cannot delete the old headers while they are in use, so a reboot is required. Also note that having multiple versions of the headers left on the system will cause amdgpu-pro to become "confused" during the install.

 

 

# let's restart the machine
reboot
# => human holds ESC during the BIOS boot
# => human enters the GRUB bootloader -> "Advanced options"
# => human selects the Ubuntu 5.4.0-58 entry (not the recovery mode entry)

 

 

If your keyboard does not work at the login screen, make sure you installed the extra modules, etc. Do NOT run 'sudo apt update' or anything else.

Login. I checked that everything works: I ran some apps, browsed the internet, and tested my system in general. Once I was satisfied, it was time to delete the OLD headers.

This should be done in the following order:  

  1. modules-extra
  2. modules
  3. image

Now, in my case I had 5.8.0-45 headers and installed 5.4.0-58; I am booted into 5.4.0-58 and want to remove 5.8.0-45.

 

 

## First, I want to double check what headers I am running:
uname -r

 

 

My result was "5.4.0-58-generic" - good.
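Since removing the wrong packages is the risky part, this check can be turned into a guard. A minimal sketch, with a made-up function name `running_kernel_is`: it succeeds only when the running release string equals the expected one, so the removal can be chained behind it with `&&`.

```shell
# Succeed only if the running kernel matches the expected release string.
# The second argument defaults to `uname -r`; it is a parameter only so the
# check can be tried out without rebooting.
running_kernel_is() {
    expected="$1"                    # e.g. 5.4.0-58-generic
    actual="${2:-$(uname -r)}"
    if [ "$actual" = "$expected" ]; then
        echo "OK: running $actual"
    else
        echo "STOP: running $actual, expected $expected" >&2
        return 1
    fi
}
```

Usage would look like `running_kernel_is 5.4.0-58-generic && echo "safe to continue"`.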

 

 

# Next I want to confirm what I have installed
dpkg --list | grep linux-image | grep ^ii

 

 

My result was:

  • linux-image-5.8.0-45-generic 5.8.0-45.64 amd64 Signed kernel image generic
  • linux-image-5.4.0-58-generic 5.4.0-58.64 amd64 Signed kernel image generic
  • Note- you may have more to remove... if so, do it slowly, one at a time
  • do NOT delete anything that does not also have a version number

Now it's time to run the statements I prepared earlier

 

 

# I will now remove my old headers in this order, and then reboot
sudo apt remove linux-modules-extra-5.8.0-45-generic
sudo apt remove linux-modules-5.8.0-45-generic
sudo apt remove linux-image-5.8.0-45-generic
reboot

 

 

This time I will not enter into advanced grub bootloader, I will simply allow my system to boot.

Provided we did not FUBAR anything, it's just Ubuntu 20.04.2 LTS with older headers.

On boot I will double check

 

 

# let's make sure I am ready
uname -r
dpkg --list | grep linux-image | grep ^ii

 

 

My results were as follows 

  • 5.4.0-58-generic
  • linux-image-5.4.0-58-generic 5.4.0-58.64 amd64 Signed kernel image generic

I am ready to install the older version of amdgpu-pro, one I know works.

PS: I checked my other systems and decided on the 5.4.0-58 headers and amdgpu-pro 20.40 because I know 100% that this combination works; I could see it running on my other boxes.

That said, you may try other headers and versions of amdgpu-pro, but if you do, be prepared to do this whole process again... (lol, I did, and I ended up with failures, noted at the bottom)

Let's grab amdgpu-pro-20.40-1147286-ubuntu-20.04 from this page

 

 

# I extracted the package here and moved into the directory
cd /home/MY_USERNAME/Desktop/ubuntu-setup/drivers/amdgpu-pro-20.40-1147286-ubuntu-20.04

# and now it's time to install, as follows, for a VEGA56/64
./amdgpu-pro-install -y --opencl=pal,legacy --headless

 

 

Note: you may need other versions. If so, READ the index.html file in the /docs/ subfolder of the extracted driver package carefully to determine what version you need/want. Note that ROCm is NOT included in 20.40.

At this point I had a successful install. No errors, no warnings, installed as expected. Now it's time to test and prepare.

 

 

# Check whether my user account is a member of the "video" group
groups

# if it is not, add myself to the video group with either of these
sudo usermod -a -G video $LOGNAME
sudo usermod -a -G video MY_USERNAME
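The membership check can also be scripted instead of read by eye. A minimal sketch with a hypothetical helper name `in_group`; it matches whole group names only, so a group such as "videos" does not count as "video". Keep in mind that a group added with usermod only shows up in a new login session.

```shell
# Succeed if the given group appears in a space-separated group list.
# The list defaults to the current user's groups as reported by `id -nG`.
in_group() {
    group="$1"
    list="${2:-$(id -nG)}"
    # pad with spaces so only whole names match
    case " $list " in
        *" $group "*) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example: `in_group video && echo "already in video" || echo "not in video yet"`.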

 

 

 

OPTIONAL

Depending on what you are doing with your box, you may want to consider this old Ubuntu 16 patch. I always apply it, though I am not sure it is necessary (I think it does something important for my specific use case, cheers).

Here is a guide for editing in nano: https://linuxize.com/post/how-to-use-nano-text-editor/ 

 

 

# if you do not have nano editor, then install it
sudo apt install nano

# Edit GRUB for AMD GPUs
sudo nano /etc/default/grub

# Set the following line (if a GRUB_CMDLINE_LINUX line already exists, edit it rather than adding a second one):
GRUB_CMDLINE_LINUX="amdgpu.vm_fragment_size=9"

# Now update grub and reboot
sudo update-grub
sudo reboot

# check page size
getconf PAGESIZE
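As far as I know, `getconf PAGESIZE` reports the system memory page size, so it may not show whether the GRUB option took effect. A complementary check is to look for the option on the kernel command line the system actually booted with. The sketch below reads /proc/cmdline by default; the function name `cmdline_value` is made up for illustration.

```shell
# Print the value of a kernel command-line parameter, or fail if it is absent.
# The second argument defaults to /proc/cmdline and exists only so the
# function can be tried against a sample file.
cmdline_value() {
    name="$1"                       # e.g. amdgpu.vm_fragment_size
    file="${2:-/proc/cmdline}"
    # one parameter per line, then split each on "=" and match the name exactly
    tr ' ' '\n' < "$file" | awk -F= -v n="$name" '
        $1 == n { print $2; found = 1 }
        END { exit found ? 0 : 1 }'
}
```

After the reboot, `cmdline_value amdgpu.vm_fragment_size` should print 9 if the GRUB change was picked up.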

 

 

 

CONTINUE 

Either way, you have to reboot.

If everything went well, we are ready to check whether the install was successful. If it did not, then see below for information about the "black screen"

 

 

## to test if OpenCL is even on the system
sudo apt install clinfo
clinfo

 

 

I got the following results, which is what is on my other boxes, and is good

Number of platforms 1
Platform Name AMD Accelerated Parallel Processing
Platform Vendor Advanced Micro Devices, Inc.
Platform Version OpenCL 2.1 AMD-APP (3180.7)
Platform Profile FULL_PROFILE
Platform Extensions cl_khr_icd cl_amd_event_callback cl_amd_offline_devices
Platform Host timer resolution 1ns
Platform Extensions function suffix AMD

Platform Name AMD Accelerated Parallel Processing
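For a quick pass/fail instead of reading the whole dump, the platform count can be pulled out of the clinfo output. This is a sketch that assumes the "Number of platforms" line looks like the one above; the function name `opencl_platform_count` is hypothetical.

```shell
# Read clinfo output on stdin and print the platform count.
# Fails if the "Number of platforms" line is missing or the count is zero.
opencl_platform_count() {
    awk '/^Number of platforms/ { n = $NF; seen = 1 }
         END {
             if (!seen) exit 1
             print n
             exit (n > 0 ? 0 : 1)
         }'
}
```

Usage would be `clinfo | opencl_platform_count && echo "OpenCL platform found"`.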

Now I want to check amdgpu-pro

 

 

# praytell sir, are we finally good?
apt show amdgpu-pro

 

 

I got the following results, which is what is on my other boxes, and is good

Package: amdgpu-pro
Version: 20.40-1147286
Priority: optional
Section: metapackages
Maintainer: Advanced Micro Devices (AMD) <slava.grigorev@amd.com>
Installed-Size: 17.4 kB
Depends: amdgpu (= 20.40-1147286), amdgpu-pro-core (= 20.40-1147286), libgl1-amdgpu-pro-glx (= 20.40-1147286), libegl1-amdgpu-pro (= 20.40-1147286), libgles2-amdgpu-pro (= 20.40-1147286), libglapi1-amdgpu-pro (= 20.40-1147286), libgl1-amdgpu-pro-ext (= 20.40-1147286), libgl1-amdgpu-pro-dri (= 20.40-1147286), libgl1-amdgpu-pro-appprofiles (= 20.40-1147286)
Download-Size: 5,332 B
APT-Sources: file:/var/opt/amdgpu-pro-local ./ Packages
Description: Meta package to install amdgpu Pro components.

### DONE ###

Now I can finally use amdgpu-pro

 

PROBLEMS TO CONSIDER:

I have tried all of the following headers, none of which worked with amdgpu-pro

 

 

# did not work - uninstalled
sudo apt remove linux-modules-extra-5.8.0-45-generic
sudo apt remove linux-modules-5.8.0-45-generic
sudo apt remove linux-image-5.8.0-45-generic

# did not work - uninstalled
sudo apt remove linux-modules-extra-5.8.0-44-generic
sudo apt remove linux-modules-5.8.0-44-generic
sudo apt remove linux-image-5.8.0-44-generic

# did not work - uninstalled
sudo apt remove linux-modules-extra-5.4.0-67-generic
sudo apt remove linux-modules-5.4.0-67-generic
sudo apt remove linux-image-5.4.0-67-generic

# did not work - uninstalled
sudo apt remove linux-modules-extra-5.4.0-62-generic
sudo apt remove linux-modules-5.4.0-62-generic
sudo apt remove linux-image-5.4.0-62-generic

 

 

I also tried the following amdgpu-pro drivers in various combinations, and none of them worked:

  • amdgpu-pro-20.45-1188099-ubuntu-20.04
  • amdgpu-pro-20.20-1098277-ubuntu-20.04
  • amdgpu-pro-17.40-492261

Important considerations:

*cheers 

 

1 Reply
ErnestoAntonio
Journeyman III

Hi, I have been having a hard time with this. I installed a new card (VisionTek Radeon 6350 DMS59 6350DMS1GB2) on my Dell Optiplex 9020. Initially running Ubuntu 16.04, I then upgraded to 18.04 and finally to 20.04, hoping the OS would identify it.

Since that didn't do anything, I started troubleshooting. Initially the card was presenting "display UNCLAIMED", and I was able to clear that. I then tried several things, including this step-by-step guide, all of it. But after installing amdgpu, you say:

"At this point I had a successful install. No errors, no warnings, installed as expected - now its time to test and prepare."

  but I am getting an error at the end: WARNING: amdgpu dkms failed for running kernel.

I am not a Linux expert, I apologize if I have not provided all the needed information on this first post. 

One thing I noticed, and I am not sure whether the "driver=radeon" has anything to do with all this:

$ dpkg --list | grep linux-image | grep ^ii
ii linux-image-5.4.0-58-generic 5.4.0-58.64 amd64 Signed kernel image generic

$ sudo lshw -C display
*-display
description: VGA compatible controller
product: Cedar [Radeon HD 5000/6000/7350/8350 Series]
vendor: Advanced Micro Devices, Inc. [AMD/ATI]
physical id: 0
bus info: pci@0000:01:00.0
version: 00
width: 64 bits
clock: 33MHz
capabilities: pm pciexpress msi vga_controller bus_master cap_list rom
configuration: driver=radeon latency=0
resources: irq:32 memory:e0000000-efffffff memory:f7e20000-f7e3ffff ioport:e000(size=256) memory:c0000-dffff
