Drivers & Software

tyalassio
Adept I

MAJOR issue with amdgpu-dkms...

Hi guys,

I have been a professional IT staff member for over 20 years. Please note that, as of yesterday, the amdgpu-dkms package for Ubuntu 22.04, 23.04 and 23.10 can no longer be compiled. Furthermore, on 23.04 it completely broke the dpkg system, and I had to remove the entry manually after it failed to compile. The problem started yesterday right after a system update. I am not sure exactly what was updated, but it was NOT a kernel update; it appeared to be a compiler update or an updated package from AMD.

The failure to compile leaves the system completely broken, with a black screen, and no packages can be installed or removed until dpkg is fixed manually.

Thank you,
Best Regards,
Simone

14 Replies
tyalassio
Adept I

I managed to recover the log of the compiler error during DKMS build (it's the same on all 3 versions of Ubuntu I tried):

Building initial module for 6.2.0-36-generic
ERROR: Cannot create report: [Errno 17] File exists: '/var/crash/amdgpu-dkms.0.crash'
Error! Bad return status for module build on kernel: 6.2.0-36-generic (x86_64)
Consult /var/lib/dkms/amdgpu/6.2.4-1646729.22.04/build/make.log for more information.
dpkg: error processing package amdgpu-dkms (--configure):
 installed amdgpu-dkms package post-installation script subprocess returned error exit status 10
Errors were encountered while processing:
 amdgpu-dkms
E: Sub-process /usr/bin/dpkg returned an error code (1)

 

And the error:

/var/lib/dkms/amdgpu/6.2.4-1646729.22.04/build/amd/amdgpu/../display/amdgpu_dm/>
<pu_dm/amdgpu_dm.c:3301:25: error: implicit declaration of function ‘drm_dp_mst>
 3301 |                         drm_dp_mst_hpd_irq(

 

  CC [M]  /var/lib/dkms/amdgpu/6.2.4-1646729.22.04/build/amd/amdgpu/../display/>
  CC [M]  /var/lib/dkms/amdgpu/6.2.4-1646729.22.04/build/amd/amdgpu/../display/>
cc1: some warnings being treated as errors

 

I kept the log file mentioned above; I can upload it if needed.

I'm having the same issue. In the crash report I find this:

ProblemType: Package
DKMSBuildLog:
DKMS make.log for amdgpu-6.2.4-1646729.22.04 for kernel 6.2.0-36-generic (x86_64)
Tue Nov 7 09:57:21 PM EST 2023
make: Entering directory '/usr/src/linux-headers-6.2.0-36-generic'
warning: the compiler differs from the one used to build the kernel
The kernel was built by: x86_64-linux-gnu-gcc-11 (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
You are using: gcc-11 (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0

.....

cc1: some warnings being treated as errors
make[2]: *** [scripts/Makefile.build:260: /var/lib/dkms/amdgpu/6.2.4-1646729.22.04/build/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.o] Error 1
make[2]: *** Waiting for unfinished jobs....
make[1]: *** [scripts/Makefile.build:512: /var/lib/dkms/amdgpu/6.2.4-1646729.22.04/build/amd/amdgpu] Error 2
make: *** [Makefile:2026: /var/lib/dkms/amdgpu/6.2.4-1646729.22.04/build] Error 2
make: Leaving directory '/usr/src/linux-headers-6.2.0-36-generic'
DKMSKernelVersion: 6.2.0-36-generic
Date: Tue Nov 7 21:58:05 2023
DuplicateSignature: dkms:amdgpu-dkms:1:6.2.4.50700-1646729.22.04:/var/lib/dkms/amdgpu/6.2.4-1646729.22.04/build/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:3301:25: error: implicit declaration of function ‘drm_dp_mst_hpd_irq’; did you mean ‘drm_dp_mst_dpcd_write’? [-Werror=implicit-function-declaration]
Package: amdgpu-dkms 1:6.2.4.50700-1646729.22.04
PackageVersion: 1:6.2.4.50700-1646729.22.04
SourcePackage: amdgpu-dkms
Title: amdgpu-dkms 1:6.2.4.50700-1646729.22.04: amdgpu kernel module failed to build
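
For anyone digging through a long make.log like the one above, here is a minimal Python sketch that pulls out only the compiler errors. It is not part of DKMS or the AMD tooling; the names `Diagnostic` and `find_errors` are made up for illustration, and it assumes GCC's usual `file:line:col: error: message` layout.

```python
import re
from typing import NamedTuple


class Diagnostic(NamedTuple):
    """One compiler error taken from a build log (hypothetical helper type)."""
    path: str
    line: int
    column: int
    message: str


# GCC prints errors as "<file>:<line>:<col>: error: <message>".
_ERROR_RE = re.compile(
    r"(?P<path>\S+?):(?P<line>\d+):(?P<col>\d+): error: (?P<msg>.*)"
)


def find_errors(log_text):
    """Return every GCC-style error found in the text of a make.log."""
    errors = []
    for raw in log_text.splitlines():
        match = _ERROR_RE.search(raw)
        if match:
            errors.append(
                Diagnostic(
                    match["path"],
                    int(match["line"]),
                    int(match["col"]),
                    match["msg"].strip(),
                )
            )
    return errors


if __name__ == "__main__":
    import sys

    # Usage: python find_errors.py /path/to/make.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        for err in find_errors(handle.read()):
            print(f"{err.path}:{err.line}:{err.column}: {err.message}")
```

Lines such as "cc1: some warnings being treated as errors" and the "make[2]: ***" summaries are skipped on purpose, since they only repeat that the build failed.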

Uninstalling it completely broke my computer. I can no longer run graphics-related programs, or even open Steam.

0 Likes

Update:
I decided to look for a previous release to install. It doesn't work perfectly, as there are some bugs, but it is manageable and runs reasonably smoothly.
Below is the url: https://www.amd.com/en/support/linux-drivers

Specifically, I installed Radeon Software for Linux version 23.20.00.48 for Ubuntu 22.04.3 HWE with ROCm 5.7.
For some reason that fixed some of the issues. At least it compiles now.

0 Likes

Hi Dom, it seems something was updated in the AMD repo. I failed to notice exactly what at the time; it could well have been the amdgpu-dkms package itself (it was a single package, or no more than two, because the update list was very small). I'll hold on until someone can check. For now I can run on the NVIDIA video card instead (my laptop has dual graphics); fortunately, I can enable discrete graphics only in the BIOS.

0 Likes
tyalassio
Adept I

I am glad to say that the problem has been solved: the normal driver compiles again and is as stable as ever.

Thank you,
Cheers,
Simone

0 Likes

please for the love of whatever god you praise POST THE SOLUTION THAT RESOLVED YOUR ISSUE.

Hi tyalassio,

Your post seems relevant to me, as I get the same error message on Ubuntu 23.10 when compiling the DKMS module, so it never installs (it also removes amdgpu from the default modprobe configuration, and I have to put it back by hand every time). Are you saying that, at least on Ubuntu 23.04, the DKMS module compiled without errors? (Maybe you used some tricks you could share.)

p.s. I really need dkms as I want to install tensorflow/pytorch containers.

0 Likes

In case you're still having issues, I posted a reply in another thread here: https://community.amd.com/t5/drivers-software/amdgpu-install-build-error-on-on-kernel-6-2-0-25-gener...

To summarize: It looks like there's a newer version of amdgpu-install in AMD's ubuntu repo here https://repo.radeon.com/amdgpu-install/latest/ubuntu/jammy/

On the support page for my GPU (RX 7900 XT) it gives a download for amdgpu-install v5.7 while in the repo there's a v6.0.

0 Likes
mzalfres
Journeyman III

I have been looking into the DKMS log files, and so far the errors look similar to those I got from the legacy NVIDIA drivers, which I recently ported to the latest Ubuntu 23.10 kernel for my daughter's laptop. The signatures of at least several kernel functions used by the driver had changed. The only way to make it compile is a source code fix. I'm going to try it once I find the source packages and some time.

0 Likes
tyalassio
Adept I

Here we go again.... Ubuntu 22.04, just received a Kernel upgrade, from the normal channel (not dev channel):

simone-lussardi@simonelussardi-Lenovo-ThinkBook-16p-Gen-2:~$ uname -r
6.5.0-14-generic


... amdgpu-dkms failed to compile. Guys, I know it is difficult to keep up with kernels, but you did state that Ubuntu 22.04 is "compatible and must keep updated in order to work well" with your driver. I am left again with a fully broken system and a ton of work to do.

Cheers,
Best Regards,
Simone
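
After a kernel upgrade like the one above, it can help to see which installed kernels still have a DKMS-built amdgpu module, so you know which one to boot from the GRUB menu. The sketch below is only an illustration: the function name `dkms_module_status` is made up, and it assumes the Debian/Ubuntu layout where DKMS installs modules under /lib/modules/<release>/updates/dkms.

```python
from pathlib import Path


def dkms_module_status(module="amdgpu", modules_root="/lib/modules"):
    """Map each installed kernel release to True if a DKMS-built module exists.

    Assumes DKMS installs into <modules_root>/<release>/updates/dkms,
    which is the usual location on Debian/Ubuntu.
    """
    status = {}
    root = Path(modules_root)
    if not root.is_dir():
        return status
    for kernel_dir in sorted(root.iterdir()):
        if not kernel_dir.is_dir():
            continue
        dkms_dir = kernel_dir / "updates" / "dkms"
        # Modules may be compressed (.ko.zst, .ko.xz), hence the wildcard.
        built = dkms_dir.is_dir() and any(dkms_dir.glob(f"{module}.ko*"))
        status[kernel_dir.name] = built
    return status


if __name__ == "__main__":
    for release, built in dkms_module_status().items():
        print(f"{release}: {'built' if built else 'missing'}")
```

A kernel reported as "missing" will fall back to the in-tree amdgpu driver if it is not blacklisted, so this only tells you where the DKMS build succeeded, not whether graphics will work.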

tyalassio
Adept I

Guys, to be fair to AMD, another computer using exactly the same repos as mine did not receive the 6.5.0-14 upgrade this morning. I wonder whether Ubuntu released it by accident and then retracted it...?

https://askubuntu.com/questions/1499659/ubuntu-22-04-3-kernel-6-5-0-14-generic-and-rtl8111-8168-8411...

Other people received the upgrade, so I guess it wasn't a corrupted mirror or something like that.

0 Likes

I am having the same issue with 6.5.0-14-generic. Did you manage to fix this?

0 Likes

Sorry for the late reply. I haven't done anything about it, because the problem is in the amdgpu-dkms package, which is in the AMD repo. Hopefully, with the release of 24.04 LTS coming soon, they will have to fix it, because that version of Ubuntu will almost certainly ship a kernel newer than 6.5....

However, a release was published on February 16th. I am away right now and have no system I can afford to break to test it, but you could give it a go if you want (reinstall the driver and everything) and see whether it compiles now!

Cheers,
Simone

0 Likes

Hi, in case it helps: I am trying to install it with kernel version 6.5.0-15 and I get the same problem.

Loading new amdgpu-6.2.4-1652687.22.04 DKMS files...
Building for 6.5.0-15-generic
Building for architecture x86_64
Building initial module for 6.5.0-15-generic
Error! Bad return status for module build on kernel: 6.5.0-15-generic (x86_64)
Consult /var/lib/dkms/amdgpu/6.2.4-1652687.22.04/build/make.log for more information.
dpkg: error processing package amdgpu-dkms (--configure):
installed amdgpu-dkms package post-installation script subprocess returned error exit status 10
Errors were encountered while processing:
amdgpu-dkms
E: Sub-process /usr/bin/dpkg returned an error code (1)

0 Likes