Drivers & Software

drmike
Journeyman III

Linux boot - amdgpu seg fault

I am reasonably confident the following log was caused by an NVMe disk fault, but I'd like to find a way to prove it:

Oct  9 21:26:00 Relativity gnome-session[1962]: amdgpu_device_initialize: amdgpu_get_auth (1) failed (-1)
Oct  9 21:26:00 Relativity kernel: [    8.429119] gnome-session-c[1985]: segfault at 10 ip 00007efbef1e6530 sp 00007ffbffff9f70 error 4 in libdrm_amdgpu.so.1.0.0[7efbef1de000+c000]
Oct  9 21:26:00 Relativity kernel: [    8.429122] Code: ba 20 00 00 00 be 00 00 00 00 48 89 c7 e8 58 a5 ff ff 48 8b 45 b8 48 89 45 d0 8b 45 c0 89 45 d8 8b 45 c4 89 45 dc 48 8b 45 c8 <8b> 40 10 48 8d 55 d0 b9 20 00 00 00 be 05 00 00 00 89 c7 e8 48 a7
Oct  9 21:26:00 Relativity gnome-shell[1997]: Unable to initialize Clutter: Unable to initialize the Clutter backend: no available drivers found.
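The kernel segfault line above already carries enough to locate the faulting instruction inside the library mapping. Here is a minimal Python sketch that parses such a line. The function name `decode_segfault` is made up for illustration, the regular expression assumes the x86 message layout shown in this log, and the error-code bits are read per the usual x86 page-fault convention (bit 0 = page present, bit 1 = write, bit 2 = user mode). The offset is relative to the start of the mapping the kernel reported, which is not necessarily byte 0 of the file on disk.

```python
import re

# Assumed layout of the x86 kernel message, as seen in the log above:
# "segfault at <addr> ip <ip> sp <sp> error <code> in <object>[<base>+<size>]"
SEGFAULT_RE = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) "
    r"sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+) "
    r"in (?P<obj>[^\[\s]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_segfault(line):
    """Parse one kernel segfault line into a dict of decoded fields."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        raise ValueError("no segfault record found in line")
    ip = int(m.group("ip"), 16)
    base = int(m.group("base"), 16)
    size = int(m.group("size"), 16)
    err = int(m.group("err"), 16)
    offset = ip - base
    return {
        "object": m.group("obj"),
        "fault_addr": int(m.group("addr"), 16),
        # Offset of the faulting instruction from the start of the mapping.
        "offset_in_mapping": offset,
        "inside_mapping": 0 <= offset < size,
        # x86 page-fault error code bits.
        "page_present": bool(err & 1),
        "write_access": bool(err & 2),
        "user_mode": bool(err & 4),
    }


LOG_LINE = (
    "gnome-session-c[1985]: segfault at 10 ip 00007efbef1e6530 "
    "sp 00007ffbffff9f70 error 4 in "
    "libdrm_amdgpu.so.1.0.0[7efbef1de000+c000]"
)

if __name__ == "__main__":
    for key, value in decode_segfault(LOG_LINE).items():
        shown = hex(value) if isinstance(value, int) and not isinstance(value, bool) else value
        print(f"{key}: {shown}")
```

For the line in this log, the instruction pointer sits 0x8530 bytes into the mapping, and the faulting access is a user-mode read of address 0x10. That narrows down which part of the library to inspect first when comparing it against a known-good copy.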

Where can I find (or recreate) the file libdrm_amdgpu.so.1.0.0 so I can compare it with what is now on my drive? This would help determine whether it is a bit, byte, line, or block error.
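Once a reference copy is available (for example, one extracted from the .deb in the driver package that ships this library), the comparison itself can be scripted. The Python sketch below is only an illustration. The names `diff_runs` and `classify` are made up, the 4096-byte block size is an assumption about the drive's logical block size, and the "bit / byte / multi-byte / block" thresholds are one possible reading of the error categories in the question.

```python
import sys


def diff_runs(good, bad):
    """Return (start, length) runs of consecutive differing bytes.

    Only the common prefix of the two inputs is compared; a size
    mismatch should be reported separately by the caller.
    """
    runs = []
    start = None
    common = min(len(good), len(bad))
    for i in range(common):
        if good[i] != bad[i]:
            if start is None:
                start = i
        elif start is not None:
            runs.append((start, i - start))
            start = None
    if start is not None:
        runs.append((start, common - start))
    return runs


def classify(good, bad, block_size=4096):
    """Label each differing run by its apparent damage size."""
    report = []
    for start, length in diff_runs(good, bad):
        # Count how many individual bits differ within this run.
        bits = sum(
            bin(good[i] ^ bad[i]).count("1")
            for i in range(start, start + length)
        )
        if length == 1 and bits == 1:
            kind = "bit"
        elif length == 1:
            kind = "byte"
        elif length < block_size:
            kind = "multi-byte"
        else:
            kind = "block"
        report.append({
            "offset": start,
            "length": length,
            "flipped_bits": bits,
            "block_index": start // block_size,
            "kind": kind,
        })
    return report


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python compare.py <reference copy> <suspect copy>
    with open(sys.argv[1], "rb") as f:
        good = f.read()
    with open(sys.argv[2], "rb") as f:
        bad = f.read()
    if len(good) != len(bad):
        print(f"size differs: {len(good)} vs {len(bad)} bytes")
    for entry in classify(good, bad):
        print(entry)
```

A single flipped bit shows up as one run of length 1 with `flipped_bits` equal to 1. A damaged block that was filled with unrelated data may appear as several nearby runs rather than one, because some bytes can match by coincidence, so runs that share a `block_index` are worth reading together.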

The amdgpu driver had been working fine for at least a month, and then it failed on boot with no warning. I had a similar problem a month ago with the Linux kernel: it worked fine for a month and then simply failed on boot. Now that I have a working kernel, I can at least check whether the file went bad. If the file is the same, I'll have a lot of different questions!

Thank you for any help, it will be greatly appreciated.

Device: Radeon 5700XT

Motherboard: Aorus Master (Gigabyte)

1 Reply
drmike
Journeyman III

The web page for the amdgpu driver says there is a program called amdgpu-pro-uninstall, but it does not exist in the amdgpu-pro-19.30-855429-ubuntu-18.04 driver package. I tried rerunning the install package, but it did not do anything. How can I uninstall the driver so that it can be reinstalled?

Edit: The trick was to enable networking in rescue mode. After that, the driver install routine ran and overwrote the bad file. My system is back in business. Without networking, even amdgpu-pro-install -h did not work. I think the -h option should work regardless, and it should tell the user that they won't get very far without a network connection.
