
Drivers & Software

nibal
Challenger

Ubuntu 20.04 loses X server after installing latest driver (amdgpu-install) for my Radeon RX6400

Hi,

I used:

-> sudo amdgpu-install --usecase=kdm,graphics,opencl --opencl=rocr,legacy

I used to have a nice Plasma desktop, but now I boot to a console. sddm starts without any errors:

-> pgrep -fl sddm

-> 1614 sddm

but I get no login screen:(

It seems that the graphics use case messed up my environment:(

I am typing this from my Windows system. I have a dual boot system.

 

TIA

Nikos

0 Likes
1 Solution

This is now from my Linux box:)

I got it working.

I indicated in my last mail that this looked like an installation error.

I uninstalled everything, including the amdgpu-install package, autoremoved everything, rebooted and reinstalled everything.

Tried:

-> xinit

No joy. No plasma, and /var/log/Xorg.0.log had the same error: /dev/dri/card0 missing.

Understandable. The driver was built, but not loaded into the kernel. I don't quite understand why the install script, which does so many things, doesn't attempt to do a modprobe:(

-> sudo modprobe amdgpu

Same error: Exec format error.

This I don't understand. OK, so we can't yet load amdgpu, because the dependency modules were not loaded yet.

But "Exec format error"? Doesn't make sense. amdgpu.ko was just created by the install script for the running kernel.

How can it be incompatible with the running kernel?

I rebooted and this time I got my plasma desktop back:)

It shouldn't take so long to install a video card, almost 2 weeks now, without any help from here:(

One last note:

You need to fix the udev rule for video in the amdgpu-install script.

It should read:

KERNEL=="kfd", GROUP="video", MODE="0660"

Note the single '=' for GROUP="video"

As it is:

KERNEL=="kfd", GROUP=="video", MODE="0660"

it errs in syslog:

systemd-udevd[27852]: /etc/udev/rules.d/70-amdgpu.rules:1 Invalid operator for GROUP.
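
The operator fix described above can be tried on a scratch copy before touching the real rule file. This is only a minimal sketch: `fix_udev_group_operator` is a hypothetical helper name, not part of amdgpu-install, and the demo works on a temporary file that contains the rule as quoted in this post.

```shell
#!/bin/sh
# Hypothetical helper: replace the comparison operator 'GROUP==' with the
# assignment operator 'GROUP=' in a udev rules file.
fix_udev_group_operator() {
    # Only GROUP== is rewritten; KERNEL== is a match key and must stay as is.
    sed 's/GROUP=="/GROUP="/g' "$1" > "$1.fixed" && mv "$1.fixed" "$1"
}

# Demonstrate on a temporary copy rather than the live rule file.
demo_rules=$(mktemp)
printf '%s\n' 'KERNEL=="kfd", GROUP=="video", MODE="0660"' > "$demo_rules"
fix_udev_group_operator "$demo_rules"
fixed_rule=$(cat "$demo_rules")
echo "$fixed_rule"
rm -f "$demo_rules"
```

To apply the same substitution to the real file, the helper would have to run as root on /etc/udev/rules.d/70-amdgpu.rules, followed by `sudo udevadm control --reload-rules` so that udev picks up the change.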


0 Likes
11 Replies
dipak
Big Boss

Below are some suggestions:

1) Please make sure that the setup is compatible with the latest driver, amdgpu 21-50-2. For example, as the release notes say, this driver supports only HWE kernels and doesn’t support OEM kernels. The compatible Ubuntu versions are:

Ubuntu 20.04.4 HWE
Ubuntu 18.04.5(6) HWE

2) When installing the driver, try "all-open" use-cases. For example:

amdgpu-install --usecase=graphics,opencl

[Note:  "legacy" OpenCL stack is only required if you need support for legacy products older than Vega10. In that case, "workstation" use-cases are needed]

Thanks.

0 Likes

Thx for the fast reply,

 

I installed 20.04 LTS desktop, but soon afterwards, before installing the AMD drivers, I upgraded the kernel to the latest HWE.

I assume that counts as hwe install:)
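
One rough way to check the kernel requirement mentioned above is to look at the flavour suffix of the running kernel release. This is a minimal sketch, assuming Ubuntu's usual naming where the release string ends in a flavour such as `-generic` or `-oem`. `kernel_flavour` is a hypothetical helper and the sample release string is only illustrative. It can only flag an OEM kernel; it does not distinguish an HWE kernel from the original GA kernel, since both use the `-generic` flavour.

```shell
#!/bin/sh
# Hypothetical helper: print the flavour suffix of a kernel release string,
# i.e. everything after the last '-'.
kernel_flavour() {
    printf '%s\n' "${1##*-}"
}

# Illustrative release string, not taken from this thread.
sample_flavour=$(kernel_flavour "5.14.0-1022-oem")
echo "sample flavour: $sample_flavour"

# On a real system, pass the running kernel's release instead.
running_flavour=$(kernel_flavour "$(uname -r)")
if [ "$running_flavour" = "oem" ]; then
    echo "OEM kernel detected; the release notes list only HWE kernels"
else
    echo "running kernel flavour: $running_flavour"
fi
```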

/var/log/Xorg.0.log shows this error:

(EE) Screen 0 deleted because of no matching config section

I went to /usr/share/X11/xorg.conf.d/ and started adding a "Screen" section to the three most relevant configuration files:

00-amdgpu.conf, 10-amdgpu.conf and 10-radeon.conf.

The X server reads this directory when starting up. Unfortunately, so far I am still getting the same error.

From memory, this is what I added:

Section "Screen"

     Identifier "Screen 0"

     Device "radeon"

     Option "AllowEmptyInitialConfiguration"

EndSection

Seems to me that Xorg complains that a Screen 0 section is missing from the configuration. Any ideas?

TIA

Nikos

 

0 Likes
nibal
Challenger

I think that this thread belongs to support/drivers:)

Can you move it?

0 Likes

Sure, I'm moving it to the "drivers and software" support forum.

Thanks.

0 Likes

Xorg reads its configuration from /usr/share/X11/xorg.conf.d and then complains that the configuration is "missing a Screen 0 section". This could be misleading. I have seen the same error with dual video cards, where Screen 0 was configured with the wrong driver because Xorg had numbered the other video card as 0. It seems to be a generic error that covers many misconfigurations:( I have tried my best to add a "Screen 0" section to the configuration, without any success:(

There is no "Screen 0" section anywhere in the configuration. The same directory was also used for the Xorg configuration before I installed my RX6400 card, when I had an older R9 270 (Curacao) Radeon card. It is possible that when the amdgpu driver installed its files in the same directory, they clashed with the older configuration there.

-> cd /usr/share/X11/xorg.conf.d

-> ls

-> 00-amdgpu.conf 10-amdgpu.conf 10-quirks.conf 10-radeon.conf 40-libinput.conf 70-wacom.conf

quirks, libinput and wacom refer to input device configuration, tablets, mice, etc. I have tried removing the radeon configuration, since it could be from the previous card, but I keep getting the same error. What configurations does the amdgpu driver install? Do these files look OK to you?

I am out of ideas and don't know about Xorg configuration. Any suggestions?

TIA

Nikos

0 Likes

I will try reinstalling the amdgpu driver. Unfortunately, it doesn't uninstall like the previous AMD Linux drivers:( I will also try changing the kernel to the hwe-edge version, since I have read that the AMD framebuffers block the HWE kernel from creating the DRI devices (/dev/dri/card0).

dipak, you mentioned using the all-open switch when installing the driver. What does this cover? I think I need the legacy OpenCL, since I use software that runs on OpenCL 1.2. I imagine that rocr is only 3.0. Are there any other options to watch for during the driver installation?

TIA

Nikos

 

0 Likes

There is an amdgpu-uninstall:)

 

TIA

Nikos

0 Likes

dipak, you mentioned using the all-open switch when installing the driver. What does this cover? I think I need the legacy OpenCL, since I use software that runs on OpenCL 1.2. I imagine that rocr is only 3.0. Are there any other options to watch for during the driver installation?

Please refer to the sections below in the installation guide:

install-overview.html#stack-use-cases

Installing or Uninstalling the AMDGPU stack

Thanks.

0 Likes

Hi,

Thanks for the docs. I had used pretty much the same procedures, which I found on the web.

To uninstall, I ran the amdgpu-uninstall script, removed the amdgpu package, autoremoved everything no longer needed, installed the hwe-edge kernel and rebooted the system.

Then I installed amdgpu with:

-> dpkg -i amdgpu...

-> sudo amdgpu-install -y --accept-eula --no-32 --usecase=graphics,opencl,openclSDK --opencl=rocr,legacy

-> startx

-> grep EE /var/log/Xorg.0.log

(EE)/dev/dri/card0: No such file or directory

(EE) Screen 0 deleted missing from configuration.

Still no X:( It seems to be a kernel problem: it doesn't create the DRI devices. I need to configure and build the kernel again:(
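
The two checks above (error lines in the Xorg log, and whether a DRM card node exists) can be wrapped into small helpers. This is a minimal sketch, assuming the usual Xorg log layout where real error lines begin with an optional `[timestamp]` followed by `(EE)`. `xorg_errors` and `has_dri_card` are hypothetical names, and the demo runs on a scratch log with made-up lines rather than on the real one.

```shell
#!/bin/sh
# Hypothetical helper: print only real error lines from an Xorg log.
# Anchoring at the line start skips the marker legend in the log header,
# which also contains the text "(EE)".
xorg_errors() {
    grep -E '^(\[[^]]*\] )?\(EE\)' "$1"
}

# Hypothetical helper: succeed if at least one DRM card node exists in the
# given directory (normally /dev/dri).
has_dri_card() {
    for node in "$1"/card*; do
        [ -e "$node" ] && return 0
    done
    return 1
}

# Demo on scratch files; the log lines are made up for illustration.
demo_dir=$(mktemp -d)
printf '\t(WW) warning, (EE) error, (NI) not implemented\n' > "$demo_dir/Xorg.0.log"
printf '[     5.000] (II) Loading driver\n' >> "$demo_dir/Xorg.0.log"
printf '[     5.100] (EE) open /dev/dri/card0: No such file or directory\n' >> "$demo_dir/Xorg.0.log"

error_count=$(xorg_errors "$demo_dir/Xorg.0.log" | wc -l | tr -d ' ')
echo "error lines: $error_count"

if has_dri_card "$demo_dir"; then
    card_state=present
else
    card_state=missing
fi
echo "card node: $card_state"
rm -rf "$demo_dir"
```

On the real system the calls would be `xorg_errors /var/log/Xorg.0.log` and `has_dri_card /dev/dri`. A missing card node usually points at the driver not being loaded rather than at the Xorg configuration.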

It should be less trouble to install a video card:(

BR

Nikos

0 Likes

After talking with the kernel people: it's not the kernel that creates /dev/dri/card0, but the card driver, amdgpu.

-> dkms status amdgpu

Shows that amdgpu is installed in the current kernel

-> sudo lshw -c video

Shows that video is Unclaimed. That means no driver is assigned to it.

-> sudo hwinfo --gfxcard

Shows that I need to insert the driver into the kernel

-> sudo modprobe amdgpu

Skipping invalid relocation target, existing value is nonzero for type 1, loc: val:

Error: Cannot insert module. Exec format error:(

This looks like an installation error. What do you think?
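
One thing worth ruling out when modprobe reports "Exec format error" is a module built for a different kernel than the one running. That is a common cause of this error, though not necessarily the cause here. This is a minimal sketch, assuming the module's `vermagic` string starts with the kernel release it was built for. `vermagic_matches` is a hypothetical helper and the sample strings are illustrative.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the kernel release at the start of a
# module's vermagic string equals the given kernel release.
vermagic_matches() {
    built_for=${1%% *}    # first space-separated field of vermagic
    [ "$built_for" = "$2" ]
}

# Illustrative values, not taken from this thread.
sample_vermagic="5.13.0-30-generic SMP mod_unload modversions"

if vermagic_matches "$sample_vermagic" "5.13.0-30-generic"; then
    same_kernel=yes
else
    same_kernel=no
fi

if vermagic_matches "$sample_vermagic" "5.4.0-100-generic"; then
    other_kernel=yes
else
    other_kernel=no
fi

echo "same kernel: $same_kernel, other kernel: $other_kernel"
```

On a real system the arguments would be `"$(modinfo -F vermagic amdgpu)"` and `"$(uname -r)"`. Running `dmesg | tail` right after the failed modprobe usually shows the kernel's own reason for rejecting the module.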

All these are from memory. I have no X in Ubuntu, and have to dual boot to Windows to post here:(

0 Likes
