Linux Kernel: 5.4.98 (using any of the drivers causes the same issue).
Ubuntu 20.04.2 LTS (Focal Fossa)
Cards: 6800 x5
Installed with: ./amdgpu-pro-install --headless --opencl=rocr
AMD Driver: 20.45
dmesg output:
Clinfo output:
When I run clinfo, it always gets stuck at this line and never finishes.
At first glance it appears that SBIOS and OS are having trouble allocating BARs for all those cards. If you start with a single card in the system does OpenCL seem to work properly ?
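One way to check the BAR theory is to scan a saved dmesg log for PCI resource assignment failures. The snippet below is only a rough sketch: the function name `find_bar_failures` is made up for illustration, and the two message fragments it matches are assumptions about how the kernel usually words these errors, so an empty result does not prove that BAR allocation succeeded.

```python
import re

# Assumed wording of the kernel's PCI resource messages; other kernel
# versions may phrase these differently, so treat this as a first pass.
BAR_FAILURE = re.compile(r"BAR \d+: (no space for|failed to assign)")


def find_bar_failures(dmesg_text):
    """Return the dmesg lines that look like failed BAR assignments."""
    return [line for line in dmesg_text.splitlines() if BAR_FAILURE.search(line)]
```

To use it, save the log with `dmesg > dmesg.txt`, then call `find_bar_failures(open("dmesg.txt").read())` and look at which PCI addresses show up in the returned lines.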
I unplugged all cards except one: same issue. clinfo still gets stuck with just one card.
Do you want me to provide dmesg again? Probably it's the same as the old one?
Just want to add: I managed to install them 3 days ago with the 20.40 drivers and had them all running. The only odd behavior was that temps were high and the GPUs were named "Unknown GPU" in clinfo, but clinfo did not get stuck.
Yes please for the dmesg, in case there are other clues.
EDIT - hold on, just remembered something else about clinfo - IIRC we provide a clinfo that is a bit different from the one that the distro installs by default. Do you remember if you installed our clinfo ?
dmesg - https://termbin.com/31xr
Both get stuck. If I start /opt/amdgpu-pro/bin/clinfo, it gets stuck.
Starting the system default clinfo gets stuck too.
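To compare the two clinfo binaries without tying up a terminal each time, each one can be run under a timeout. This is a minimal sketch, not an official tool: the helper name `probe` and the 30-second limit are arbitrary choices for illustration. One caveat: if clinfo is blocked in an uninterruptible kernel call, the kill issued after the timeout may not take effect and the script itself can then block.

```python
import subprocess


def probe(cmd, timeout_s=30):
    """Run cmd and classify the result as 'ok', 'failed', 'hang' or 'missing'."""
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        # The command did not finish within timeout_s seconds.
        return "hang"
    except FileNotFoundError:
        # The binary is not installed at that path or on PATH.
        return "missing"
    return "ok" if result.returncode == 0 else "failed"


if __name__ == "__main__":
    # The AMD-provided clinfo and the distro clinfo mentioned above.
    for binary in ("/opt/amdgpu-pro/bin/clinfo", "clinfo"):
        print(binary, "->", probe([binary]))
```

If both report `hang`, that points away from a difference between the two clinfo builds and toward the driver or runtime underneath them.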
apt list --installed | grep amdgpu
WARNING: apt does not have a stable CLI interface. Use with caution in scripts.
amdgpu-core/unknown,now 20.45-1188099 all [installed,automatic]
amdgpu-dkms-firmware/unknown,now 1:5.6.20.906316-1188099 all [installed,automatic]
amdgpu-dkms/unknown,now 1:5.6.20.906316-1188099 all [installed]
amdgpu-pin/unknown,now 20.45-1188099 all [installed]
amdgpu-pro-core/unknown,now 20.45-1188099 all [installed,automatic]
amdgpu-pro-pin/unknown,now 20.45-1188099 all [installed]
amdgpu-pro-rocr-opencl/unknown,now 20.45-1188099 amd64 [installed]
clinfo-amdgpu-pro/unknown,now 20.45-1188099 amd64 [installed,automatic]
comgr-amdgpu-pro/unknown,now 1.7.0-1188099 amd64 [installed,automatic]
hip-rocr-amdgpu-pro/unknown,now 20.45-1188099 amd64 [installed,automatic]
hsa-runtime-rocr-amdgpu/unknown,now 1.2.0-1188099 amd64 [installed,automatic]
hsakmt-roct-amdgpu/unknown,now 1.0.9-1188099 amd64 [installed,automatic]
libdrm-amdgpu-amdgpu1/unknown,now 1:2.4.100-1188099 amd64 [installed,automatic]
libdrm-amdgpu-common/unknown,now 1.0.0-1188099 all [installed,automatic]
libdrm-amdgpu1/focal-updates,now 2.4.102-1ubuntu1~20.04.1 amd64 [installed,automatic]
libdrm2-amdgpu/unknown,now 1:2.4.100-1188099 amd64 [installed,automatic]
ocl-icd-libopencl1-amdgpu-pro/unknown,now 20.45-1188099 amd64 [installed,automatic]
opencl-rocr-amdgpu-pro/unknown,now 20.45-1188099 amd64 [installed,automatic]
xserver-xorg-video-amdgpu/focal,now 19.1.0-1 amd64 [installed,automatic]
clinfo installed
apt list --installed | grep clinfo
WARNING: apt does not have a stable CLI interface. Use with caution in scripts.
clinfo-amdgpu-pro/unknown,now 20.45-1188099 amd64 [installed,automatic]
clinfo/focal,now 2.2.18.04.06-1 amd64 [installed]
Bleah... the forum is eating my responses again.
The only obvious difference in the logs is that you don't seem to have a display attached to the card - if you have a spare display handy or could move the primary display to the 6800 that would be a useful data point.
Nothing.. same issue still present.
OK, now that is strange. Could you grab a dmesg output with a display attached ?
IIRC you have another device used as display (Intel CPU ?). That shouldn't be a problem but if it were my system the next thing I would try is booting with that other display HW disabled in BIOS and display attached only to the 6800.
EDIT - just in case the problem is specific to clinfo rather than the drivers (unlikely but it's not clear what is going on here) do you have any other simple OpenCL programs you could try ?
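If nothing else is at hand, a very small test is to call the OpenCL loader directly and ask only for the platform count. The sketch below does that from Python through `ctypes`, assuming the loader library is installed under the usual name `libOpenCL.so.1`; the function name `query_platform_count` is made up for illustration. It exercises only `clGetPlatformIDs`, so it shows whether the hang happens as early as platform enumeration, not whether device queries work.

```python
import ctypes
import ctypes.util


def query_platform_count():
    """Return (status, platform_count), or None if no OpenCL loader can be loaded."""
    # Assumption: the ICD loader is installed under its usual library name.
    name = ctypes.util.find_library("OpenCL") or "libOpenCL.so.1"
    try:
        lib = ctypes.CDLL(name)
    except OSError:
        return None

    # cl_int clGetPlatformIDs(cl_uint num_entries,
    #                         cl_platform_id *platforms,
    #                         cl_uint *num_platforms)
    lib.clGetPlatformIDs.argtypes = [
        ctypes.c_uint32,
        ctypes.c_void_p,
        ctypes.POINTER(ctypes.c_uint32),
    ]
    lib.clGetPlatformIDs.restype = ctypes.c_int32

    count = ctypes.c_uint32(0)
    # Passing 0 entries and a NULL array asks only for the platform count.
    status = lib.clGetPlatformIDs(0, None, ctypes.byref(count))
    return status, count.value


if __name__ == "__main__":
    result = query_platform_count()
    if result is None:
        print("No OpenCL loader library found")
    else:
        status, count = result
        print("status:", status, "platforms:", count)
```

A status of 0 is `CL_SUCCESS`. If even this call never returns, the problem sits in the loader or the vendor runtime it loads, not in clinfo itself.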
Any suggestion? :_)
Only one suggestion... pray for the AMD driver developers' health, to encourage them to make a working driver :P. I have a very similar issue with 4 x 6900 cards... so :D... you may want to take a look at it: https://community.amd.com/t5/drivers-software/2-x-radeon-rx-6900-xt-on-the-ubuntu-20-04-gnu-linux/m-... Thanks!