Hardware: Vega 56
OS: CentOS 7 with the CentOS kernel (couldn't get the kernel driver to build with the mainline 4.9.x LT kernel).
I cannot get OpenCL support on the Vega to be recognized at all. Here is a trimmed-down terminal transcript of what I did.
# uname -r
3.10.0-693.17.1.el7.x86_64
[root@grumpy ~/ati/amdgpu-pro-17.50-511655]# ./amdgpu-pro-install --opencl=rocm --headless
[amdgpu-pro-local]
Name=AMD amdgpu Pro local repository
baseurl=file:///var/opt/amdgpu-pro-local
enabled=1
gpgcheck=0
Loaded plugins: fastestmirror
amdgpu-pro-local | 2.9 kB 00:00:00
[...]
Dependencies Resolved
====================================================================================================================================
Package Arch Version Repository Size
====================================================================================================================================
Installing:
rocm-amdgpu-pro x86_64 17.50-511655.el7 amdgpu-pro-local 2.3 k
Installing for dependencies:
amdgpu-core noarch 17.50-511655.el7 amdgpu-pro-local 2.2 k
amdgpu-pro-core noarch 17.50-511655.el7 amdgpu-pro-local 2.2 k
hsa-ext-amdgpu-pro-finalize x86_64 1.1.6-511655.el7 amdgpu-pro-local 2.9 M
hsa-ext-amdgpu-pro-image x86_64 1.1.6-511655.el7 amdgpu-pro-local 137 k
hsa-runtime-tools-amdgpu-pro x86_64 1.1.6-511655.el7 amdgpu-pro-local 512 k
rocm-amdgpu-pro-icd x86_64 17.50-511655.el7 amdgpu-pro-local 17 M
rocm-amdgpu-pro-opencl x86_64 17.50-511655.el7 amdgpu-pro-local 2.0 k
rocr-amdgpu-pro x86_64 1.1.6-511655.el7 amdgpu-pro-local 243 k
roct-amdgpu-pro x86_64 1.0.7-511655.el7 amdgpu-pro-local 47 k
Transaction Summary
====================================================================================================================================
Install 1 Package (+9 Dependent packages)
Total download size: 21 M
Installed size: 21 M
Is this ok [y/d/N]: y
Downloading packages:
------------------------------------------------------------------------------------------------------------------------------------
Total 173 MB/s | 21 MB 00:00:00
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
Installing : amdgpu-core-17.50-511655.el7.noarch 1/10
Installing : amdgpu-pro-core-17.50-511655.el7.noarch 2/10
Installing : roct-amdgpu-pro-1.0.7-511655.el7.x86_64 3/10
Installing : rocr-amdgpu-pro-1.1.6-511655.el7.x86_64 4/10
Installing : rocm-amdgpu-pro-opencl-17.50-511655.el7.x86_64 5/10
Installing : rocm-amdgpu-pro-icd-17.50-511655.el7.x86_64 6/10
Installing : hsa-ext-amdgpu-pro-finalize-1.1.6-511655.el7.x86_64 7/10
Installing : hsa-ext-amdgpu-pro-image-1.1.6-511655.el7.x86_64 8/10
Installing : hsa-runtime-tools-amdgpu-pro-1.1.6-511655.el7.x86_64 9/10
Installing : rocm-amdgpu-pro-17.50-511655.el7.x86_64 10/10
Verifying : hsa-ext-amdgpu-pro-finalize-1.1.6-511655.el7.x86_64 1/10
Verifying : rocr-amdgpu-pro-1.1.6-511655.el7.x86_64 2/10
Verifying : rocm-amdgpu-pro-icd-17.50-511655.el7.x86_64 3/10
Verifying : rocm-amdgpu-pro-17.50-511655.el7.x86_64 4/10
Verifying : amdgpu-pro-core-17.50-511655.el7.noarch 5/10
Verifying : rocm-amdgpu-pro-opencl-17.50-511655.el7.x86_64 6/10
Verifying : roct-amdgpu-pro-1.0.7-511655.el7.x86_64 7/10
Verifying : hsa-ext-amdgpu-pro-image-1.1.6-511655.el7.x86_64 8/10
Verifying : hsa-runtime-tools-amdgpu-pro-1.1.6-511655.el7.x86_64 9/10
Verifying : amdgpu-core-17.50-511655.el7.noarch 10/10
Installed:
rocm-amdgpu-pro.x86_64 0:17.50-511655.el7
Dependency Installed:
amdgpu-core.noarch 0:17.50-511655.el7 amdgpu-pro-core.noarch 0:17.50-511655.el7
hsa-ext-amdgpu-pro-finalize.x86_64 0:1.1.6-511655.el7 hsa-ext-amdgpu-pro-image.x86_64 0:1.1.6-511655.el7
hsa-runtime-tools-amdgpu-pro.x86_64 0:1.1.6-511655.el7 rocm-amdgpu-pro-icd.x86_64 0:17.50-511655.el7
rocm-amdgpu-pro-opencl.x86_64 0:17.50-511655.el7 rocr-amdgpu-pro.x86_64 0:1.1.6-511655.el7
roct-amdgpu-pro.x86_64 0:1.0.7-511655.el7
Complete!
[root@grumpy ~/ati/amdgpu-pro-17.50-511655]#
This doesn't seem to install the kernel driver at all, so:
[root@grumpy ~/ati/amdgpu-pro-17.50-511655]# yum install amdgpu-dkms
[...]
====================================================================================================================================
Package Arch Version Repository Size
====================================================================================================================================
Installing:
amdgpu-dkms noarch 17.50-511655.el7 amdgpu-pro-local 7.1 M
Transaction Summary
====================================================================================================================================
Install 1 Package
Total download size: 7.1 M
Installed size: 7.1 M
Is this ok [y/d/N]: y
Downloading packages:
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
Installing : amdgpu-dkms-17.50-511655.el7.noarch 1/1
Loading new amdgpu-17.50-511655.el7 DKMS files...
dpkg: warning: version '3.10.0-693.17.1.el7.x86_64' has bad syntax: invalid character in revision number
dpkg: warning: version '3.10.0-693.17.1.el7.x86_64' has bad syntax: invalid character in revision number
dpkg: warning: version '4.9.81-1.el7.centos.x86_64' has bad syntax: invalid character in revision number
dpkg: warning: version '3.10.0-693.17.1.el7.x86_64' has bad syntax: invalid character in revision number
Building for 3.10.0-693.17.1.el7.x86_64 4.9.81-1.el7.centos.x86_64
Building initial module for 3.10.0-693.17.1.el7.x86_64
Done.
Forcing installation of amdgpu
amdgpu:
Running module version sanity check.
- Original module
- No original module exists within this kernel
- Installation
- Installing to /lib/modules/3.10.0-693.17.1.el7.x86_64/extra/
amdttm.ko:
Running module version sanity check.
- Original module
- No original module exists within this kernel
- Installation
- Installing to /lib/modules/3.10.0-693.17.1.el7.x86_64/extra/
amdkcl.ko:
Running module version sanity check.
- Original module
- No original module exists within this kernel
- Installation
- Installing to /lib/modules/3.10.0-693.17.1.el7.x86_64/extra/
amdkfd.ko:
Running module version sanity check.
- Original module
- No original module exists within this kernel
- Installation
- Installing to /lib/modules/3.10.0-693.17.1.el7.x86_64/extra/
Adding any weak-modules
depmod....
Backing up initramfs-3.10.0-693.17.1.el7.x86_64.img to /boot/initramfs-3.10.0-693.17.1.el7.x86_64.img.old-dkms
Making new initramfs-3.10.0-693.17.1.el7.x86_64.img
(If next boot fails, revert to initramfs-3.10.0-693.17.1.el7.x86_64.img.old-dkms image)
dracut.......
DKMS: install completed.
Building initial module for 4.9.81-1.el7.centos.x86_64
Error! Bad return status for module build on kernel: 4.9.81-1.el7.centos.x86_64 (x86_64)
Consult /var/lib/dkms/amdgpu/17.50-511655.el7/build/make.log for more information.
warning: %post(amdgpu-dkms-0:17.50-511655.el7.noarch) scriptlet failed, exit status 10
Non-fatal POSTIN scriptlet failure in rpm package amdgpu-dkms-17.50-511655.el7.noarch
Verifying : amdgpu-dkms-17.50-511655.el7.noarch 1/1
Installed:
amdgpu-dkms.noarch 0:17.50-511655.el7
Complete!
[root@grumpy ~/ati/amdgpu-pro-17.50-511655]# lsmod | grep amdgpu
[root@grumpy ~/ati/amdgpu-pro-17.50-511655]# modprobe amdgpu
[root@grumpy ~/ati/amdgpu-pro-17.50-511655]# lsmod | grep amdgpu
amdgpu 3143876 2
amdttm 110970 1 amdgpu
amdkcl 24897 3 amdgpu,amdkfd,amdttm
i2c_algo_bit 13413 2 i915,amdgpu
drm_kms_helper 159169 3 i915,amdgpu,nvidia_drm
drm 370825 15 i915,drm_kms_helper,amdgpu,amdkcl,amdttm,nvidia_drm
i2c_core 40756 8 drm,i915,i2c_i801,i2c_hid,drm_kms_helper,i2c_algo_bit,amdgpu,nvidia
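The DKMS log above shows the build succeeding for the running 3.10.0 kernel and failing only for 4.9.81. One way to confirm that all four modules named in that log actually landed for a given kernel is sketched below. The function name and the root-directory argument are hypothetical, added only so the check can be pointed at a test tree; the `.ko*` glob assumes the modules may be stored either plain or compressed.

```shell
#!/bin/sh
# Print the names of the DKMS-built modules (as listed in the install log)
# that are NOT present under <root>/lib/modules/<kernel>/extra/.
# Prints nothing when all four are there.
missing_dkms_modules() {
    root=$1
    kver=$2
    for mod in amdgpu amdttm amdkcl amdkfd; do
        found=no
        # Match amdgpu.ko as well as compressed variants such as amdgpu.ko.xz.
        for f in "$root/lib/modules/$kver/extra/$mod".ko*; do
            if [ -e "$f" ]; then
                found=yes
            fi
        done
        if [ "$found" = no ]; then
            echo "$mod"
        fi
    done
}
```

On a live system this would be called as `missing_dkms_modules "" "$(uname -r)"`; any name it prints is a module that `modprobe` will not be able to find for that kernel.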
At this point there is no clinfo command available:
[root@grumpy ~]# yum install clinfo
[...]
====================================================================================================================================
Package Arch Version Repository Size
====================================================================================================================================
Installing:
clinfo x86_64 2.1.17.02.09-1.el7 epel 39 k
Transaction Summary
====================================================================================================================================
Install 1 Package
Total download size: 39 k
Installed size: 83 k
Is this ok [y/d/N]: y
Downloading packages:
clinfo-2.1.17.02.09-1.el7.x86_64.rpm | 39 kB 00:00:00
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
Installing : clinfo-2.1.17.02.09-1.el7.x86_64 1/1
Verifying : clinfo-2.1.17.02.09-1.el7.x86_64 1/1
Installed:
clinfo.x86_64 0:2.1.17.02.09-1.el7
Complete!
[root@grumpy ~]# clinfo
Number of platforms 0
[root@grumpy ~]# LD_LIBRARY_PATH=/opt/amdgpu-pro/lib64 /usr/local/bin/ethminer --list-devices
✘ 11:41:11|ethminer No OpenCL platforms found
Listing CUDA devices.
FORMAT: [deviceID] deviceName
[0] GeForce GTX 1070 Ti
Compute version: 6.1
cudaDeviceProp::totalGlobalMem: 8508145664
[1] GeForce GTX 980 Ti
Compute version: 5.2
cudaDeviceProp::totalGlobalMem: 6373572608
[2] GeForce GTX 1070 Ti
Compute version: 6.1
cudaDeviceProp::totalGlobalMem: 8508145664
[3] GeForce GTX 1070
Compute version: 6.1
cudaDeviceProp::totalGlobalMem: 8508145664
[4] GeForce GTX 1070 Ti
Compute version: 6.1
cudaDeviceProp::totalGlobalMem: 8508145664
[5] GeForce GTX 1070 Ti
Compute version: 6.1
cudaDeviceProp::totalGlobalMem: 8508145664
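A report of 0 platforms usually means the ICD loader found no vendor library it could load. On Linux the Khronos ICD loader reads `.icd` files from /etc/OpenCL/vendors, and each file names one vendor library. A minimal sketch for listing those registrations follows. The function name and the optional directory argument are illustrative and not part of any driver package, and the `.icd` file names on a given install may differ.

```shell
#!/bin/sh
# List each OpenCL ICD registration in a vendors directory together with the
# library it names. Defaults to /etc/OpenCL/vendors; the argument exists so
# the function can be tried against a scratch directory.
list_icds() {
    dir=${1:-/etc/OpenCL/vendors}
    count=0
    for icd in "$dir"/*.icd; do
        [ -e "$icd" ] || continue
        count=$((count + 1))
        # The first line of an .icd file is a library name or an absolute path.
        lib=$(head -n 1 "$icd")
        case $lib in
            /*)
                if [ -e "$lib" ]; then state=present; else state=MISSING; fi
                ;;
            *)
                # A bare name is looked up by the dynamic linker at load time.
                state=resolved-by-dynamic-linker
                ;;
        esac
        printf '%s: %s (%s)\n' "$(basename "$icd")" "$lib" "$state"
    done
    if [ "$count" -eq 0 ]; then
        echo "no .icd files in $dir"
    fi
}
```

Running `list_icds` with no argument inspects the system directory. An empty result, or an entry marked MISSING, would be consistent with clinfo seeing no platforms.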
[root@grumpy ~]# yum install clinfo-amdgpu-pro-17.50-511655.el7.x86_64
[...]
====================================================================================================================================
Package Arch Version Repository Size
====================================================================================================================================
Installing:
clinfo-amdgpu-pro x86_64 17.50-511655.el7 amdgpu-pro-local 198 k
Installing for dependencies:
libopencl-amdgpu-pro x86_64 17.50-511655.el7 amdgpu-pro-local 11 k
libopencl-amdgpu-pro-icd x86_64 17.50-511655.el7 amdgpu-pro-local 29 M
Transaction Summary
====================================================================================================================================
Install 1 Package (+2 Dependent packages)
Total download size: 29 M
Installed size: 29 M
Is this ok [y/d/N]: y
Downloading packages:
------------------------------------------------------------------------------------------------------------------------------------
Total 345 MB/s | 29 MB 00:00:00
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
Installing : libopencl-amdgpu-pro-17.50-511655.el7.x86_64 1/3
Installing : libopencl-amdgpu-pro-icd-17.50-511655.el7.x86_64 2/3
Installing : clinfo-amdgpu-pro-17.50-511655.el7.x86_64 3/3
Verifying : libopencl-amdgpu-pro-17.50-511655.el7.x86_64 1/3
Verifying : libopencl-amdgpu-pro-icd-17.50-511655.el7.x86_64 2/3
Verifying : clinfo-amdgpu-pro-17.50-511655.el7.x86_64 3/3
Installed:
clinfo-amdgpu-pro.x86_64 0:17.50-511655.el7
Dependency Installed:
libopencl-amdgpu-pro.x86_64 0:17.50-511655.el7 libopencl-amdgpu-pro-icd.x86_64 0:17.50-511655.el7
Complete!
Once that is installed, everything just outright segfaults.
[root@grumpy ~]# clinfo
Segmentation fault
[root@grumpy ~]# /opt/amdgpu-pro/bin/clinfo
Segmentation fault
[root@grumpy ~]# LD_LIBRARY_PATH=/opt/amdgpu-pro/lib64 /usr/local/bin/ethminer --list-devices
Segmentation fault
If I remove all the Nvidia OpenCL libraries and re-run ldconfig, everything still segfaults.
If I remove the amdgpu-pro OpenCL libraries, there is no libOpenCL.so left at all, so nothing can find it (I removed the Nvidia one earlier):
[root@grumpy ~]# yum remove clinfo-amdgpu-pro-17.50-511655.el7.x86_64 libopencl-amdgpu-pro-icd-17.50-511655.el7.x86_64 libopencl-amdgpu-pro-17.50-511655.el7.x86_64
[...]
====================================================================================================================================
Package Arch Version Repository Size
====================================================================================================================================
Removing:
clinfo-amdgpu-pro x86_64 17.50-511655.el7 @amdgpu-pro-local 780 k
libopencl-amdgpu-pro x86_64 17.50-511655.el7 @amdgpu-pro-local 27 k
libopencl-amdgpu-pro-icd x86_64 17.50-511655.el7 @amdgpu-pro-local 102 M
Transaction Summary
====================================================================================================================================
Remove 3 Packages
Installed size: 103 M
Is this ok [y/N]: y
Downloading packages:
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
Erasing : clinfo-amdgpu-pro-17.50-511655.el7.x86_64 1/3
Erasing : libopencl-amdgpu-pro-icd-17.50-511655.el7.x86_64 2/3
Erasing : libopencl-amdgpu-pro-17.50-511655.el7.x86_64 3/3
Verifying : libopencl-amdgpu-pro-17.50-511655.el7.x86_64 1/3
Verifying : libopencl-amdgpu-pro-icd-17.50-511655.el7.x86_64 2/3
Verifying : clinfo-amdgpu-pro-17.50-511655.el7.x86_64 3/3
Removed:
clinfo-amdgpu-pro.x86_64 0:17.50-511655.el7 libopencl-amdgpu-pro.x86_64 0:17.50-511655.el7
libopencl-amdgpu-pro-icd.x86_64 0:17.50-511655.el7
Complete!
[root@grumpy ~]# clinfo
clinfo: error while loading shared libraries: libOpenCL.so.1: cannot open shared object file: No such file or directory
So which is the correct libOpenCL to use, and what package does it come from? The only one that ships with the driver packages results in nothing but segfaults.
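On an RPM-based system, one way to approach this question is to ask the dynamic linker which libOpenCL files it knows about, then ask rpm which package owns each. A sketch with hypothetical function names follows. Libraries that are reachable only through LD_LIBRARY_PATH (such as /opt/amdgpu-pro/lib64 in the commands above) may not appear in the ldconfig cache, in which case `rpm -qf` can be pointed at the file directly.

```shell
#!/bin/sh
# Read `ldconfig -p` style output on stdin and print the path of every
# libOpenCL entry (the path is the last field of each cache line).
opencl_lib_paths() {
    awk '/libOpenCL\.so/ { print $NF }'
}

# For each libOpenCL in the linker cache, print the owning RPM package.
# rpm's own message is shown for files that no package owns.
opencl_lib_owners() {
    ldconfig -p | opencl_lib_paths | while read -r path; do
        printf '%s: %s\n' "$path" "$(rpm -qf "$path" 2>&1)"
    done
}
```

Calling `opencl_lib_owners` as root prints one line per library. More than one line means several loaders are installed side by side, and whichever comes first in the linker search order is the one clinfo picks up.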
At this point I'm reasonably sure this has nothing to do with interference from Nvidia drivers and libraries.
Note: I deleted the amdgpu kernel driver that ships with the kernel so that it wouldn't clash with the one built using DKMS from the driver bundle.
Just tried it on a completely different machine, with no Nvidia cards or drivers on it, and the results are the same.
distro clinfo reports 0 platforms.
amdgpu-pro clinfo crashes out.
How on earth can there be such a shortage of AMD GPUs when the software stack just doesn't work?
I'm using an RX 580 and a GTX 1070 on Ubuntu and trying to get OpenCL to work. I'm also trying to understand why I get an "AMD ADL library not found." message. Maybe something in here will help...
After messing about for a few weeks with different drivers I did a fresh install today:
Ubuntu 16.04.3
$ uname -r
4.4.0-116-generic kernel
Then updated with
sudo apt-get update && sudo apt-get -y dist-upgrade
I've got an RX 580. I installed Ubuntu with no graphics card fitted, using the onboard Intel chip, then put the RX 580 into the PCIe x16 slot and attached the monitor to it.
Installed drivers with:
./amdgpu-install --opencl=legacy
$ ethminer --list-devices
Listing OpenCL devices.
FORMAT: [platformID] [deviceID] deviceName
[0] [0] GeForce GTX 1070 Ti
CL_DEVICE_TYPE: GPU
CL_DEVICE_GLOBAL_MEM_SIZE: 8513978368
CL_DEVICE_MAX_MEM_ALLOC_SIZE: 2128494592
CL_DEVICE_MAX_WORK_GROUP_SIZE: 1024
[0] [1] GeForce GTX 1070 Ti
CL_DEVICE_TYPE: GPU
CL_DEVICE_GLOBAL_MEM_SIZE: 8513978368
CL_DEVICE_MAX_MEM_ALLOC_SIZE: 2128494592
CL_DEVICE_MAX_WORK_GROUP_SIZE: 1024
[1] [0] Ellesmere
CL_DEVICE_TYPE: GPU
CL_DEVICE_GLOBAL_MEM_SIZE: 5877137408
CL_DEVICE_MAX_MEM_ALLOC_SIZE: 4244635648
CL_DEVICE_MAX_WORK_GROUP_SIZE: 256
Listing CUDA devices.
FORMAT: [deviceID] deviceName
[0] GeForce GTX 1070 Ti
Compute version: 6.1
cudaDeviceProp::totalGlobalMem: 8513978368
Pci: 0000:0c:00
[1] GeForce GTX 1070 Ti
Compute version: 6.1
cudaDeviceProp::totalGlobalMem: 8513978368
Pci: 0000:0e:00
Although this lists the devices that are OpenCL capable, I don't think OpenCL is actually being used.
clinfo is unavailable at this point in the installation.