I have a Vega 56. I downloaded the 17.50 driver, and installed it using:
amdgpu-pro-install --compute
# rpm -qa | grep amdgpu | sort
amdgpu-core-17.50-543815.el7.noarch
amdgpu-dkms-17.50-543815.el7.noarch
amdgpu-pro-core-17.50-543815.el7.noarch
clinfo-amdgpu-pro-17.50-543815.el7.x86_64
ids-amdgpu-1.0.0-543815.el7.noarch
libdrm-amdgpu-2.4.82-543815.el7.x86_64
libopencl-amdgpu-pro-17.50-543815.el7.x86_64
libopencl-amdgpu-pro-icd-17.50-543815.el7.x86_64
opencl-amdgpu-pro-17.50-543815.el7.x86_64
That installed the driver. After a reboot, the kernel driver loads, but I cannot get clinfo to do anything but crash:
# /opt/amdgpu-pro/bin/clinfo
terminate called after throwing an instance of 'cl::Error'
what(): clGetPlatformIDs
Aborted
# modprobe amdgpu
# lsmod | grep amdgpu
amdgpu 3144014 2
amdttm 110970 1 amdgpu
amdkcl 24897 3 amdgpu,amdkfd,amdttm
i2c_algo_bit 13413 2 i915,amdgpu
drm_kms_helper 159169 3 i915,amdgpu,nvidia_drm
drm 370825 15 i915,drm_kms_helper,amdgpu,amdkcl,amdttm,nvidia_drm
i2c_core 40756 8 drm,i915,i2c_i801,i2c_hid,drm_kms_helper,i2c_algo_bit,amdgpu,nvidia
# lspci | grep AMD
0c:00.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 1470 (rev c3)
0d:00.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 1471
0e:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Vega [Radeon RX Vega] (rev c3)
0e:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Device aaf8
10:00.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 1470 (rev c3)
11:00.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 1471
12:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Vega [Radeon RX Vega] (rev c3)
12:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Device aaf8
# dmesg | grep amdgpu
[ 48.247014] [drm] amdgpu kernel modesetting enabled.
[ 48.255686] amdgpu 0000:0e:00.0: enabling device (0006 -> 0007)
[ 48.256348] amdgpu 0000:0e:00.0: VRAM: 8176M 0x000000F400000000 - 0x000000F5FEFFFFFF (8176M used)
[ 48.256349] amdgpu 0000:0e:00.0: GTT: 256M 0x000000F5FF000000 - 0x000000F60EFFFFFF
[ 48.256484] [drm] amdgpu: 8176M of VRAM memory ready
[ 48.256485] [drm] amdgpu: 32028M of GTT memory ready.
[ 48.256784] amdgpu 0000:0e:00.0: irq 149 for MSI/MSI-X
[ 48.256797] amdgpu 0000:0e:00.0: amdgpu: using MSI.
[ 48.256962] [drm] amdgpu: irq initialized.
[ 48.257136] amdgpu: [powerplay] amdgpu: powerplay sw initialized
[ 48.258210] amdgpu 0000:0e:00.0: fence driver on ring 0 use gpu addr 0x000000f5ff400040, cpu addr 0xffffc90007614040
[ 48.258279] amdgpu 0000:0e:00.0: fence driver on ring 1 use gpu addr 0x000000f5ff4000c0, cpu addr 0xffffc900076140c0
[ 48.258321] amdgpu 0000:0e:00.0: fence driver on ring 2 use gpu addr 0x000000f5ff400140, cpu addr 0xffffc90007614140
[ 48.258415] amdgpu 0000:0e:00.0: fence driver on ring 3 use gpu addr 0x000000f5ff4001c0, cpu addr 0xffffc900076141c0
[ 48.258444] amdgpu 0000:0e:00.0: fence driver on ring 4 use gpu addr 0x000000f5ff400240, cpu addr 0xffffc90007614240
[ 48.258464] amdgpu 0000:0e:00.0: fence driver on ring 5 use gpu addr 0x000000f5ff4002c0, cpu addr 0xffffc900076142c0
[ 48.258483] amdgpu 0000:0e:00.0: fence driver on ring 6 use gpu addr 0x000000f5ff400340, cpu addr 0xffffc90007614340
[ 48.258503] amdgpu 0000:0e:00.0: fence driver on ring 7 use gpu addr 0x000000f5ff4003c0, cpu addr 0xffffc900076143c0
[ 48.258523] amdgpu 0000:0e:00.0: fence driver on ring 8 use gpu addr 0x000000f5ff400440, cpu addr 0xffffc90007614440
[ 48.258546] amdgpu 0000:0e:00.0: fence driver on ring 9 use gpu addr 0x000000f5ff4004e0, cpu addr 0xffffc900076144e0
[ 48.259015] amdgpu 0000:0e:00.0: fence driver on ring 10 use gpu addr 0x000000f5ff400560, cpu addr 0xffffc90007614560
[ 48.259046] amdgpu 0000:0e:00.0: fence driver on ring 11 use gpu addr 0x000000f5ff4005e0, cpu addr 0xffffc900076145e0
[ 48.264589] amdgpu 0000:0e:00.0: fence driver on ring 12 use gpu addr 0x000000f4008fa8c0, cpu addr 0xffffc9000c35b8c0
[ 48.264627] amdgpu 0000:0e:00.0: fence driver on ring 13 use gpu addr 0x000000f5ff4006e0, cpu addr 0xffffc900076146e0
[ 48.264654] amdgpu 0000:0e:00.0: fence driver on ring 14 use gpu addr 0x000000f5ff400760, cpu addr 0xffffc90007614760
[ 48.264788] amdgpu 0000:0e:00.0: fence driver on ring 15 use gpu addr 0x000000f5ff4007e0, cpu addr 0xffffc900076147e0
[ 48.264812] amdgpu 0000:0e:00.0: fence driver on ring 16 use gpu addr 0x000000f5ff400860, cpu addr 0xffffc90007614860
[ 48.264833] amdgpu 0000:0e:00.0: fence driver on ring 17 use gpu addr 0x000000f5ff4008e0, cpu addr 0xffffc900076148e0
[ 48.583772] [drm] amdgpu: freesync_module init done ffff8807ef2524c0.
[ 48.711184] amdgpu 0000:0e:00.0: fb1: amdgpudrmfb frame buffer device
[ 48.711200] amdgpu 0000:0e:00.0: ring 0(gfx) uses VM inv eng 4 on hub 0
[ 48.711201] amdgpu 0000:0e:00.0: ring 1(comp_1.0.0) uses VM inv eng 5 on hub 0
[ 48.711202] amdgpu 0000:0e:00.0: ring 2(comp_1.1.0) uses VM inv eng 6 on hub 0
[ 48.711203] amdgpu 0000:0e:00.0: ring 3(comp_1.2.0) uses VM inv eng 7 on hub 0
[ 48.711204] amdgpu 0000:0e:00.0: ring 4(comp_1.3.0) uses VM inv eng 8 on hub 0
[ 48.711205] amdgpu 0000:0e:00.0: ring 5(comp_1.0.1) uses VM inv eng 9 on hub 0
[ 48.711206] amdgpu 0000:0e:00.0: ring 6(comp_1.1.1) uses VM inv eng 10 on hub 0
[ 48.711207] amdgpu 0000:0e:00.0: ring 7(comp_1.2.1) uses VM inv eng 11 on hub 0
[ 48.711207] amdgpu 0000:0e:00.0: ring 8(comp_1.3.1) uses VM inv eng 12 on hub 0
[ 48.711208] amdgpu 0000:0e:00.0: ring 9(kiq_2.1.7) uses VM inv eng 13 on hub 0
[ 48.711209] amdgpu 0000:0e:00.0: ring 10(sdma0) uses VM inv eng 4 on hub 1
[ 48.711210] amdgpu 0000:0e:00.0: ring 11(sdma1) uses VM inv eng 5 on hub 1
[ 48.711211] amdgpu 0000:0e:00.0: ring 12(uvd) uses VM inv eng 6 on hub 1
[ 48.711212] amdgpu 0000:0e:00.0: ring 13(uvd_enc0) uses VM inv eng 7 on hub 1
[ 48.711213] amdgpu 0000:0e:00.0: ring 14(uvd_enc1) uses VM inv eng 8 on hub 1
[ 48.711214] amdgpu 0000:0e:00.0: ring 15(vce0) uses VM inv eng 9 on hub 1
[ 48.711215] amdgpu 0000:0e:00.0: ring 16(vce1) uses VM inv eng 10 on hub 1
[ 48.711216] amdgpu 0000:0e:00.0: ring 17(vce2) uses VM inv eng 11 on hub 1
[ 48.712240] amdgpu 0000:12:00.0: enabling device (0006 -> 0007)
[ 48.713696] amdgpu 0000:12:00.0: VRAM: 8176M 0x000000F400000000 - 0x000000F5FEFFFFFF (8176M used)
[ 48.713697] amdgpu 0000:12:00.0: GTT: 256M 0x000000F5FF000000 - 0x000000F60EFFFFFF
[ 48.713715] [drm] amdgpu: 8176M of VRAM memory ready
[ 48.713717] [drm] amdgpu: 32028M of GTT memory ready.
[ 48.714024] amdgpu 0000:12:00.0: irq 150 for MSI/MSI-X
[ 48.714036] amdgpu 0000:12:00.0: amdgpu: using MSI.
[ 48.714202] [drm] amdgpu: irq initialized.
[ 48.714255] amdgpu: [powerplay] amdgpu: powerplay sw initialized
[ 48.714751] amdgpu 0000:12:00.0: fence driver on ring 0 use gpu addr 0x000000f5ff400040, cpu addr 0xffffc90007eea040
[ 48.714823] amdgpu 0000:12:00.0: fence driver on ring 1 use gpu addr 0x000000f5ff4000c0, cpu addr 0xffffc90007eea0c0
[ 48.714846] amdgpu 0000:12:00.0: fence driver on ring 2 use gpu addr 0x000000f5ff400140, cpu addr 0xffffc90007eea140
[ 48.714868] amdgpu 0000:12:00.0: fence driver on ring 3 use gpu addr 0x000000f5ff4001c0, cpu addr 0xffffc90007eea1c0
[ 48.714889] amdgpu 0000:12:00.0: fence driver on ring 4 use gpu addr 0x000000f5ff400240, cpu addr 0xffffc90007eea240
[ 48.714907] amdgpu 0000:12:00.0: fence driver on ring 5 use gpu addr 0x000000f5ff4002c0, cpu addr 0xffffc90007eea2c0
[ 48.714926] amdgpu 0000:12:00.0: fence driver on ring 6 use gpu addr 0x000000f5ff400340, cpu addr 0xffffc90007eea340
[ 48.714945] amdgpu 0000:12:00.0: fence driver on ring 7 use gpu addr 0x000000f5ff4003c0, cpu addr 0xffffc90007eea3c0
[ 48.714963] amdgpu 0000:12:00.0: fence driver on ring 8 use gpu addr 0x000000f5ff400440, cpu addr 0xffffc90007eea440
[ 48.714986] amdgpu 0000:12:00.0: fence driver on ring 9 use gpu addr 0x000000f5ff4004e0, cpu addr 0xffffc90007eea4e0
[ 48.715292] amdgpu 0000:12:00.0: fence driver on ring 10 use gpu addr 0x000000f5ff400560, cpu addr 0xffffc90007eea560
[ 48.715328] amdgpu 0000:12:00.0: fence driver on ring 11 use gpu addr 0x000000f5ff4005e0, cpu addr 0xffffc90007eea5e0
[ 48.720789] amdgpu 0000:12:00.0: fence driver on ring 12 use gpu addr 0x000000f4008fa8c0, cpu addr 0xffffc9000d45b8c0
[ 48.720819] amdgpu 0000:12:00.0: fence driver on ring 13 use gpu addr 0x000000f5ff4006e0, cpu addr 0xffffc90007eea6e0
[ 48.721718] amdgpu 0000:12:00.0: fence driver on ring 14 use gpu addr 0x000000f5ff400760, cpu addr 0xffffc90007eea760
[ 48.721787] amdgpu 0000:12:00.0: fence driver on ring 15 use gpu addr 0x000000f5ff4007e0, cpu addr 0xffffc90007eea7e0
[ 48.721981] amdgpu 0000:12:00.0: fence driver on ring 16 use gpu addr 0x000000f5ff400860, cpu addr 0xffffc90007eea860
[ 48.722720] amdgpu 0000:12:00.0: fence driver on ring 17 use gpu addr 0x000000f5ff4008e0, cpu addr 0xffffc90007eea8e0
[ 49.041342] [drm] amdgpu: freesync_module init done ffff8807fcf1c2e0.
[ 49.170471] amdgpu 0000:12:00.0: fb2: amdgpudrmfb frame buffer device
[ 49.170486] amdgpu 0000:12:00.0: ring 0(gfx) uses VM inv eng 4 on hub 0
[ 49.170488] amdgpu 0000:12:00.0: ring 1(comp_1.0.0) uses VM inv eng 5 on hub 0
[ 49.170489] amdgpu 0000:12:00.0: ring 2(comp_1.1.0) uses VM inv eng 6 on hub 0
[ 49.170490] amdgpu 0000:12:00.0: ring 3(comp_1.2.0) uses VM inv eng 7 on hub 0
[ 49.170492] amdgpu 0000:12:00.0: ring 4(comp_1.3.0) uses VM inv eng 8 on hub 0
[ 49.170493] amdgpu 0000:12:00.0: ring 5(comp_1.0.1) uses VM inv eng 9 on hub 0
[ 49.170494] amdgpu 0000:12:00.0: ring 6(comp_1.1.1) uses VM inv eng 10 on hub 0
[ 49.170495] amdgpu 0000:12:00.0: ring 7(comp_1.2.1) uses VM inv eng 11 on hub 0
[ 49.170497] amdgpu 0000:12:00.0: ring 8(comp_1.3.1) uses VM inv eng 12 on hub 0
[ 49.170498] amdgpu 0000:12:00.0: ring 9(kiq_2.1.7) uses VM inv eng 13 on hub 0
[ 49.170499] amdgpu 0000:12:00.0: ring 10(sdma0) uses VM inv eng 4 on hub 1
[ 49.170501] amdgpu 0000:12:00.0: ring 11(sdma1) uses VM inv eng 5 on hub 1
[ 49.170502] amdgpu 0000:12:00.0: ring 12(uvd) uses VM inv eng 6 on hub 1
[ 49.170503] amdgpu 0000:12:00.0: ring 13(uvd_enc0) uses VM inv eng 7 on hub 1
[ 49.170504] amdgpu 0000:12:00.0: ring 14(uvd_enc1) uses VM inv eng 8 on hub 1
[ 49.170505] amdgpu 0000:12:00.0: ring 15(vce0) uses VM inv eng 9 on hub 1
[ 49.170507] amdgpu 0000:12:00.0: ring 16(vce1) uses VM inv eng 10 on hub 1
[ 49.170508] amdgpu 0000:12:00.0: ring 17(vce2) uses VM inv eng 11 on hub 1
All the programs detect the Nvidia cards as OpenCL capable, but no matter which paths and libraries from the /opt/amdgpu-pro/ subtree I put in LD_LIBRARY_PATH or LD_PRELOAD, nothing ever detects the AMD GPUs as OpenCL capable. I even moved all the Nvidia libraries out of the way, but all that does is stop the Nvidia cards from being detected; it doesn't help detection of the AMD GPUs. What am I doing wrong?
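One thing worth checking here is ICD registration rather than library paths. On Linux the OpenCL ICD loader conventionally discovers vendor drivers through `.icd` files under /etc/OpenCL/vendors, each naming a library to load, so LD_LIBRARY_PATH alone may not be enough. The sketch below lists those files and checks that each target exists. The function name `check_icds` is made up for illustration, and the default directory is an assumption that may not hold for every loader build.

```shell
#!/bin/sh
# Sketch: list registered OpenCL ICDs and check that each named library exists.
# Assumption: the ICD loader reads *.icd files from /etc/OpenCL/vendors
# (the conventional Linux location); pass another directory to override.
check_icds() {
    dir="${1:-/etc/OpenCL/vendors}"
    found=0
    for icd in "$dir"/*.icd; do
        [ -e "$icd" ] || continue          # glob matched nothing
        found=1
        # Each .icd file is expected to hold one line: a library name or path.
        lib=$(head -n 1 "$icd" | tr -d '[:space:]')
        case "$lib" in
            /*)
                # Absolute path: check the file directly.
                if [ -e "$lib" ]; then
                    echo "OK      $icd -> $lib"
                else
                    echo "MISSING $icd -> $lib"
                fi
                ;;
            *)
                # Bare name: resolved through the dynamic linker search path.
                echo "NAME    $icd -> $lib (check with: ldconfig -p | grep $lib)"
                ;;
        esac
    done
    [ "$found" -eq 1 ] || echo "no .icd files in $dir"
}

check_icds "$@"
```

If an AMD entry is absent or reported as MISSING, the loader has nothing to hand to clinfo, whatever LD_LIBRARY_PATH contains.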
I wanted to post this in the OpenCL forum, but I cannot seem to post there directly. 😞
Please make sure that you've followed these installation steps as described here: Installation Instructions for amdgpu Pro / amdgpu All Open Graphics Stacks
Also, please don't use the APP SDK with amdgpu-pro, and remove it if it is installed.
P.S. You've been whitelisted now.
Regards,
Thanks for the whitelisting. I pasted above what I did, which I got from the instructions you linked and "amdgpu-pro-install --help". Is that not correct?
I don't even know what this "APP SDK" you mention is or where to get it, so I don't think I have installed it.
I am using CentOS 7 with the CentOS kernel (couldn't get the kernel driver to build with the mainline 4.9.x LT kernel).
Here is another attempt:
# uname -r
3.10.0-693.17.1.el7.x86_64
[root@grumpy ~/ati/amdgpu-pro-17.50-511655]# ./amdgpu-pro-install --opencl=rocm --headless
[amdgpu-pro-local]
Name=AMD amdgpu Pro local repository
baseurl=file:///var/opt/amdgpu-pro-local
enabled=1
gpgcheck=0
Loaded plugins: fastestmirror
amdgpu-pro-local | 2.9 kB 00:00:00
[...]
Dependencies Resolved
====================================================================================================================================
Package Arch Version Repository Size
====================================================================================================================================
Installing:
rocm-amdgpu-pro x86_64 17.50-511655.el7 amdgpu-pro-local 2.3 k
Installing for dependencies:
amdgpu-core noarch 17.50-511655.el7 amdgpu-pro-local 2.2 k
amdgpu-pro-core noarch 17.50-511655.el7 amdgpu-pro-local 2.2 k
hsa-ext-amdgpu-pro-finalize x86_64 1.1.6-511655.el7 amdgpu-pro-local 2.9 M
hsa-ext-amdgpu-pro-image x86_64 1.1.6-511655.el7 amdgpu-pro-local 137 k
hsa-runtime-tools-amdgpu-pro x86_64 1.1.6-511655.el7 amdgpu-pro-local 512 k
rocm-amdgpu-pro-icd x86_64 17.50-511655.el7 amdgpu-pro-local 17 M
rocm-amdgpu-pro-opencl x86_64 17.50-511655.el7 amdgpu-pro-local 2.0 k
rocr-amdgpu-pro x86_64 1.1.6-511655.el7 amdgpu-pro-local 243 k
roct-amdgpu-pro x86_64 1.0.7-511655.el7 amdgpu-pro-local 47 k
Transaction Summary
====================================================================================================================================
Install 1 Package (+9 Dependent packages)
Total download size: 21 M
Installed size: 21 M
Is this ok [y/d/N]: y
Downloading packages:
------------------------------------------------------------------------------------------------------------------------------------
Total 173 MB/s | 21 MB 00:00:00
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
Installing : amdgpu-core-17.50-511655.el7.noarch 1/10
Installing : amdgpu-pro-core-17.50-511655.el7.noarch 2/10
Installing : roct-amdgpu-pro-1.0.7-511655.el7.x86_64 3/10
Installing : rocr-amdgpu-pro-1.1.6-511655.el7.x86_64 4/10
Installing : rocm-amdgpu-pro-opencl-17.50-511655.el7.x86_64 5/10
Installing : rocm-amdgpu-pro-icd-17.50-511655.el7.x86_64 6/10
Installing : hsa-ext-amdgpu-pro-finalize-1.1.6-511655.el7.x86_64 7/10
Installing : hsa-ext-amdgpu-pro-image-1.1.6-511655.el7.x86_64 8/10
Installing : hsa-runtime-tools-amdgpu-pro-1.1.6-511655.el7.x86_64 9/10
Installing : rocm-amdgpu-pro-17.50-511655.el7.x86_64 10/10
Verifying : hsa-ext-amdgpu-pro-finalize-1.1.6-511655.el7.x86_64 1/10
Verifying : rocr-amdgpu-pro-1.1.6-511655.el7.x86_64 2/10
Verifying : rocm-amdgpu-pro-icd-17.50-511655.el7.x86_64 3/10
Verifying : rocm-amdgpu-pro-17.50-511655.el7.x86_64 4/10
Verifying : amdgpu-pro-core-17.50-511655.el7.noarch 5/10
Verifying : rocm-amdgpu-pro-opencl-17.50-511655.el7.x86_64 6/10
Verifying : roct-amdgpu-pro-1.0.7-511655.el7.x86_64 7/10
Verifying : hsa-ext-amdgpu-pro-image-1.1.6-511655.el7.x86_64 8/10
Verifying : hsa-runtime-tools-amdgpu-pro-1.1.6-511655.el7.x86_64 9/10
Verifying : amdgpu-core-17.50-511655.el7.noarch 10/10
Installed:
rocm-amdgpu-pro.x86_64 0:17.50-511655.el7
Dependency Installed:
amdgpu-core.noarch 0:17.50-511655.el7 amdgpu-pro-core.noarch 0:17.50-511655.el7
hsa-ext-amdgpu-pro-finalize.x86_64 0:1.1.6-511655.el7 hsa-ext-amdgpu-pro-image.x86_64 0:1.1.6-511655.el7
hsa-runtime-tools-amdgpu-pro.x86_64 0:1.1.6-511655.el7 rocm-amdgpu-pro-icd.x86_64 0:17.50-511655.el7
rocm-amdgpu-pro-opencl.x86_64 0:17.50-511655.el7 rocr-amdgpu-pro.x86_64 0:1.1.6-511655.el7
roct-amdgpu-pro.x86_64 0:1.0.7-511655.el7
Complete!
[root@grumpy ~/ati/amdgpu-pro-17.50-511655]#
This doesn't seem to install the kernel driver at all, so:
[root@grumpy ~/ati/amdgpu-pro-17.50-511655]# yum install amdgpu-dkms
[...]
====================================================================================================================================
Package Arch Version Repository Size
====================================================================================================================================
Installing:
amdgpu-dkms noarch 17.50-511655.el7 amdgpu-pro-local 7.1 M
Transaction Summary
====================================================================================================================================
Install 1 Package
Total download size: 7.1 M
Installed size: 7.1 M
Is this ok [y/d/N]: y
Downloading packages:
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
Installing : amdgpu-dkms-17.50-511655.el7.noarch 1/1
Loading new amdgpu-17.50-511655.el7 DKMS files...
dpkg: warning: version '3.10.0-693.17.1.el7.x86_64' has bad syntax: invalid character in revision number
dpkg: warning: version '3.10.0-693.17.1.el7.x86_64' has bad syntax: invalid character in revision number
dpkg: warning: version '4.9.81-1.el7.centos.x86_64' has bad syntax: invalid character in revision number
dpkg: warning: version '3.10.0-693.17.1.el7.x86_64' has bad syntax: invalid character in revision number
Building for 3.10.0-693.17.1.el7.x86_64 4.9.81-1.el7.centos.x86_64
Building initial module for 3.10.0-693.17.1.el7.x86_64
Done.
Forcing installation of amdgpu
amdgpu:
Running module version sanity check.
- Original module
- No original module exists within this kernel
- Installation
- Installing to /lib/modules/3.10.0-693.17.1.el7.x86_64/extra/
amdttm.ko:
Running module version sanity check.
- Original module
- No original module exists within this kernel
- Installation
- Installing to /lib/modules/3.10.0-693.17.1.el7.x86_64/extra/
amdkcl.ko:
Running module version sanity check.
- Original module
- No original module exists within this kernel
- Installation
- Installing to /lib/modules/3.10.0-693.17.1.el7.x86_64/extra/
amdkfd.ko:
Running module version sanity check.
- Original module
- No original module exists within this kernel
- Installation
- Installing to /lib/modules/3.10.0-693.17.1.el7.x86_64/extra/
Adding any weak-modules
depmod....
Backing up initramfs-3.10.0-693.17.1.el7.x86_64.img to /boot/initramfs-3.10.0-693.17.1.el7.x86_64.img.old-dkms
Making new initramfs-3.10.0-693.17.1.el7.x86_64.img
(If next boot fails, revert to initramfs-3.10.0-693.17.1.el7.x86_64.img.old-dkms image)
dracut.......
DKMS: install completed.
Building initial module for 4.9.81-1.el7.centos.x86_64
Error! Bad return status for module build on kernel: 4.9.81-1.el7.centos.x86_64 (x86_64)
Consult /var/lib/dkms/amdgpu/17.50-511655.el7/build/make.log for more information.
warning: %post(amdgpu-dkms-0:17.50-511655.el7.noarch) scriptlet failed, exit status 10
Non-fatal POSTIN scriptlet failure in rpm package amdgpu-dkms-17.50-511655.el7.noarch
Verifying : amdgpu-dkms-17.50-511655.el7.noarch 1/1
Installed:
amdgpu-dkms.noarch 0:17.50-511655.el7
Complete!
[root@grumpy ~/ati/amdgpu-pro-17.50-511655]# lsmod | grep amdgpu
[root@grumpy ~/ati/amdgpu-pro-17.50-511655]# modprobe amdgpu
[root@grumpy ~/ati/amdgpu-pro-17.50-511655]# lsmod | grep amdgpu
amdgpu 3143876 2
amdttm 110970 1 amdgpu
amdkcl 24897 3 amdgpu,amdkfd,amdttm
i2c_algo_bit 13413 2 i915,amdgpu
drm_kms_helper 159169 3 i915,amdgpu,nvidia_drm
drm 370825 15 i915,drm_kms_helper,amdgpu,amdkcl,amdttm,nvidia_drm
i2c_core 40756 8 drm,i915,i2c_i801,i2c_hid,drm_kms_helper,i2c_algo_bit,amdgpu,nvidia
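The DKMS output above shows the build succeeding for the running 3.10.0 kernel and failing for the installed 4.9.81 kernel, so it can help to confirm that all four modules named in the log are actually on disk for the kernel in use. This is a minimal sketch, assuming the /lib/modules/<kernel>/extra/ layout seen in the log. `check_dkms_modules` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: report whether the DKMS-built modules named in the log exist
# for a given kernel.
# Assumption: modules are installed to <root>/<kernel>/extra/, as the
# DKMS output in this thread shows.
check_dkms_modules() {
    kernel="${1:-$(uname -r)}"
    root="${2:-/lib/modules}"
    for mod in amdgpu amdttm amdkcl amdkfd; do
        # Modules may be stored compressed (for example .ko.xz),
        # so match any suffix after .ko.
        if ls "$root/$kernel/extra/$mod".ko* >/dev/null 2>&1; then
            echo "$mod: present"
        else
            echo "$mod: missing"
        fi
    done
}

check_dkms_modules "$@"
```

With no arguments it checks the running kernel. Pass a kernel version as the first argument to check another one, for example the 4.9.81 kernel whose build failed.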
At this point there is no clinfo command available:
[root@grumpy ~]# yum install clinfo
[...]
====================================================================================================================================
Package Arch Version Repository Size
====================================================================================================================================
Installing:
clinfo x86_64 2.1.17.02.09-1.el7 epel 39 k
Transaction Summary
====================================================================================================================================
Install 1 Package
Total download size: 39 k
Installed size: 83 k
Is this ok [y/d/N]: y
Downloading packages:
clinfo-2.1.17.02.09-1.el7.x86_64.rpm | 39 kB 00:00:00
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
Installing : clinfo-2.1.17.02.09-1.el7.x86_64 1/1
Verifying : clinfo-2.1.17.02.09-1.el7.x86_64 1/1
Installed:
clinfo.x86_64 0:2.1.17.02.09-1.el7
Complete!
[root@grumpy ~]# clinfo
Number of platforms 0
[root@grumpy ~]# LD_LIBRARY_PATH=/opt/amdgpu-pro/lib64 /usr/local/bin/ethminer --list-devices
✘ 11:41:11|ethminer No OpenCL platforms found
Listing CUDA devices.
FORMAT: [deviceID] deviceName
[0] GeForce GTX 1070 Ti
Compute version: 6.1
cudaDeviceProp::totalGlobalMem: 8508145664
[1] GeForce GTX 980 Ti
Compute version: 5.2
cudaDeviceProp::totalGlobalMem: 6373572608
[2] GeForce GTX 1070 Ti
Compute version: 6.1
cudaDeviceProp::totalGlobalMem: 8508145664
[3] GeForce GTX 1070
Compute version: 6.1
cudaDeviceProp::totalGlobalMem: 8508145664
[4] GeForce GTX 1070 Ti
Compute version: 6.1
cudaDeviceProp::totalGlobalMem: 8508145664
[5] GeForce GTX 1070 Ti
Compute version: 6.1
cudaDeviceProp::totalGlobalMem: 8508145664
[root@grumpy ~]# yum install clinfo-amdgpu-pro-17.50-511655.el7.x86_64
[...]
====================================================================================================================================
Package Arch Version Repository Size
====================================================================================================================================
Installing:
clinfo-amdgpu-pro x86_64 17.50-511655.el7 amdgpu-pro-local 198 k
Installing for dependencies:
libopencl-amdgpu-pro x86_64 17.50-511655.el7 amdgpu-pro-local 11 k
libopencl-amdgpu-pro-icd x86_64 17.50-511655.el7 amdgpu-pro-local 29 M
Transaction Summary
====================================================================================================================================
Install 1 Package (+2 Dependent packages)
Total download size: 29 M
Installed size: 29 M
Is this ok [y/d/N]: y
Downloading packages:
------------------------------------------------------------------------------------------------------------------------------------
Total 345 MB/s | 29 MB 00:00:00
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
Installing : libopencl-amdgpu-pro-17.50-511655.el7.x86_64 1/3
Installing : libopencl-amdgpu-pro-icd-17.50-511655.el7.x86_64 2/3
Installing : clinfo-amdgpu-pro-17.50-511655.el7.x86_64 3/3
Verifying : libopencl-amdgpu-pro-17.50-511655.el7.x86_64 1/3
Verifying : libopencl-amdgpu-pro-icd-17.50-511655.el7.x86_64 2/3
Verifying : clinfo-amdgpu-pro-17.50-511655.el7.x86_64 3/3
Installed:
clinfo-amdgpu-pro.x86_64 0:17.50-511655.el7
Dependency Installed:
libopencl-amdgpu-pro.x86_64 0:17.50-511655.el7 libopencl-amdgpu-pro-icd.x86_64 0:17.50-511655.el7
Complete!
Once that is installed, everything just outright segfaults.
[root@grumpy ~]# clinfo
Segmentation fault
[root@grumpy ~]# /opt/amdgpu-pro/bin/clinfo
Segmentation fault
[root@grumpy ~]# LD_LIBRARY_PATH=/opt/amdgpu-pro/lib64 /usr/local/bin/ethminer --list-devices
Segmentation fault
If I remove all the Nvidia OpenCL libraries and re-run ldconfig, everything still segfaults.
If I remove the amdgpu-pro opencl libraries, there is no libOpenCL.so so nothing finds it (I removed the Nvidia one earlier):
[root@grumpy ~]# yum remove clinfo-amdgpu-pro-17.50-511655.el7.x86_64 libopencl-amdgpu-pro-icd-17.50-511655.el7.x86_64 libopencl-amdgpu-pro-17.50-511655.el7.x86_64
[...]
====================================================================================================================================
Package Arch Version Repository Size
====================================================================================================================================
Removing:
clinfo-amdgpu-pro x86_64 17.50-511655.el7 @amdgpu-pro-local 780 k
libopencl-amdgpu-pro x86_64 17.50-511655.el7 @amdgpu-pro-local 27 k
libopencl-amdgpu-pro-icd x86_64 17.50-511655.el7 @amdgpu-pro-local 102 M
Transaction Summary
====================================================================================================================================
Remove 3 Packages
Installed size: 103 M
Is this ok [y/N]: y
Downloading packages:
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
Erasing : clinfo-amdgpu-pro-17.50-511655.el7.x86_64 1/3
Erasing : libopencl-amdgpu-pro-icd-17.50-511655.el7.x86_64 2/3
Erasing : libopencl-amdgpu-pro-17.50-511655.el7.x86_64 3/3
Verifying : libopencl-amdgpu-pro-17.50-511655.el7.x86_64 1/3
Verifying : libopencl-amdgpu-pro-icd-17.50-511655.el7.x86_64 2/3
Verifying : clinfo-amdgpu-pro-17.50-511655.el7.x86_64 3/3
Removed:
clinfo-amdgpu-pro.x86_64 0:17.50-511655.el7 libopencl-amdgpu-pro.x86_64 0:17.50-511655.el7
libopencl-amdgpu-pro-icd.x86_64 0:17.50-511655.el7
Complete!
[root@grumpy ~]# clinfo
clinfo: error while loading shared libraries: libOpenCL.so.1: cannot open shared object file: No such file or directory
So which is the correct libOpenCL to use, and what package does it come from? The only one that ships with the driver packages results in nothing but segfaults.
At this point I'm reasonably sure this has nothing to do with interference from Nvidia drivers and libraries.
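To answer the "which libOpenCL, from which package" question empirically, one option is to ask the dynamic linker what a given binary resolves and then ask rpm who owns that file. The sketch below only parses `ldd`-style output. The helper name `opencl_lib_from_ldd` is invented here, and the commented usage lines assume the clinfo path used earlier in this thread.

```shell
#!/bin/sh
# Sketch: read ldd-style output on stdin and print the path that
# libOpenCL resolves to, or "not found" if the linker could not resolve it.
opencl_lib_from_ldd() {
    awk '$1 ~ /^libOpenCL\.so/ && $2 == "=>" {
        if ($3 == "not") print "not found"; else print $3
    }'
}

# Usage (adjust paths to your system):
#   lib=$(ldd /opt/amdgpu-pro/bin/clinfo | opencl_lib_from_ldd)
#   rpm -qf "$lib"                  # package that owns the resolved library
#   ldconfig -p | grep libOpenCL    # every libOpenCL in the linker cache
```

Comparing the result for each clinfo binary shows whether they load the same loader library or different ones.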
The APP SDK provides an OpenCL development environment for AMD platforms. It is no longer compatible with the amdgpu-pro drivers, which is why I asked you not to install it. On Linux, all the required libraries (such as libOpenCL) now come with amdgpu-pro itself.
As per the installation guide, I think you need to follow the steps below after extracting the driver package (for CentOS 7.4):
If you still see the issue, please report to our support forum here: Drivers & Software
Regards,
Right, but the only libOpenCL.so comes with the legacy OpenCL packages, not with the ROCm packages.
The installer help implies that ROCm should be used for Vega.
I followed the instructions exactly. The amdgpu-pro-preinstall.sh script only checks for and sets up needed repositories that aren't already there.
amdgpu-pro-preinstall.sh --check
reported everything was ready.
I installed, as you can see above, with:
amdgpu-pro-install -y --opencl=rocm --headless
because I _only_ want OpenCL support, rather than any of the Xorg drivers.
I'll repost in Drivers & Software.
Did you examine /var/lib/dkms/amdgpu/17.50-511655.el7/build/make.log ?
Probably something changed in the newer kernel and the source can no longer be compiled against it.
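A quick way to get at the cause is to pull the first few error lines out of that log rather than reading it whole. This is a rough sketch. `first_build_errors` is a hypothetical helper, and the patterns are a guess at what GCC and make typically print, so they may miss some failures.

```shell
#!/bin/sh
# Sketch: show the first few compiler/make errors in a build log,
# prefixed with their line numbers in the log.
# Usage: first_build_errors <logfile> [max_lines]
first_build_errors() {
    log="$1"
    # 'error:' is the usual GCC diagnostic; 'Error N' and '***' are
    # how make typically reports a failed target.
    grep -n -E 'error:|Error [0-9]+|\*\*\*' "$log" | head -n "${2:-20}"
}

# Example, using the log path from the DKMS message in this thread:
#   first_build_errors /var/lib/dkms/amdgpu/17.50-511655.el7/build/make.log
```

The earliest match is usually the one that matters, since later errors tend to cascade from it.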
I'm in the same boat; I've lost a lot of hair since Monday. Still no go.
Fedora 27 kernel 4.15.7-300.fc27.x86_64
On Linux, it's CentOS 7 or nothing, it seems. And even that only has a chance of properly working on a PCIe 3.0 system.
My advice would be to return the Vega to the place where you bought it as unfit for purpose, that's what I did.
This was my once a decade excursion back into AMD land from Nvidia, and AMD have _again_ spectacularly failed to produce a product that passes even my most basic fitness for purpose glance test.
Actually the drivers work on other systems too. For example, I'm writing this on a Fedora 27 with amdgpu 17.50.