2 Replies Latest reply on Mar 7, 2018 2:48 PM by ben-and-ellen

    Cannot Get OpenCL on Linux to Work At All

    powerload

      Hardware: Vega 56

      OS: CentOS 7 with the CentOS kernel (couldn't get the kernel driver to build with the mainline 4.9.x LT kernel).

       

      I cannot get the Vega recognized as an OpenCL device at all. Here is a trimmed-down terminal transcript of what I did.

      # uname -r

      3.10.0-693.17.1.el7.x86_64

      [root@grumpy ~/ati/amdgpu-pro-17.50-511655]# ./amdgpu-pro-install --opencl=rocm --headless

      [amdgpu-pro-local]

      Name=AMD amdgpu Pro local repository

      baseurl=file:///var/opt/amdgpu-pro-local

      enabled=1

      gpgcheck=0

       

       

      Loaded plugins: fastestmirror

      amdgpu-pro-local                                                                                            | 2.9 kB  00:00:00 

      [...]

      Dependencies Resolved

       

       

      ====================================================================================================================================

      Package                                  Arch                Version                          Repository                    Size

      ====================================================================================================================================

      Installing:

      rocm-amdgpu-pro                          x86_64              17.50-511655.el7                amdgpu-pro-local              2.3 k

      Installing for dependencies:

      amdgpu-core                              noarch              17.50-511655.el7                amdgpu-pro-local              2.2 k

      amdgpu-pro-core                          noarch              17.50-511655.el7                amdgpu-pro-local              2.2 k

      hsa-ext-amdgpu-pro-finalize              x86_64              1.1.6-511655.el7                amdgpu-pro-local              2.9 M

      hsa-ext-amdgpu-pro-image                  x86_64              1.1.6-511655.el7                amdgpu-pro-local              137 k

      hsa-runtime-tools-amdgpu-pro              x86_64              1.1.6-511655.el7                amdgpu-pro-local              512 k

      rocm-amdgpu-pro-icd                      x86_64              17.50-511655.el7                amdgpu-pro-local              17 M

      rocm-amdgpu-pro-opencl                    x86_64              17.50-511655.el7                amdgpu-pro-local              2.0 k

      rocr-amdgpu-pro                          x86_64              1.1.6-511655.el7                amdgpu-pro-local              243 k

      roct-amdgpu-pro                          x86_64              1.0.7-511655.el7                amdgpu-pro-local              47 k

       

       

      Transaction Summary

      ====================================================================================================================================

      Install  1 Package (+9 Dependent packages)

       

       

      Total download size: 21 M

      Installed size: 21 M

      Is this ok [y/d/N]: y

      Downloading packages:

      ------------------------------------------------------------------------------------------------------------------------------------

      Total                                                                                              173 MB/s |  21 MB  00:00:00 

      Running transaction check

      Running transaction test

      Transaction test succeeded

      Running transaction

        Installing : amdgpu-core-17.50-511655.el7.noarch                                                                            1/10

        Installing : amdgpu-pro-core-17.50-511655.el7.noarch                                                                        2/10

        Installing : roct-amdgpu-pro-1.0.7-511655.el7.x86_64                                                                        3/10

        Installing : rocr-amdgpu-pro-1.1.6-511655.el7.x86_64                                                                        4/10

        Installing : rocm-amdgpu-pro-opencl-17.50-511655.el7.x86_64                                                                  5/10

        Installing : rocm-amdgpu-pro-icd-17.50-511655.el7.x86_64                                                                    6/10

        Installing : hsa-ext-amdgpu-pro-finalize-1.1.6-511655.el7.x86_64                                                            7/10

        Installing : hsa-ext-amdgpu-pro-image-1.1.6-511655.el7.x86_64                                                                8/10

        Installing : hsa-runtime-tools-amdgpu-pro-1.1.6-511655.el7.x86_64                                                            9/10

        Installing : rocm-amdgpu-pro-17.50-511655.el7.x86_64                                                                        10/10

        Verifying  : hsa-ext-amdgpu-pro-finalize-1.1.6-511655.el7.x86_64                                                            1/10

        Verifying  : rocr-amdgpu-pro-1.1.6-511655.el7.x86_64                                                                        2/10

        Verifying  : rocm-amdgpu-pro-icd-17.50-511655.el7.x86_64                                                                    3/10

        Verifying  : rocm-amdgpu-pro-17.50-511655.el7.x86_64                                                                        4/10

        Verifying  : amdgpu-pro-core-17.50-511655.el7.noarch                                                                        5/10

        Verifying  : rocm-amdgpu-pro-opencl-17.50-511655.el7.x86_64                                                                  6/10

        Verifying  : roct-amdgpu-pro-1.0.7-511655.el7.x86_64                                                                        7/10

        Verifying  : hsa-ext-amdgpu-pro-image-1.1.6-511655.el7.x86_64                                                                8/10

        Verifying  : hsa-runtime-tools-amdgpu-pro-1.1.6-511655.el7.x86_64                                                            9/10

        Verifying  : amdgpu-core-17.50-511655.el7.noarch                                                                            10/10

       

       

      Installed:

        rocm-amdgpu-pro.x86_64 0:17.50-511655.el7                                                                                     

       

       

      Dependency Installed:

        amdgpu-core.noarch 0:17.50-511655.el7                              amdgpu-pro-core.noarch 0:17.50-511655.el7                 

        hsa-ext-amdgpu-pro-finalize.x86_64 0:1.1.6-511655.el7              hsa-ext-amdgpu-pro-image.x86_64 0:1.1.6-511655.el7         

        hsa-runtime-tools-amdgpu-pro.x86_64 0:1.1.6-511655.el7            rocm-amdgpu-pro-icd.x86_64 0:17.50-511655.el7             

        rocm-amdgpu-pro-opencl.x86_64 0:17.50-511655.el7                  rocr-amdgpu-pro.x86_64 0:1.1.6-511655.el7                 

        roct-amdgpu-pro.x86_64 0:1.0.7-511655.el7                     

       

       

      Complete!

      [root@grumpy ~/ati/amdgpu-pro-17.50-511655]#

       

      This doesn't seem to install the kernel driver at all, so:

       

      [root@grumpy ~/ati/amdgpu-pro-17.50-511655]# yum install amdgpu-dkms

      [...]

      ====================================================================================================================================

      Package                      Arch                    Version                            Repository                          Size

      ====================================================================================================================================

      Installing:

      amdgpu-dkms                  noarch                  17.50-511655.el7                  amdgpu-pro-local                  7.1 M

       

       

      Transaction Summary

      ====================================================================================================================================

      Install  1 Package

       

       

      Total download size: 7.1 M

      Installed size: 7.1 M

      Is this ok [y/d/N]: y

      Downloading packages:

      Running transaction check

      Running transaction test

      Transaction test succeeded

      Running transaction

        Installing : amdgpu-dkms-17.50-511655.el7.noarch                                                                              1/1

      Loading new amdgpu-17.50-511655.el7 DKMS files...

      dpkg: warning: version '3.10.0-693.17.1.el7.x86_64' has bad syntax: invalid character in revision number

      dpkg: warning: version '3.10.0-693.17.1.el7.x86_64' has bad syntax: invalid character in revision number

      dpkg: warning: version '4.9.81-1.el7.centos.x86_64' has bad syntax: invalid character in revision number

      dpkg: warning: version '3.10.0-693.17.1.el7.x86_64' has bad syntax: invalid character in revision number

      Building for 3.10.0-693.17.1.el7.x86_64 4.9.81-1.el7.centos.x86_64

      Building initial module for 3.10.0-693.17.1.el7.x86_64

      Done.

      Forcing installation of amdgpu

       

       

      amdgpu:

      Running module version sanity check.

      - Original module

        - No original module exists within this kernel

      - Installation

        - Installing to /lib/modules/3.10.0-693.17.1.el7.x86_64/extra/

       

       

      amdttm.ko:

      Running module version sanity check.

      - Original module

        - No original module exists within this kernel

      - Installation

        - Installing to /lib/modules/3.10.0-693.17.1.el7.x86_64/extra/

       

       

      amdkcl.ko:

      Running module version sanity check.

      - Original module

        - No original module exists within this kernel

      - Installation

        - Installing to /lib/modules/3.10.0-693.17.1.el7.x86_64/extra/

       

       

      amdkfd.ko:

      Running module version sanity check.

      - Original module

        - No original module exists within this kernel

      - Installation

        - Installing to /lib/modules/3.10.0-693.17.1.el7.x86_64/extra/

      Adding any weak-modules

       

       

      depmod....

       

       

      Backing up initramfs-3.10.0-693.17.1.el7.x86_64.img to /boot/initramfs-3.10.0-693.17.1.el7.x86_64.img.old-dkms

      Making new initramfs-3.10.0-693.17.1.el7.x86_64.img

      (If next boot fails, revert to initramfs-3.10.0-693.17.1.el7.x86_64.img.old-dkms image)

      dracut.......

       

       

      DKMS: install completed.

      Building initial module for 4.9.81-1.el7.centos.x86_64

      Error! Bad return status for module build on kernel: 4.9.81-1.el7.centos.x86_64 (x86_64)

      Consult /var/lib/dkms/amdgpu/17.50-511655.el7/build/make.log for more information.

      warning: %post(amdgpu-dkms-0:17.50-511655.el7.noarch) scriptlet failed, exit status 10

      Non-fatal POSTIN scriptlet failure in rpm package amdgpu-dkms-17.50-511655.el7.noarch

        Verifying  : amdgpu-dkms-17.50-511655.el7.noarch                                                                              1/1

       

       

      Installed:

        amdgpu-dkms.noarch 0:17.50-511655.el7                                                                                         

       

       

      Complete!
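The DKMS build for 4.9.81-1.el7.centos.x86_64 failed in the transcript above, and the message points to make.log. Here is a minimal Python sketch for pulling the first few error lines out of that log. `first_errors` is a hypothetical helper, and the matching pattern (the word "error" or make's `***` marker) is a heuristic assumption, not a DKMS feature.

```python
import re


def first_errors(log_text, limit=5):
    """Return up to `limit` lines that look like compiler or make errors."""
    # Heuristic: the standalone word "error" (any case) or make's "***" marker.
    pattern = re.compile(r"(\berror\b|\*\*\*)", re.IGNORECASE)
    hits = [line.rstrip() for line in log_text.splitlines() if pattern.search(line)]
    return hits[:limit]


if __name__ == "__main__":
    # Path taken from the DKMS message in the transcript.
    path = "/var/lib/dkms/amdgpu/17.50-511655.el7/build/make.log"
    try:
        with open(path, errors="replace") as fh:
            for line in first_errors(fh.read()):
                print(line)
    except OSError as exc:
        print("could not read", path, "-", exc)
```

Since the module did build and install for 3.10.0-693.17.1.el7.x86_64, this failure probably only matters when booting the 4.9 kernel.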

       

      [root@grumpy ~/ati/amdgpu-pro-17.50-511655]# lsmod | grep amdgpu

      [root@grumpy ~/ati/amdgpu-pro-17.50-511655]# modprobe amdgpu

      [root@grumpy ~/ati/amdgpu-pro-17.50-511655]# lsmod | grep amdgpu

      amdgpu              3143876  2

      amdttm                110970  1 amdgpu

      amdkcl                24897  3 amdgpu,amdkfd,amdttm

      i2c_algo_bit          13413  2 i915,amdgpu

      drm_kms_helper        159169  3 i915,amdgpu,nvidia_drm

      drm                  370825  15 i915,drm_kms_helper,amdgpu,amdkcl,amdttm,nvidia_drm

      i2c_core              40756  8 drm,i915,i2c_i801,i2c_hid,drm_kms_helper,i2c_algo_bit,amdgpu,nvidia

       

      At this point there is no clinfo command available:

       

      [root@grumpy ~]# yum install clinfo

      [...]

      ====================================================================================================================================

      Package                    Arch                        Version                                    Repository                Size

      ====================================================================================================================================

      Installing:

      clinfo                      x86_64                      2.1.17.02.09-1.el7                        epel                      39 k

       

       

      Transaction Summary

      ====================================================================================================================================

      Install  1 Package

       

       

      Total download size: 39 k

      Installed size: 83 k

      Is this ok [y/d/N]: y

      Downloading packages:

      clinfo-2.1.17.02.09-1.el7.x86_64.rpm                                                                        |  39 kB  00:00:00 

      Running transaction check

      Running transaction test

      Transaction test succeeded

      Running transaction

        Installing : clinfo-2.1.17.02.09-1.el7.x86_64                                                                                1/1

        Verifying  : clinfo-2.1.17.02.09-1.el7.x86_64                                                                                1/1

       

       

      Installed:

        clinfo.x86_64 0:2.1.17.02.09-1.el7                                                                                             

       

       

      Complete!

      [root@grumpy ~]# clinfo

      Number of platforms                              0
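A result of 0 platforms generally means the OpenCL ICD loader found no vendor entries it could load. On Linux the loader conventionally reads `*.icd` files from /etc/OpenCL/vendors, each containing the name or path of a vendor library. The following is a minimal Python sketch for listing what is registered. `list_icd_entries` is a hypothetical helper name, and the default directory is an assumption about this system rather than something shown in the transcript.

```python
import os


def list_icd_entries(vendors_dir="/etc/OpenCL/vendors"):
    """Return (icd_file, library, found) tuples for each *.icd file in vendors_dir."""
    entries = []
    if not os.path.isdir(vendors_dir):
        return entries
    for name in sorted(os.listdir(vendors_dir)):
        if not name.endswith(".icd"):
            continue
        with open(os.path.join(vendors_dir, name)) as fh:
            library = fh.read().strip()
        # An absolute path can be checked directly. A bare library name is
        # resolved by the dynamic linker at load time, so it is reported as None.
        found = os.path.exists(library) if os.path.isabs(library) else None
        entries.append((name, library, found))
    return entries


if __name__ == "__main__":
    for icd, lib, found in list_icd_entries():
        print(icd, "->", lib, "(exists: %s)" % found)
```

If this prints nothing, no vendor is registered, which would be consistent with clinfo reporting 0 platforms.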

       

      [root@grumpy ~]# LD_LIBRARY_PATH=/opt/amdgpu-pro/lib64 /usr/local/bin/ethminer --list-devices

        ✘  11:41:11|ethminer  No OpenCL platforms found

       

       

      Listing CUDA devices.

      FORMAT: [deviceID] deviceName

      [0] GeForce GTX 1070 Ti

      Compute version: 6.1

      cudaDeviceProp::totalGlobalMem: 8508145664

      [1] GeForce GTX 980 Ti

      Compute version: 5.2

      cudaDeviceProp::totalGlobalMem: 6373572608

      [2] GeForce GTX 1070 Ti

      Compute version: 6.1

      cudaDeviceProp::totalGlobalMem: 8508145664

      [3] GeForce GTX 1070

      Compute version: 6.1

      cudaDeviceProp::totalGlobalMem: 8508145664

      [4] GeForce GTX 1070 Ti

      Compute version: 6.1

      cudaDeviceProp::totalGlobalMem: 8508145664

      [5] GeForce GTX 1070 Ti

      Compute version: 6.1

      cudaDeviceProp::totalGlobalMem: 8508145664

       

      [root@grumpy ~]# yum install clinfo-amdgpu-pro-17.50-511655.el7.x86_64

      [...]

      ====================================================================================================================================

      Package                                Arch                Version                          Repository                      Size

      ====================================================================================================================================

      Installing:

      clinfo-amdgpu-pro                      x86_64              17.50-511655.el7                  amdgpu-pro-local              198 k

      Installing for dependencies:

      libopencl-amdgpu-pro                  x86_64              17.50-511655.el7                  amdgpu-pro-local                11 k

      libopencl-amdgpu-pro-icd              x86_64              17.50-511655.el7                  amdgpu-pro-local                29 M

       

       

      Transaction Summary

      ====================================================================================================================================

      Install  1 Package (+2 Dependent packages)

       

       

      Total download size: 29 M

      Installed size: 29 M

      Is this ok [y/d/N]: y

      Downloading packages:

      ------------------------------------------------------------------------------------------------------------------------------------

      Total                                                                                              345 MB/s |  29 MB  00:00:00 

      Running transaction check

      Running transaction test

      Transaction test succeeded

      Running transaction

        Installing : libopencl-amdgpu-pro-17.50-511655.el7.x86_64                                                                    1/3

        Installing : libopencl-amdgpu-pro-icd-17.50-511655.el7.x86_64                                                                2/3

        Installing : clinfo-amdgpu-pro-17.50-511655.el7.x86_64                                                                        3/3

        Verifying  : libopencl-amdgpu-pro-17.50-511655.el7.x86_64                                                                    1/3

        Verifying  : libopencl-amdgpu-pro-icd-17.50-511655.el7.x86_64                                                                2/3

        Verifying  : clinfo-amdgpu-pro-17.50-511655.el7.x86_64                                                                        3/3

       

       

      Installed:

        clinfo-amdgpu-pro.x86_64 0:17.50-511655.el7                                                                                   

       

       

      Dependency Installed:

        libopencl-amdgpu-pro.x86_64 0:17.50-511655.el7                libopencl-amdgpu-pro-icd.x86_64 0:17.50-511655.el7             

       

       

      Complete!

       

      Once that is installed, everything just outright segfaults.

       

      [root@grumpy ~]# clinfo

      Segmentation fault

       

      [root@grumpy ~]# /opt/amdgpu-pro/bin/clinfo

      Segmentation fault

       

      [root@grumpy ~]# LD_LIBRARY_PATH=/opt/amdgpu-pro/lib64 /usr/local/bin/ethminer --list-devices

      Segmentation fault

       

      If I remove all the Nvidia OpenCL libraries and re-run ldconfig, everything still segfaults.

       

      If I remove the amdgpu-pro OpenCL libraries, there is no libOpenCL.so left, so nothing can find it (I removed the Nvidia one earlier):

       

      [root@grumpy ~]# yum remove clinfo-amdgpu-pro-17.50-511655.el7.x86_64 libopencl-amdgpu-pro-icd-17.50-511655.el7.x86_64 libopencl-amdgpu-pro-17.50-511655.el7.x86_64

       

      [...]

      ====================================================================================================================================

      Package                                Arch                 Version                          Repository                       Size

      ====================================================================================================================================

      Removing:

      clinfo-amdgpu-pro                      x86_64               17.50-511655.el7                 @amdgpu-pro-local               780 k

      libopencl-amdgpu-pro                   x86_64               17.50-511655.el7                 @amdgpu-pro-local                27 k

      libopencl-amdgpu-pro-icd               x86_64               17.50-511655.el7                 @amdgpu-pro-local               102 M

       

       

      Transaction Summary

      ====================================================================================================================================

      Remove  3 Packages

       

       

      Installed size: 103 M

      Is this ok [y/N]: y

      Downloading packages:

      Running transaction check

      Running transaction test

      Transaction test succeeded

      Running transaction

        Erasing    : clinfo-amdgpu-pro-17.50-511655.el7.x86_64                                                                        1/3

        Erasing    : libopencl-amdgpu-pro-icd-17.50-511655.el7.x86_64                                                                 2/3

        Erasing    : libopencl-amdgpu-pro-17.50-511655.el7.x86_64                                                                     3/3

        Verifying  : libopencl-amdgpu-pro-17.50-511655.el7.x86_64                                                                     1/3

        Verifying  : libopencl-amdgpu-pro-icd-17.50-511655.el7.x86_64                                                                 2/3

        Verifying  : clinfo-amdgpu-pro-17.50-511655.el7.x86_64                                                                        3/3

       

       

      Removed:

        clinfo-amdgpu-pro.x86_64 0:17.50-511655.el7                        libopencl-amdgpu-pro.x86_64 0:17.50-511655.el7              

        libopencl-amdgpu-pro-icd.x86_64 0:17.50-511655.el7              

       

       

      Complete!

      [root@grumpy ~]# clinfo

      clinfo: error while loading shared libraries: libOpenCL.so.1: cannot open shared object file: No such file or directory
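The error above says the dynamic linker cannot find any libOpenCL.so.1. One way to see which copies the linker cache knows about is to read the output of `ldconfig -p`. Here is a minimal Python sketch under that assumption. `providers` is a hypothetical helper, and libraries that are only reachable through LD_LIBRARY_PATH would not appear in this listing.

```python
import subprocess


def providers(ldconfig_output, soname="libOpenCL.so.1"):
    """Return the paths that `ldconfig -p` output lists for the given soname."""
    paths = []
    for line in ldconfig_output.splitlines():
        # Cache lines look like: "\tlibfoo.so.1 (libc6,x86-64) => /usr/lib64/libfoo.so.1"
        left, sep, right = line.partition(" => ")
        fields = left.split()
        if sep and fields and fields[0] == soname:
            paths.append(right.strip())
    return paths


if __name__ == "__main__":
    try:
        result = subprocess.run(["ldconfig", "-p"], stdout=subprocess.PIPE,
                                universal_newlines=True)
        for path in providers(result.stdout):
            print(path)
    except OSError as exc:
        print("could not run ldconfig:", exc)
```

Running `rpm -qf` on any path it prints should show which package owns that copy.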

       

       

      So which is the correct libOpenCL to use, and what package does it come from? The only one that ships with the driver packages results in nothing but segfaults.

      At this point I'm reasonably sure this has nothing to do with interference from Nvidia drivers and libraries.

       

      Note: I deleted the amdgpu kernel driver that ships with the kernel so that it wouldn't clash with the one built using DKMS from the driver bundle.
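To double-check which amdgpu module files are actually present after removing the in-kernel one, something like the following could be used. It is a rough sketch: `find_module_files` is a hypothetical helper, and the suffix list (.ko, .ko.xz, .ko.gz) is an assumption about how modules are packaged. `modinfo -n amdgpu` reports the single file that modprobe would pick.

```python
import os


def find_module_files(modules_root, module="amdgpu"):
    """Return sorted paths of files under modules_root named <module>.ko[.xz|.gz]."""
    names = {module + ".ko", module + ".ko.xz", module + ".ko.gz"}
    found = []
    for dirpath, _dirs, files in os.walk(modules_root):
        for filename in files:
            if filename in names:
                found.append(os.path.join(dirpath, filename))
    return sorted(found)


if __name__ == "__main__":
    # Module tree of the running kernel, e.g. /lib/modules/3.10.0-693.17.1.el7.x86_64
    root = os.path.join("/lib/modules", os.uname().release)
    for path in find_module_files(root):
        print(path)
```

Going by the DKMS log, the only hit on the 3.10 kernel should be under /lib/modules/3.10.0-693.17.1.el7.x86_64/extra/.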

        • Re: Cannot Get OpenCL on Linux to Work At All
          powerload

          Just tried it on a completely different machine, with no Nvidia cards or drivers on it, and the results are the same.

          The distro clinfo reports 0 platforms.

          The amdgpu-pro clinfo segfaults.

           

          How on earth can there be such a shortage of AMD GPUs when the software stack just doesn't work?

            • Re: Cannot Get OpenCL on Linux to Work At All
              ben-and-ellen

              I'm using an RX 580 and a GTX 1070 on Ubuntu and trying to get OpenCL to work. I'm also trying to understand why I get an "AMD ADL library not found." message. Maybe something in here will help...

               

              After messing about for a few weeks with different drivers I did a fresh install today:

              Ubuntu 16.04.03

              $ uname -r

              4.4.0-116-generic kernel

              Then updated with

              sudo apt-get update && sudo apt-get -y dist-upgrade

               

              I've got an RX 580. I installed Ubuntu with no graphics card fitted, using the onboard Intel graphics. Then I installed the RX 580 in the PCIe x16 slot and attached the monitor to it.

              Installed drivers with:

              ./amdgpu-install --opencl=legacy

              $ ethminer --list-devices

              Listing OpenCL devices.

              FORMAT: [platformID] [deviceID] deviceName

              [0] [0] GeForce GTX 1070 Ti

                      CL_DEVICE_TYPE: GPU

                      CL_DEVICE_GLOBAL_MEM_SIZE: 8513978368

                      CL_DEVICE_MAX_MEM_ALLOC_SIZE: 2128494592

                      CL_DEVICE_MAX_WORK_GROUP_SIZE: 1024

              [0] [1] GeForce GTX 1070 Ti

                      CL_DEVICE_TYPE: GPU

                      CL_DEVICE_GLOBAL_MEM_SIZE: 8513978368

                      CL_DEVICE_MAX_MEM_ALLOC_SIZE: 2128494592

                      CL_DEVICE_MAX_WORK_GROUP_SIZE: 1024

              [1] [0] Ellesmere

                      CL_DEVICE_TYPE: GPU

                      CL_DEVICE_GLOBAL_MEM_SIZE: 5877137408

                      CL_DEVICE_MAX_MEM_ALLOC_SIZE: 4244635648

                      CL_DEVICE_MAX_WORK_GROUP_SIZE: 256

              Listing CUDA devices.

              FORMAT: [deviceID] deviceName

              [0] GeForce GTX 1070 Ti

                      Compute version: 6.1

                      cudaDeviceProp::totalGlobalMem: 8513978368

                      Pci: 0000:0c:00

              [1] GeForce GTX 1070 Ti

                      Compute version: 6.1

                      cudaDeviceProp::totalGlobalMem: 8513978368

                      Pci: 0000:0e:00

               

              Although this lists the OpenCL-capable devices, I don't think OpenCL is actually being used.

              clinfo is not available at this point in the installation.