9 Replies Latest reply on Jan 7, 2018 7:07 PM by markswanson

    Fixes for AMD installer (Ubuntu 16.04 x86_64 dec 30,2017)

    markswanson

      I just did a fresh install of Ubuntu 16.04 to a new partition, and did an 'apt upgrade ; apt dist-upgrade' to make sure it was perfectly up to date.

       

      After installation finishes, are these steps correct to get AMD-APP-SDK working?

       

      1. install amdgpu

      2. install AMD-APP-SDK-v3*

       

      At this stage, several things are missing:

      * LD_LIBRARY_PATH is missing /opt/AMDAPPSDK-3.0/lib/x86_64/

      * The directory /opt/AMDAPPSDK-3.0/lib/x86_64/ contains a dangling symbolic link: libOpenCL.so points to /usr/lib/libOpenCL.so.1, which doesn't exist. Run from inside that directory, this fixes it:

        $ rm libOpenCL.so

        $ ln -sf sdk/libOpenCL.so.1 libOpenCL.so

        $ ln -sf sdk/libOpenCL.so.1 libOpenCL.so.1
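For anyone checking whether their own install has the same broken link, here is a minimal sketch, assuming a POSIX shell with readlink available. The function name find_dangling_links is made up for illustration and is not part of the SDK or the installer.

```shell
# Print every symbolic link directly inside DIR whose target does not exist.
# Usage: find_dangling_links DIR
find_dangling_links() {
    dir=$1
    for entry in "$dir"/*; do
        # -L: the entry itself is a symlink; ! -e: following it reaches nothing
        if [ -L "$entry" ] && [ ! -e "$entry" ]; then
            printf '%s -> %s\n' "$entry" "$(readlink "$entry")"
        fi
    done
}
```

Calling it as find_dangling_links /opt/AMDAPPSDK-3.0/lib/x86_64 should list libOpenCL.so if the link is still broken, and print nothing once it has been repointed.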

       

      However, 'clinfo' fails with:

      clinfo

      terminate called after throwing an instance of 'cl::Error'

        what():  clGetPlatformIDs

      Aborted (core dumped)

       

      What did I miss or do wrong? I'm trying to get 'clinfo' working.

       

      Thanks.

        • Re: Fixes for AMD installer (Ubuntu 16.04 x86_64 dec 30,2017)
          markswanson

          I'd just like to add that strace revealed that clinfo was trying to find libamdocl64.so, but that needs another LD_LIBRARY_PATH change:

           

          export LD_LIBRARY_PATH=/opt/AMDAPPSDK-3.0/lib/x86_64/:/opt/AMDAPPSDK-3.0/lib/x86_64/sdk

           

          clinfo works now.
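Note that the export above replaces any existing LD_LIBRARY_PATH. A minimal sketch of a helper that appends a directory only if it is not already listed, assuming a POSIX shell; append_path is a hypothetical name, not something shipped with the SDK.

```shell
# Append DIR to a colon-separated path list unless it is already present.
# Usage: LD_LIBRARY_PATH=$(append_path "$LD_LIBRARY_PATH" /some/dir)
append_path() {
    list=$1
    dir=${2%/}    # drop one trailing slash so /a/ and /a compare equal
    case ":$list:" in
        *":$dir:"*|*":$dir/:"*) printf '%s\n' "$list" ;;      # already listed
        "::")                   printf '%s\n' "$dir" ;;       # list was empty
        *)                      printf '%s\n' "$list:$dir" ;; # append at the end
    esac
}
```

With it, the two SDK directories can be added one after the other, for example LD_LIBRARY_PATH=$(append_path "$LD_LIBRARY_PATH" /opt/AMDAPPSDK-3.0/lib/x86_64/sdk) followed by export LD_LIBRARY_PATH, without losing entries that were already set.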

          Also, xmr-stak gets a little further: Found AMD platform index id = 0, name = Advanced Micro Devices, Inc.

          Great!

           

          However, the next log line that xmr-stak prints is:

          WARNING: CL_DEVICE_NOT_FOUND when calling clGetDeviceIDs for of devices.

           

          :-(

            • Re: Fixes for AMD installer (Ubuntu 16.04 x86_64 dec 30,2017)
              markswanson

              Strange, I don't think that clinfo (or the AMD libraries) are actually finding my AMD RX 580.

               

              lspci -v shows:

              04:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Device 67df (rev e7) (prog-if 00 [VGA controller])

                  Subsystem: Micro-Star International Co., Ltd. [MSI] Device 3418

                  Flags: bus master, fast devsel, latency 0, IRQ 30

                  Memory at d0000000 (64-bit, prefetchable) [size=256M]

                  Memory at cfe00000 (64-bit, prefetchable) [size=2M]

                  I/O ports at e000 [size=256]

                  Memory at feb80000 (32-bit, non-prefetchable) [size=256K]

                  Expansion ROM at febc0000 [disabled] [size=128K]

                  Capabilities: <access denied>

                  Kernel driver in use: amdgpu

                  Kernel modules: amdgpu

               

              But strangely, clinfo reports an Intel CPU as its only device:

               

                Name:     Intel(R) Core(TM)2 Duo CPU E7500  @ 2.93GHz

               

              $ clinfo

              Number of platforms:                 1

                Platform Profile:                 FULL_PROFILE

                Platform Version:                 OpenCL 2.0 AMD-APP (1800.8)

                Platform Name:                 AMD Accelerated Parallel Processing

                Platform Vendor:                 Advanced Micro Devices, Inc.

                Platform Extensions:                 cl_khr_icd cl_amd_event_callback cl_amd_offline_devices

               

               

                Platform Name:                 AMD Accelerated Parallel Processing

              Number of devices:                 1

                Device Type:                     CL_DEVICE_TYPE_CPU

                Vendor ID:                     1002h

                Board name:

                Max compute units:                 2

                Max work items dimensions:             3

                  Max work items[0]:                 1024

                  Max work items[1]:                 1024

                  Max work items[2]:                 1024

                Max work group size:                 1024

                Preferred vector width char:             16

                Preferred vector width short:             8

                Preferred vector width int:             4

                Preferred vector width long:             2

                Preferred vector width float:             4

                Preferred vector width double:         2

                Native vector width char:             16

                Native vector width short:             8

                Native vector width int:             4

                Native vector width long:             2

                Native vector width float:             4

                Native vector width double:             2

                Max clock frequency:                 1603Mhz

                Address bits:                     64

                Max memory allocation:             2087362560

                Image support:                 Yes

                Max number of images read arguments:         128

                Max number of images write arguments:         64

                Max image 2D width:                 8192

                Max image 2D height:                 8192

                Max image 3D width:                 2048

                Max image 3D height:                 2048

                Max image 3D depth:                 2048

                Max samplers within kernel:             16

                Max size of kernel argument:             4096

                Alignment (bits) of base address:         1024

                Minimum alignment (bytes) for any datatype:     128

                Single precision floating point capability

                  Denorms:                     Yes

                  Quiet NaNs:                     Yes

                  Round to nearest even:             Yes

                  Round to zero:                 Yes

                  Round to +ve and infinity:             Yes

                  IEEE754-2008 fused multiply-add:         Yes

                Cache type:                     Read/Write

                Cache line size:                 64

                Cache size:                     32768

                Global memory size:                 2087362560

                Constant buffer size:                 65536

                Max number of constant args:             8

                Local memory type:                 Global

                Local memory size:                 32768

                Max pipe arguments:                 16

                Max pipe active reservations:             16

                Max pipe packet size:                 2087362560

                Max global variable size:             1879048192

                Max global variable preferred total size:     1879048192

                Max read/write image args:             64

                Max on device events:                 0

                Queue on device max size:             0

                Max on device queues:                 0

                Queue on device preferred size:         0

                SVM capabilities:

                  Coarse grain buffer:             No

                  Fine grain buffer:                 No

                  Fine grain system:                 No

                  Atomics:                     No

                Preferred platform atomic alignment:         0

                Preferred global atomic alignment:         0

                Preferred local atomic alignment:         0

                Kernel Preferred work group size multiple:     1

                Error correction support:             0

                Unified memory for Host and Device:         1

                Profiling timer resolution:             1

                Device endianess:                 Little

                Available:                     Yes

                Compiler available:                 Yes

                Execution capabilities:

                  Execute OpenCL kernels:             Yes

                  Execute native function:             Yes

                Queue on Host properties:

                  Out-of-Order:                 No

                  Profiling :                     Yes

                Queue on Device properties:

                  Out-of-Order:                 No

                  Profiling :                     No

                Platform ID:                     0x7fb886653430

                Name:                         Intel(R) Core(TM)2 Duo CPU     E7500  @ 2.93GHz

                Vendor:                     GenuineIntel

                Device OpenCL C version:             OpenCL C 1.2

                Driver version:                 1800.8 (sse2)

                Profile:                     FULL_PROFILE

                Version:                     OpenCL 1.2 AMD-APP (1800.8)

                Extensions:                     cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_gl_sharing cl_ext_device_fission cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_media_ops2 cl_amd_popcnt cl_khr_spir cl_khr_gl_event

               

               

              Are the AMD drivers finding the RX 580 at all?
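The long clinfo dump boils down to one question: which device types were found? A minimal sketch that condenses it, assuming awk is available and that the output follows the layout shown in this post (a "Device Type:" line followed later by a "Name:" line for each device). summarize_clinfo is a hypothetical helper name.

```shell
# Read clinfo text on stdin and print one "type<TAB>name" line per device.
# "Platform Name:" and "Board name:" lines are ignored on purpose.
summarize_clinfo() {
    awk '
        /^[ \t]*Device Type:/ {
            sub(/^[ \t]*Device Type:[ \t]*/, "")
            sub(/[ \t\r]+$/, "")
            type = $0
            next
        }
        /^[ \t]*Name:/ && type != "" {
            sub(/^[ \t]*Name:[ \t]*/, "")
            sub(/[ \t\r]+$/, "")
            gsub(/[ \t]+/, " ")      # collapse runs of spaces in the name
            print type "\t" $0
            type = ""
        }
    '
}
```

Run as clinfo | summarize_clinfo. For the output in this post it would print a single CL_DEVICE_TYPE_CPU line; a working GPU setup should also show a CL_DEVICE_TYPE_GPU line.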

            • Re: Fixes for AMD installer (Ubuntu 16.04 x86_64 dec 30,2017)
              markswanson

              I'd just like to add a few things:

               

              I've read that the 'pro' driver includes OpenCL. I thought that perhaps that's why my RX 580 isn't being found via 'clinfo'.

              I downloaded and installed the AMDGPU-Pro Beta Mining Driver (17.40-483984) (using amdgpu-pro-install -y).

              The mining driver seems to be initialized fine (dmesg, lsmod -v). However, clinfo still doesn't show the RX 580 in its list of OpenCL devices and no mining software finds the RX 580.

               

              I'm not sure what to try next.

              My system is using an Intel GPU for X11. My plan was to have the RX 580 just sit there and mine (not connected to any monitor).

              Are there any BIOS settings at play?

              Any and all suggestions warmly welcomed.

               

              Thanks!

              • Re: Fixes for AMD installer (Ubuntu 16.04 x86_64 dec 30,2017)
                markswanson

                It seems that neither of the AMD-PRO drivers works with X11. Both segfault like this:

                 

                [7.177] (EE) Backtrace:
                [7.177] (EE) 0: /usr/lib/xorg/Xorg (xorg_backtrace+0x4e) [0x561a492ad6ce]
                [7.177] (EE) 1: /usr/lib/xorg/Xorg (0x561a490fb000+0x1b6a69) [0x561a492b1a69]
                [7.177] (EE) 2: /lib/x86_64-linux-gnu/libc.so.6 (0x7f1dcf7c6000+0x354b0) [0x7f1dcf7fb4b0]
                [7.177] (EE) 3: /opt/amdgpu/lib/xorg/modules/libglamoregl.so (glamor_init+0x168) [0x7f1dcc14f338]
                [7.177] (EE) 4: /usr/lib/xorg/modules/drivers/modesetting_drv.so (0x7f1dcd18d000+0x7860) [0x7f1dcd194860]
                [7.177] (EE) 5: /usr/lib/xorg/Xorg (AddGPUScreen+0x11a) [0x561a4914f2da]
                [7.177] (EE) 6: /usr/lib/xorg/Xorg (InitOutput+0x289) [0x561a491933b9]
                [7.177] (EE) 7: /usr/lib/xorg/Xorg (0x561a490fb000+0x57c24) [0x561a49152c24]
                [7.177] (EE) 8: /lib/x86_64-linux-gnu/libc.so.6 (__libc_start_main+0xf0) [0x7f1dcf7e6830]
                [7.177] (EE) 9: /usr/lib/xorg/Xorg (_start+0x29) [0x561a4913d069]
                [7.177] (EE)
                [7.177] (EE) Segmentation fault at address 0x561a4929b3b0
                [7.177] (EE)

                Fatal server error:
                [7.177] (EE) Caught signal 11 (Segmentation fault). Server aborting

                 

                I also see an ABI mismatch in the X11 logs (causing a segfault):

                 

                [ 6.761] (II) Module amdgpu: vendor="X.Org Foundation"
                [ 6.761]    compiled for 1.19.3, module version = 1.4.0
                [ 6.761]    Module class: X.Org Video Driver
                [ 6.761]    ABI class: X.Org Video Driver, version 23.0
                [ 6.761] (EE) module ABI major version (23) doesn't match the server's version (20)
                [ 6.761] (II) UnloadModule: "amdgpu"
                [ 6.761] (II) Unloading amdgpu
                [ 6.761] (EE) Failed to load module "amdgpu" (module requirement mismatch, 0)
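To pull every such mismatch out of a full X log in one go, here is a minimal sketch. It assumes awk is available and that the log lines look like the excerpts in this post; xorg_abi_mismatches is a hypothetical helper name.

```shell
# Read an Xorg log on stdin and print one line per module that was rejected
# because of an ABI major version mismatch.
xorg_abi_mismatches() {
    # Splitting on parentheses puts the module ABI in $4 and the server ABI in $6
    # on the "(EE) module ABI major version (N) ... (M)" line.
    awk -F'[()]' '
        /module ABI major version/ { module_abi = $4; server_abi = $6 }
        /Failed to load module/ && module_abi != "" {
            split($0, quoted, "\"")      # quoted[2] is the module name
            print quoted[2] ": module ABI " module_abi ", server ABI " server_abi
            module_abi = ""
        }
    '
}
```

Typical use would be xorg_abi_mismatches < /var/log/Xorg.0.log, assuming the log is in its usual location on this system.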

                 

                I've uninstalled amdgpu-pro and the AMD-APP-SDK, then re-installed them - just in case my previous install of the normal (NOT PRO) amdgpu drivers had some lingering effect.

                No change:

                * clinfo still aborts unless you manually provide your own LD_LIBRARY_PATH

                * clinfo still fails to find the RX 580

                 

                Worse:

                * X11 no longer works. I get:

                root  1209  0.0  2.1 378828 43480 ?        Sl   12:12   0:00 zenity --warning --text <big><b>The system is running in low-graphics mode</b></big>\n\nYour screen, graphics card, and input device settings\ncould not be detected correctly.  You will need t

                 

                It doesn't matter if I select the PCI Express video card as the primary video card in the BIOS. The Intel graphics chip is no longer used.

                In the failsafe X11 logs I see another ABI major version mismatch:

                 

                [14.662] (II) LoadModule: "glx"
                [14.662] (II) Loading /opt/amdgpu-pro/lib/xorg/modules/extensions/libglx.so
                [14.663] (II) Module glx: vendor="X.Org Foundation"
                [14.663]    compiled for 1.19.0, module version = 1.0.0
                [14.663]    ABI class: X.Org Server Extension, version 10.0
                [14.663] (EE) module ABI major version (10) doesn't match the server's version (9)
                [14.663] (II) UnloadModule: "glx"
                [14.663] (II) Unloading glx
                [14.664] (EE) Failed to load module "glx" (module requirement mismatch, 0)

                 

                fbdev also seems to fail; this (EE) error is printed dozens of times:

                [14.752] (EE) FBDEV(0): FBIOPUTCMAP: Invalid argument

                 

                So maybe that's why X11 no longer works at all, with either AMD or Intel graphics.

                 

                Again, I'm using the latest dist-upgrade of Ubuntu 16.04.

                 

                I'm happy to try anything. I'll reformat, re-install, even purchase another motherboard.

                 

                Thanks!

                • Re: Fixes for AMD installer (Ubuntu 16.04 x86_64 dec 30,2017)
                  markswanson

                  I'm curious if the AMD driver needs to be running under X11 for OpenCL to work?

                   

                  I found an OpenCL howto:

                  OpenCLHowTo - Andreas Klöckner's wiki

                   

                  and it says "the X server has to be running, and the OpenCL code has to have access to it.".

                   

                  Can anyone here confirm this? It seems weird to me (as an OpenCL newb) that an X server is in the picture at all...

                   

                  Fwiw I compiled the OpenCL howto demo and it also couldn't find the RX 580 as an OpenCL device.

                  (again, maybe the X11 ABI major version mismatch is the root culprit?)

                  • Re: Fixes for AMD installer (Ubuntu 16.04 x86_64 dec 30,2017)
                    markswanson

                    Just an FYI - the AMD Vulkan demos build and run fine. A snippet from the vulkaninfo example I compiled shows that it finds the RX 580:

                     

                    VK_LAYER_LUNARG_core_validation (LunarG Validation Layer) Vulkan version 1.0.65, layer version 1

                        Layer Extensions    count = 1

                            VK_EXT_debug_report                 : extension revision  6

                        Devices     count = 1

                            GPU id       : 0 (Radeon RX 580 Series)

                            Layer-Device Extensions count = 1

                                VK_EXT_debug_marker                 : extension revision  4

                     

                    I suspect my main OpenCL problem stems from an Ubuntu update for X11 and the AMD drivers haven't caught up yet (ABI major version bump).

                    Too bad no Monero miners work with Vulkan :-)

                    • Re: Fixes for AMD installer (Ubuntu 16.04 x86_64 dec 30,2017)
                      markswanson

                      Another FYI - I was able to use a Mesa OpenCL ICD package to move forward (or move sideways, I can't tell)

                       

                      sudo apt install mesa-opencl-icd

                       

                      clinfo now finds the RX 580.

                      Also, some small C code (tools-master print-devices) shows:

                      platform 1: vendor 'Mesa'

                        device 0: 'AMD POLARIS10 (DRM 3.20.0 / 4.4.0-104-generic, LLVM 4.0.0)'

                       

                      So... the problem now is that only OpenCL 1.1 is supported, and I seem to need OpenCL 1.2 for xmr-stak.

                      Also, xmr-stak is complaining that the platform is not 'AMD' (it's Mesa). I'm not sure if that's why it's still crashing; I suspect the missing OpenCL 1.2 support.
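The version gap can be checked mechanically. The OpenCL specification fixes the version string format as "OpenCL", a space, major.minor, a space, then vendor-specific text, so the number can be parsed in plain shell. A minimal sketch, assuming a POSIX shell; opencl_at_least is a hypothetical helper name, and the "OpenCL 1.1 Mesa" string in the note below is a made-up example rather than output captured from this system.

```shell
# Succeed if an OpenCL version string such as "OpenCL 1.2 AMD-APP (1800.8)"
# reports at least the wanted MAJOR.MINOR version.
# Usage: opencl_at_least "VERSION STRING" MAJOR.MINOR
opencl_at_least() {
    have=${1#OpenCL }        # strip the fixed prefix: "1.2 AMD-APP (1800.8)"
    have=${have%% *}         # keep only the version number: "1.2"
    want=$2
    have_major=${have%%.*}; have_minor=${have#*.}
    want_major=${want%%.*}; want_minor=${want#*.}
    [ "$have_major" -gt "$want_major" ] ||
        { [ "$have_major" -eq "$want_major" ] && [ "$have_minor" -ge "$want_minor" ]; }
}
```

For example, opencl_at_least "OpenCL 1.1 Mesa" 1.2 returns a non-zero status, while the "OpenCL 1.2 AMD-APP (1800.8)" string from the earlier clinfo output passes.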

                      • Re: Fixes for AMD installer (Ubuntu 16.04 x86_64 dec 30,2017)
                        markswanson

                        Ok, my 'fixes' aren't required at all. I documented the steps required to get things working here:

                         

                          https://community.amd.com/message/2840821#comment-2840821

                        (my last post on Jan 3, 2018)

                         

                        It's all working perfectly thanks to help from Dipak.