PC Drivers & Software

makeitwork
Adept II

Radeon VII NOT recognized in clinfo OpenCL, cannot run compute jobs, but RX 580 is - Linux Ubuntu amdgpu-pro driver

Alright, so I've been running two RX 580 8GB GPUs for over a year now using Ubuntu 18 and the amdgpu-pro series driver with OpenCL support, and had no major problems aside from some driver/kernel compile issues a while back (which were fixed by the updated Ubuntu HWE support in amdgpu-pro 20.10).

Now I've upgraded to a Radeon VII (Vega 20 firmware) and have run into an issue.  It may be related to the fact that I didn't uninstall the proprietary driver before installing the card.  The graphics are absolutely wonderful and work out of the box, but my Radeon VII is not detected for OpenCL/compute jobs.  That's the primary reason I bought the card, and I've been unable to find any real answers for this problem using a variety of search engines and search terms.  One of the RX 580 GPUs that I left installed is still detected!  So I can use my CPU and my old GPU for OpenCL-enabled programs like hashcat, boinc, blender, etc., but the new Radeon VII isn't detected by these programs at all.  Everywhere I look online, anyone who hasn't been able to use OpenCL on a Radeon VII is told to install a few firmware files and the proprietary driver, which is what I've done.

Things I've tried:

Uninstalling the driver, reboot

Added vega20 firmware files to /lib/firmware/amdgpu/ - then update-initramfs -k all -u and rebooted

Reinstalled the driver, rebooted, still no success - graphics work but not showing up in clinfo or hashcat -b to benchmark

Tried finding any setting in the BIOS that would make any difference, no success, running the same settings as always which are defaults with some fan tweaks to increase fan speeds on the case fans

Command I'm using to install the driver:  ./amdgpu-pro-install --opencl=pal,legacy

Things I've considered, but *not* tried:
Re-seating the card (shouldn't make a difference, should it?)

Diagnostic output:

uname -r

5.3.0-46-generic

sudo lshw -c video

  *-display                 
       description: VGA compatible controller
       product: Vega 20
       vendor: Advanced Micro Devices, Inc. [AMD/ATI]
       physical id: 0
       bus info: pci@0000:0c:00.0
       version: c1
       width: 64 bits
       clock: 33MHz
       capabilities: pm pciexpress msi vga_controller bus_master cap_list rom
       configuration: driver=amdgpu latency=0
       resources: irq:89 memory:e0000000-efffffff memory:f0000000-f01fffff ioport:e000(size=256) memory:fcb00000-fcb7ffff memory:c0000-dffff
  *-display
       description: VGA compatible controller
       product: Ellesmere [Radeon RX 470/480/570/570X/580/580X]
       vendor: Advanced Micro Devices, Inc. [AMD/ATI]
       physical id: 0
       bus info: pci@0000:0d:00.0
       version: e7
       width: 64 bits
       clock: 33MHz
       capabilities: pm pciexpress msi vga_controller bus_master cap_list rom
       configuration: driver=amdgpu latency=0
       resources: irq:91 memory:c0000000-cfffffff memory:d0000000-d01fffff ioport:d000(size=256) memory:fce00000-fce3ffff memory:fce40000-fce5ffff

sudo lspci | grep VGA

0c:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Vega 20 (rev c1)
0d:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere [Radeon RX 470/480/570/570X/580/580X] (rev e7)

sudo clinfo

Number of platforms                               2
  Platform Name                                   AMD Accelerated Parallel Processing
  Platform Vendor                                 Advanced Micro Devices, Inc.
  Platform Version                                OpenCL 2.1 AMD-APP (3075.10)
  Platform Profile                                FULL_PROFILE
  Platform Extensions                             cl_khr_icd cl_amd_event_callback cl_amd_offline_devices
  Platform Host timer resolution                  1ns
  Platform Extensions function suffix             AMD

  Platform Name                                   Portable Computing Language
  Platform Vendor                                 The pocl project
  Platform Version                                OpenCL 1.2 pocl 1.1 None+Asserts, LLVM 6.0.0, SPIR, SLEEF, DISTRO, POCL_DEBUG
  Platform Profile                                FULL_PROFILE
  Platform Extensions                             cl_khr_icd
  Platform Extensions function suffix             POCL

  Platform Name                                   AMD Accelerated Parallel Processing
Number of devices                                 1
  Device Name                                     Ellesmere
  Device Vendor                                   Advanced Micro Devices, Inc.
  Device Vendor ID                                0x1002
  Device Version                                  OpenCL 1.2 AMD-APP (3075.10)
  Driver Version                                  3075.10
  Device OpenCL C Version                         OpenCL C 1.2
  Device Type                                     GPU
  Device Board Name (AMD)                         Radeon RX 580 Series
  Device Topology (AMD)                           PCI-E, 0d:00.0
  Device Profile                                  FULL_PROFILE
  Device Available                                Yes
  Compiler Available                              Yes
  Linker Available                                Yes
  Max compute units                               36
  SIMD per compute unit (AMD)                     4
  SIMD width (AMD)                                16
  SIMD instruction width (AMD)                    1
  Max clock frequency                             1360MHz
  Graphics IP (AMD)                               8.0
  Device Partition                                (core)
    Max number of sub-devices                     36
    Supported partition types                     None
  Max work item dimensions                        3
  Max work item sizes                             1024x1024x1024
  Max work group size                             256
  Preferred work group size (AMD)                 256
  Max work group size (AMD)                       1024
  Preferred work group size multiple              64
  Wavefront width (AMD)                           64
  Preferred / native vector sizes                 
    char                                                 4 / 4       
    short                                                2 / 2       
    int                                                  1 / 1       
    long                                                 1 / 1       
    half                                                 1 / 1        (cl_khr_fp16)
    float                                                1 / 1       
    double                                               1 / 1        (cl_khr_fp64)
  Half-precision Floating-point support           (cl_khr_fp16)
    Denormals                                     No
    Infinity and NANs                             No
    Round to nearest                              No
    Round to zero                                 No
    Round to infinity                             No
    IEEE754-2008 fused multiply-add               No
    Support is emulated in software               No
  Single-precision Floating-point support         (core)
    Denormals                                     No
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               No
    Correctly-rounded divide and sqrt operations  Yes
  Double-precision Floating-point support         (cl_khr_fp64)
    Denormals                                     Yes
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               No
  Address bits                                    64, Little-Endian
  Global memory size                              4432814080 (4.128GiB)
  Global free memory (AMD)                        4308312 (4.109GiB)
  Global memory channels (AMD)                    8
  Global memory banks per channel (AMD)           16
  Global memory bank width (AMD)                  256 bytes
  Error Correction support                        No
  Max memory allocation                           3551587123 (3.308GiB)
  Unified memory for Host and Device              No
  Minimum alignment for any data type             128 bytes
  Alignment of base address                       2048 bits (256 bytes)
  Global Memory cache type                        Read/Write
  Global Memory cache size                        16384 (16KiB)
  Global Memory cache line size                   64 bytes
  Image support                                   Yes
    Max number of samplers per kernel             16
    Max size for 1D images from buffer            134217728 pixels
    Max 1D or 2D image array size                 2048 images
    Base address alignment for 2D image buffers   256 bytes
    Pitch alignment for 2D image buffers          256 pixels
    Max 2D image size                             16384x16384 pixels
    Max 3D image size                             2048x2048x2048 pixels
    Max number of read image args                 128
    Max number of write image args                8
  Local memory type                               Local
  Local memory size                               32768 (32KiB)
  Local memory syze per CU (AMD)                  65536 (64KiB)
  Local memory banks (AMD)                        32
  Max number of constant args                     8
  Max constant buffer size                        3551587123 (3.308GiB)
  Preferred constant buffer size (AMD)            16384 (16KiB)
  Max size of kernel argument                     1024
  Queue properties                                
    Out-of-order execution                        No
    Profiling                                     Yes
  Prefer user sync for interop                    Yes
  Profiling timer resolution                      1ns
  Profiling timer offset since Epoch (AMD)        1587704465991140729ns (Thu Apr 23 23:01:05 2020)
  Execution capabilities                          
    Run OpenCL kernels                            Yes
    Run native kernels                            No
    Thread trace supported (AMD)                  Yes
    Number of async queues (AMD)                  2
    Max real-time compute queues (AMD)            0
    Max real-time compute units (AMD)             3187338544
    SPIR versions                                 1.2
  printf() buffer size                            4194304 (4MiB)
  Built-in kernels                                
  Device Extensions                               cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_fp16 cl_khr_gl_sharing cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_media_ops2 cl_amd_popcnt cl_khr_image2d_from_buffer cl_khr_spir cl_khr_gl_event

  Platform Name                                   Portable Computing Language
Number of devices                                 1
  Device Name                                     pthread-AMD Ryzen 7 2700X Eight-Core Processor
  Device Vendor                                   AuthenticAMD
  Device Vendor ID                                0x1022
  Device Version                                  OpenCL 1.2 pocl HSTR: pthread-x86_64-pc-linux-gnu-znver1
  Driver Version                                  1.1
  Device OpenCL C Version                         OpenCL C 1.2 pocl
  Device Type                                     CPU
  Device Profile                                  FULL_PROFILE
  Device Available                                Yes
  Compiler Available                              Yes
  Linker Available                                Yes
  Max compute units                               16
  Max clock frequency                             3700MHz
  Device Partition                                (core)
    Max number of sub-devices                     16
    Supported partition types                     equally, by counts
  Max work item dimensions                        3
  Max work item sizes                             4096x4096x4096
  Max work group size                             4096
  Preferred work group size multiple              8
  Preferred / native vector sizes                 
    char                                                16 / 16      
    short                                               16 / 16      
    int                                                  8 / 8       
    long                                                 4 / 4       
    half                                                 0 / 0        (n/a)
    float                                                8 / 8       
    double                                               4 / 4        (cl_khr_fp64)
  Half-precision Floating-point support           (n/a)
  Single-precision Floating-point support         (core)
    Denormals                                     Yes
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               No
    Correctly-rounded divide and sqrt operations  Yes
  Double-precision Floating-point support         (cl_khr_fp64)
    Denormals                                     Yes
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               No
  Address bits                                    64, Little-Endian
  Global memory size                              65265414144 (60.78GiB)
  Error Correction support                        No
  Max memory allocation                           17179869184 (16GiB)
  Unified memory for Host and Device              Yes
  Minimum alignment for any data type             128 bytes
  Alignment of base address                       1024 bits (128 bytes)
  Global Memory cache type                        Read/Write
  Global Memory cache size                        8388608 (8MiB)
  Global Memory cache line size                   64 bytes
  Image support                                   Yes
    Max number of samplers per kernel             16
    Max size for 1D images from buffer            1073741824 pixels
    Max 1D or 2D image array size                 2048 images
    Max 2D image size                             32768x32768 pixels
    Max 3D image size                             2048x2048x2048 pixels
    Max number of read image args                 128
    Max number of write image args                128
  Local memory type                               Global
  Local memory size                               4194304 (4MiB)
  Max number of constant args                     8
  Max constant buffer size                        4194304 (4MiB)
  Max size of kernel argument                     1024
  Queue properties                                
    Out-of-order execution                        No
    Profiling                                     Yes
  Prefer user sync for interop                    Yes
  Profiling timer resolution                      1ns
  Execution capabilities                          
    Run OpenCL kernels                            Yes
    Run native kernels                            Yes
    SPIR versions                                 1.2
  printf() buffer size                            1048576 (1024KiB)
  Built-in kernels                                
  Device Extensions                               cl_khr_byte_addressable_store cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_3d_image_writes cl_khr_spir cl_khr_fp64 cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_fp64

NULL platform behavior
  clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...)  No platform
  clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...)   No platform
  clCreateContext(NULL, ...) [default]            No platform
  clCreateContext(NULL, ...) [other]              Success [AMD]
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_DEFAULT)  Success (1)
    Platform Name                                 AMD Accelerated Parallel Processing
    Device Name                                   Ellesmere
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU)  Success (1)
    Platform Name                                 AMD Accelerated Parallel Processing
    Device Name                                   Ellesmere
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL)  Success (1)
    Platform Name                                 AMD Accelerated Parallel Processing
    Device Name                                   Ellesmere
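
For anyone comparing their own output, here is a rough Python sketch of the cross-check done by eye above: take the PCI addresses that lspci reports for VGA controllers and see which of them never appear in a clinfo "Device Topology (AMD)" line. The sample strings are excerpts copied from the output in this post; the function names are made up for illustration, and the parsing assumes the AMD-specific topology line is present, which it will not be on other vendors' platforms.

```python
import re

# Excerpts copied from the lspci and clinfo output in this post.
LSPCI = """\
0c:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Vega 20 (rev c1)
0d:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere [Radeon RX 470/480/570/570X/580/580X] (rev e7)
"""

CLINFO = """\
  Device Name                                     Ellesmere
  Device Board Name (AMD)                         Radeon RX 580 Series
  Device Topology (AMD)                           PCI-E, 0d:00.0
  Device Name                                     pthread-AMD Ryzen 7 2700X Eight-Core Processor
"""

VGA_LINE = re.compile(
    r"^([0-9a-f]{2}:[0-9a-f]{2}\.[0-9a-f]) VGA compatible controller: (.+)$"
)


def pci_gpus(lspci_text):
    """Map PCI address -> description for every VGA controller line."""
    gpus = {}
    for line in lspci_text.splitlines():
        match = VGA_LINE.match(line)
        if match:
            gpus[match.group(1)] = match.group(2)
    return gpus


def opencl_pci_addresses(clinfo_text):
    """Collect PCI addresses from clinfo's 'Device Topology (AMD)' lines."""
    pattern = r"Device Topology \(AMD\)\s+PCI-E, ([0-9a-f:.]+)"
    return set(re.findall(pattern, clinfo_text))


def missing_from_opencl(lspci_text, clinfo_text):
    """Return the GPUs that lspci sees but that have no OpenCL device."""
    seen = opencl_pci_addresses(clinfo_text)
    return {
        addr: desc
        for addr, desc in pci_gpus(lspci_text).items()
        if addr not in seen
    }


if __name__ == "__main__":
    for addr, desc in missing_from_opencl(LSPCI, CLINFO).items():
        print(f"{addr} has no OpenCL device: {desc}")
```

With the excerpts above, the only address left over is 0c:00.0, the Vega 20 card, which matches what the full clinfo listing shows.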

Relevant output from
sudo journalctl | grep amd

kernel: Linux version 5.3.0-46-generic (buildd@lcy01-amd64-013) (gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04)) #38~18.04.1-Ubuntu SMP Tue Mar 31 04:17:56 UTC 2020 (Ubuntu 5.3.0-46.38~18.04.1-generic 5.3.18)
kernel: amd_uncore: AMD NB counters detected
kernel: amd_uncore: AMD LLC counters detected
kernel: perf/amd_iommu: Detected AMD IOMMU #0 (2 banks, 4 counters/bank).
kernel: amdkcl: loading out-of-tree module taints kernel.
kernel: amdkcl: loading out-of-tree module taints kernel.
kernel: amdkcl: module verification failed: signature and/or required key missing - tainting kernel
kernel: [drm] amdgpu kernel modesetting enabled.
kernel: [drm] amdgpu version: 5.4.7.20.10
kernel: amdgpu 0000:0c:00.0: remove_conflicting_pci_framebuffers: bar 0: 0xe0000000 -> 0xefffffff
kernel: amdgpu 0000:0c:00.0: remove_conflicting_pci_framebuffers: bar 2: 0xf0000000 -> 0xf01fffff
kernel: amdgpu 0000:0c:00.0: remove_conflicting_pci_framebuffers: bar 5: 0xfcb00000 -> 0xfcb7ffff
kernel: fb0: switching to amdgpudrmfb from VESA VGA
kernel: amdgpu 0000:0c:00.0: vgaarb: deactivate vga console
kernel: amdgpu 0000:0c:00.0: No more image in the PCI ROM
kernel: amdgpu 0000:0c:00.0: VRAM: 16368M 0x0000008000000000 - 0x00000083FEFFFFFF (16368M used)
kernel: amdgpu 0000:0c:00.0: GART: 512M 0x0000000000000000 - 0x000000001FFFFFFF
kernel: amdgpu 0000:0c:00.0: AGP: 267894784M 0x0000008400000000 - 0x0000FFFFFFFFFFFF
kernel: [drm] amdgpu: 16368M of VRAM memory ready
kernel: [drm] amdgpu: 16368M of GTT memory ready.
kernel: amdgpu: [powerplay] hwmgr_sw_init smu backed is vega20_smu
kernel: amdgpu 0000:0c:00.0: HDCP: hdcp ta ucode is not available
kernel: amdgpu 0000:0c:00.0: DTM: dtm ta ucode is not available
kernel: fbcon: amdgpudrmfb (fb0) is primary device
kernel: amdgpu 0000:0c:00.0: fb0: amdgpudrmfb frame buffer device
kernel: amdgpu 0000:0c:00.0: ring gfx uses VM inv eng 0 on hub 0
kernel: amdgpu 0000:0c:00.0: ring comp_1.0.0 uses VM inv eng 1 on hub 0
kernel: amdgpu 0000:0c:00.0: ring comp_1.1.0 uses VM inv eng 4 on hub 0
kernel: amdgpu 0000:0c:00.0: ring comp_1.2.0 uses VM inv eng 5 on hub 0
kernel: amdgpu 0000:0c:00.0: ring comp_1.3.0 uses VM inv eng 6 on hub 0
kernel: amdgpu 0000:0c:00.0: ring comp_1.0.1 uses VM inv eng 7 on hub 0
kernel: amdgpu 0000:0c:00.0: ring comp_1.1.1 uses VM inv eng 8 on hub 0
kernel: amdgpu 0000:0c:00.0: ring comp_1.2.1 uses VM inv eng 9 on hub 0
kernel: amdgpu 0000:0c:00.0: ring comp_1.3.1 uses VM inv eng 10 on hub 0
kernel: amdgpu 0000:0c:00.0: ring kiq_2.1.0 uses VM inv eng 11 on hub 0
kernel: amdgpu 0000:0c:00.0: ring sdma0 uses VM inv eng 0 on hub 1
kernel: amdgpu 0000:0c:00.0: ring page0 uses VM inv eng 1 on hub 1
kernel: amdgpu 0000:0c:00.0: ring sdma1 uses VM inv eng 4 on hub 1
kernel: amdgpu 0000:0c:00.0: ring page1 uses VM inv eng 5 on hub 1
kernel: amdgpu 0000:0c:00.0: ring uvd_0 uses VM inv eng 6 on hub 1
kernel: amdgpu 0000:0c:00.0: ring uvd_enc_0.0 uses VM inv eng 7 on hub 1
kernel: amdgpu 0000:0c:00.0: ring uvd_enc_0.1 uses VM inv eng 8 on hub 1
kernel: amdgpu 0000:0c:00.0: ring uvd_1 uses VM inv eng 9 on hub 1
kernel: amdgpu 0000:0c:00.0: ring uvd_enc_1.0 uses VM inv eng 10 on hub 1
kernel: amdgpu 0000:0c:00.0: ring uvd_enc_1.1 uses VM inv eng 11 on hub 1
kernel: amdgpu 0000:0c:00.0: ring vce0 uses VM inv eng 12 on hub 1
kernel: amdgpu 0000:0c:00.0: ring vce1 uses VM inv eng 13 on hub 1
kernel: amdgpu 0000:0c:00.0: ring vce2 uses VM inv eng 14 on hub 1
kernel: [drm] Initialized amdgpu 3.36.0 20150101 for 0000:0c:00.0 on minor 0
kernel: amdgpu 0000:0d:00.0: remove_conflicting_pci_framebuffers: bar 0: 0xc0000000 -> 0xcfffffff
kernel: amdgpu 0000:0d:00.0: remove_conflicting_pci_framebuffers: bar 2: 0xd0000000 -> 0xd01fffff
kernel: amdgpu 0000:0d:00.0: remove_conflicting_pci_framebuffers: bar 5: 0xfce00000 -> 0xfce3ffff
kernel: amdgpu 0000:0d:00.0: enabling device (0000 -> 0003)
kernel: amdgpu 0000:0d:00.0: VRAM: 8192M 0x000000F400000000 - 0x000000F5FFFFFFFF (8192M used)
kernel: amdgpu 0000:0d:00.0: GART: 256M 0x000000FF00000000 - 0x000000FF0FFFFFFF
kernel: [drm] amdgpu: 8192M of VRAM memory ready
kernel: [drm] amdgpu: 8192M of GTT memory ready.
kernel: amdgpu: [powerplay] hwmgr_sw_init smu backed is polaris10_smu
kernel: [drm] Initialized amdgpu 3.36.0 20150101 for 0000:0d:00.0 on minor 1
kernel: EDAC amd64: Node 0: DRAM ECC disabled.
kernel: EDAC amd64: ECC disabled in the BIOS or no ECC capability, module will not load.
kernel: EDAC amd64: Node 0: DRAM ECC disabled.
kernel: EDAC amd64: ECC disabled in the BIOS or no ECC capability, module will not load.
kernel: EDAC amd64: Node 0: DRAM ECC disabled.
kernel: EDAC amd64: ECC disabled in the BIOS or no ECC capability, module will not load.
kernel: EDAC amd64: Node 0: DRAM ECC disabled.
kernel: EDAC amd64: ECC disabled in the BIOS or no ECC capability, module will not load.
kernel: EDAC amd64: Node 0: DRAM ECC disabled.
kernel: EDAC amd64: ECC disabled in the BIOS or no ECC capability, module will not load.
kernel: EDAC amd64: Node 0: DRAM ECC disabled.
kernel: EDAC amd64: ECC disabled in the BIOS or no ECC capability, module will not load.
kernel: EDAC amd64: Node 0: DRAM ECC disabled.
kernel: EDAC amd64: ECC disabled in the BIOS or no ECC capability, module will not load.
kernel: EDAC amd64: Node 0: DRAM ECC disabled.
kernel: EDAC amd64: ECC disabled in the BIOS or no ECC capability, module will not load.
sensors[1279]: amdgpu-pci-0c00
sensors[1279]: amdgpu-pci-0d00
kernel: EDAC amd64: Node 0: DRAM ECC disabled.
kernel: EDAC amd64: ECC disabled in the BIOS or no ECC capability, module will not load.
kernel: EDAC amd64: Node 0: DRAM ECC disabled.
kernel: EDAC amd64: ECC disabled in the BIOS or no ECC capability, module will not load.
kernel: EDAC amd64: Node 0: DRAM ECC disabled.
kernel: EDAC amd64: ECC disabled in the BIOS or no ECC capability, module will not load.
kernel: EDAC amd64: Node 0: DRAM ECC disabled.
kernel: EDAC amd64: ECC disabled in the BIOS or no ECC capability, module will not load.
kernel: EDAC amd64: Node 0: DRAM ECC disabled.
kernel: EDAC amd64: ECC disabled in the BIOS or no ECC capability, module will not load.
kernel: EDAC amd64: Node 0: DRAM ECC disabled.
kernel: EDAC amd64: ECC disabled in the BIOS or no ECC capability, module will not load.
kernel: EDAC amd64: Node 0: DRAM ECC disabled.
kernel: EDAC amd64: ECC disabled in the BIOS or no ECC capability, module will not load.
kernel: EDAC amd64: Node 0: DRAM ECC disabled.
kernel: EDAC amd64: ECC disabled in the BIOS or no ECC capability, module will not load.
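
The log is long, so here is a small Python sketch that boils it down to one entry per GPU: the VRAM the kernel driver reported and the DRM minor it was initialized on. The sample lines are copied from the journal output above and the function name is just illustrative. It only looks at the two message formats shown here, so it is an assumption that other amdgpu versions word them the same way.

```python
import re

# A few lines copied from the journalctl output in this post.
JOURNAL = """\
kernel: [drm] amdgpu version: 5.4.7.20.10
kernel: amdgpu 0000:0c:00.0: VRAM: 16368M 0x0000008000000000 - 0x00000083FEFFFFFF (16368M used)
kernel: amdgpu: [powerplay] hwmgr_sw_init smu backed is vega20_smu
kernel: [drm] Initialized amdgpu 3.36.0 20150101 for 0000:0c:00.0 on minor 0
kernel: amdgpu 0000:0d:00.0: VRAM: 8192M 0x000000F400000000 - 0x000000F5FFFFFFFF (8192M used)
kernel: [drm] Initialized amdgpu 3.36.0 20150101 for 0000:0d:00.0 on minor 1
kernel: EDAC amd64: Node 0: DRAM ECC disabled.
"""

VRAM_LINE = re.compile(r"amdgpu (\S+): VRAM: (\d+)M")
INIT_LINE = re.compile(r"Initialized amdgpu \S+ \d+ for (\S+) on minor (\d+)")


def summarize_amdgpu(journal_text):
    """Return {pci_address: {"vram_mib": int, "minor": int}} per GPU."""
    devices = {}
    for line in journal_text.splitlines():
        vram = VRAM_LINE.search(line)
        if vram:
            devices.setdefault(vram.group(1), {})["vram_mib"] = int(vram.group(2))
        init = INIT_LINE.search(line)
        if init:
            devices.setdefault(init.group(1), {})["minor"] = int(init.group(2))
    return devices


if __name__ == "__main__":
    for addr, info in summarize_amdgpu(JOURNAL).items():
        print(addr, info)
```

Both cards get a VRAM line and an "Initialized amdgpu" line, so the kernel side appears to bring up the Radeon VII normally; that suggests the gap is in the userspace OpenCL stack rather than in the kernel module.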

Please help!  Lol, not sure why this is happening.

18 Replies

I'm really not sure whether this 2019 GitHub thread is of any use in your case, but the user there needed to install ROCm to get OpenCL to work on his Radeon VII GPU card: ROCm installation flummoxed -Radeon VII, Ubuntu 18.04 - cancel · Issue #860 · RadeonOpenCompute/ROCm... 

You might want to post your question here: Newcomers Start Here‌ so that you can be "Whitelisted" to post at AMD OpenCL Forum: OpenCL 

Just posted my request to be whitelisted, and I'll try to find some installation instructions for ROCm on Ubuntu 18.04.4 HWE.

EDIT: https://rocm-documentation.readthedocs.io/en/latest/Installation_Guide/Installation-Guide.html#ubunt... 

Update: I've removed the amdgpu-pro driver using the amdgpu-pro-uninstall script, rebooted, and then attempted to install ROCm.  Unfortunately, it has a version mismatch with gcc-7, so I'll post about that on the OpenCL forum, pending whitelist.

The following package is a dependency of hcc, which is required by rocm-dev, which is part of rocm-dkms:

$ sudo apt install gcc-7-multilib
Reading package lists... Done
Building dependency tree       
Reading state information... Done
Some packages could not be installed. This may mean that you have
requested an impossible situation or if you are using the unstable
distribution that some required packages have not yet been created
or been moved out of Incoming.
The following information may help to resolve the situation:

The following packages have unmet dependencies:
 gcc-7-multilib : Depends: gcc-7-base (= 7.4.0-1ubuntu1~18.04) but 7.5.0-3ubuntu1~18.04 is to be installed
                  Depends: gcc-7 (= 7.4.0-1ubuntu1~18.04) but 7.5.0-3ubuntu1~18.04 is to be installed
                  Depends: lib32gcc-7-dev (= 7.4.0-1ubuntu1~18.04) but it is not going to be installed
                  Depends: libx32gcc-7-dev (= 7.4.0-1ubuntu1~18.04) but it is not going to be installed
E: Unable to correct problems, you have held broken packages.
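
To make the mismatch easier to see, here is a short Python sketch that pulls the required and the to-be-installed versions out of the "unmet dependencies" block. The sample text is copied from the apt output above; the regular expression and function name are illustrative and only cover the exact "(= version) but X is to be installed" wording shown here.

```python
import re

# Copied from the apt output in this post.
APT_OUTPUT = """\
The following packages have unmet dependencies:
 gcc-7-multilib : Depends: gcc-7-base (= 7.4.0-1ubuntu1~18.04) but 7.5.0-3ubuntu1~18.04 is to be installed
                  Depends: gcc-7 (= 7.4.0-1ubuntu1~18.04) but 7.5.0-3ubuntu1~18.04 is to be installed
                  Depends: lib32gcc-7-dev (= 7.4.0-1ubuntu1~18.04) but it is not going to be installed
                  Depends: libx32gcc-7-dev (= 7.4.0-1ubuntu1~18.04) but it is not going to be installed
"""

# Matches only the lines where apt names a conflicting candidate version.
CONFLICT = re.compile(r"Depends: (\S+) \(= ([^)]+)\) but (\S+) is to be installed")


def version_conflicts(apt_text):
    """Return (package, required_version, candidate_version) tuples."""
    return [match.groups() for match in CONFLICT.finditer(apt_text)]


if __name__ == "__main__":
    for package, required, candidate in version_conflicts(APT_OUTPUT):
        print(f"{package}: needs exactly {required}, apt would install {candidate}")
```

The gcc-7-multilib on offer wants exactly 7.4.0 of gcc-7 and gcc-7-base while apt's candidate for both is 7.5.0. A pattern like that usually means the two packages are coming from package indexes that are out of step with each other, although this output alone does not prove that.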

To be "Whitelisted" you must first describe the problem you are having in detail at Newcomers Start Here. Just copy the post you made here, paste it into the thread you opened, and change the title to the same one you have here. Then the Moderators decide which DevGURU Forum it belongs in and whether to whitelist you.

So post the same problem you posted here with all your other replies and see if the Moderators will give you access to AMD OpenCL Forum if they feel it is applicable there or here at this Forum.

The way you posted your question will never get any attention from the Moderators since they don't know the issue you are having.

packed

Updated, thanks!

EDIT:  As a side note, installing the headless version of the driver still only allowed me to use the RX 580 and not the Radeon VII for OpenCL.  I always completely reboot after installing/uninstalling the driver, and it is currently completely removed - I'm using the open source driver by default for graphics at the moment with no OpenCL/proprietary drivers installed.

Request to get whitelisted -> https://community.amd.com/thread/252105

makeitwork
Adept II

UPDATE

I've filed a bug report with Ubuntu on their gcc-7 package, since gcc-7-multilib is failing to install on Ubuntu 18.04 HWE.  This is a requirement for rocm-dkms under rocm-dev and hcc and does not appear to be the fault of the rocm maintainers at all.  Attempting to install gcc-7-multilib results in a version mismatch between gcc 7.4 and 7.5.

So, I'm still having the problem.  The fix is (hopefully) installing rocm-dkms which requires this bug to be fixed on Ubuntu's side of things.  Otherwise I'll have to wait for an updated amdgpu-pro driver which has OpenCL support for the Radeon VII.

Link to the bug report: https://bugs.launchpad.net/ubuntu/+source/gcc-7/+bug/1875224


UPDATE

The rocm-dkms problem was due to a bad apt mirror, and rocm-dkms is now installed.  Both cards now show up in ROCm's clinfo.  However, no programs that actually use OpenCL work now.  An example with hashcat:

$ hashcat -b
hashcat (v5.1.0-1426-gb02fe8e0) starting in benchmark mode...

Benchmarking uses hand-optimized kernel code by default.
You can use it in your cracking session by setting the -O option.
Note: Using optimized kernel code limits the maximum supported password length.
To disable the optimized kernel code in benchmark mode, use the -w option.

OpenCL API (OpenCL 2.1 AMD-APP (3098.0)) - Platform #1 [Advanced Micro Devices, Inc.]
=====================================================================================
* Device #1: gfx906+sram-ecc, 13912/16368 MB allocatable, 60MCU
* Device #2: gfx803, 6963/8192 MB allocatable, 36MCU

Benchmark relevant options:
===========================
* --optimized-kernel-enable

Hashmode: 0 - MD5

clBuildProgram(): CL_BUILD_PROGRAM_FAILURE

Started: Sun Apr 26 17:23:03 2020
Stopped: Sun Apr 26 17:23:05 2020

And with boinc:

$ boinc
26-Apr-2020 17:23:29 [---] cc_config.xml not found - using defaults
26-Apr-2020 17:23:29 [---] Starting BOINC client version 7.9.3 for x86_64-pc-linux-gnu
26-Apr-2020 17:23:29 [---] log flags: file_xfer, sched_ops, task
26-Apr-2020 17:23:29 [---] Libraries: libcurl/7.58.0 OpenSSL/1.1.1 zlib/1.2.11 libidn2/2.0.4 libpsl/0.19.1 (+libidn2/2.0.4) nghttp2/1.30.0 librtmp/2.3
26-Apr-2020 17:23:29 [---] Data directory: /home/user
execv: No such file or directory
26-Apr-2020 17:23:29 [---] GPU detection failed. error code 512
26-Apr-2020 17:23:29 [---] No usable GPUs found

...

And ethdcrminer64 - a GPU mining program called Claymore's:

...

AMD Cards available: 2
GPU #0: gfx906+sram-ecc (Vega 20), 16368 MB available, 60 compute units (pci bus 12:0:0)
GPU #0 recognized as Vega
GPU #1: gfx803 (Ellesmere [Radeon RX 470/480/570/570X/580/580X]), 8192 MB available, 36 compute units (pci bus 13:0:0)
POOL/SOLO version
AMD ADL library not found.
Cannot build OpenCL program for GPU 0
Cannot build OpenCL program for GPU 1

...
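
Since both cards are listed but every program fails when it tries to build kernels, one thing that may be worth ruling out after switching between amdgpu-pro and ROCm is a leftover or duplicate ICD registration. On Linux the OpenCL ICD loader normally reads the *.icd files in /etc/OpenCL/vendors, each of which names one vendor library. The Python sketch below just lists those files and flags absolute paths that no longer exist; the function names are made up for illustration, and whether stale entries have anything to do with the build failures above is purely an assumption.

```python
from pathlib import Path


def registered_icds(vendors_dir="/etc/OpenCL/vendors"):
    """Map each *.icd file name to the library name or path it contains."""
    root = Path(vendors_dir)
    if not root.is_dir():
        return {}
    icds = {}
    for icd_file in sorted(root.glob("*.icd")):
        text = icd_file.read_text(errors="replace")
        entries = [line.strip() for line in text.splitlines() if line.strip()]
        # An .icd file is expected to hold a single library name or path.
        icds[icd_file.name] = entries[0] if entries else ""
    return icds


def missing_absolute_libraries(icds):
    """Names of .icd files whose absolute library path does not exist.

    Bare names such as "libfoo.so" are skipped, because the dynamic
    loader resolves them through its own search path.
    """
    return [
        name
        for name, lib in icds.items()
        if lib.startswith("/") and not Path(lib).exists()
    ]


if __name__ == "__main__":
    found = registered_icds()
    for name, lib in found.items():
        print(f"{name}: {lib}")
    for name in missing_absolute_libraries(found):
        print(f"warning: {name} points at a library that is not on disk")
```

If two entries resolve to different AMD runtimes, or one points at a library that was removed with the old driver, that would at least be something concrete to mention in a bug report.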


From GITHUB: GitHub - RadeonOpenCompute/ROCm: ROCm - Open Source Platform for HPC and Ultrascale GPU Computing 

Supported Operating Systems

The ROCm v3.3.x platform is designed to support the following operating systems:

  • Ubuntu 16.04.6 (Kernel 4.15) and 18.04.4 (Kernel 5.3)

  • CentOS v7.7 (Using devtoolset-7 runtime support)

  • RHEL v7.7 (Using devtoolset-7 runtime support)

  • SLES 15 SP1

https://github.com/RadeonOpenCompute/ROCm#important-rocm-links

Important ROCm Links

Access the following links for more information on:

Note: These instructions reference the rocm/pytorch:rocm3.0_ubuntu16.04_py2.7_pytorch image. However, you can substitute the Ubuntu 18.04 image listed at https://hub.docker.com/r/rocm/pytorch/tags


Here are the OpenCL libraries for ROCm: GitHub - RadeonOpenCompute/ROCm-OpenCL-Runtime at roc-3.3.0  and how to install them.

Hopefully your thread will be moved if the OpenCL Moderators believe it is applicable, most likely on Monday.

Just confirming that I'm running Ubuntu 18.04.4 HWE and that my kernel is 5.3.0-46-generic.

I got everything in the ROCm-OpenCL-Runtime GitHub repo working up to the make command.  It fails about halfway through.  At first this was because it couldn't find any OpenCL headers (#include <CL/any-file-name.h> wasn't working).  I tried to use the latest OpenCL headers referenced in the ocl-icd GitHub repo, but they have compilation errors.  I'll try to reference Ubuntu's OpenCL headers package when I get more time to try and get this working.
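
A quick way to see which header sets the compiler could be picking up is to check a few include directories for CL/cl.h. This Python sketch does only that; the directory list is an assumption (the usual system locations plus a ROCm path that may not exist on every install), and the function name is illustrative.

```python
from pathlib import Path

# Assumed candidate locations; adjust for the actual build setup.
DEFAULT_DIRS = (
    "/usr/include",
    "/usr/local/include",
    "/opt/rocm/opencl/include",
)


def dirs_with_cl_header(include_dirs=DEFAULT_DIRS, header="CL/cl.h"):
    """Return the include directories that actually contain the header."""
    return [d for d in include_dirs if (Path(d) / header).is_file()]


if __name__ == "__main__":
    hits = dirs_with_cl_header()
    if hits:
        for directory in hits:
            print(f"found CL/cl.h under {directory}")
    else:
        print("no CL/cl.h found in the checked directories")
```

If more than one directory shows up, the build may be mixing header versions, which could explain errors that look like problems in the headers themselves.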

makeitwork
Adept II

Just to update, I still can't get the ROCm runtime to compile.  I'm aware that I should post that problem to their GitHub project, and I will when I get time (it's been a busy week).  Any work toward getting the Radeon VII OpenCL component working on the amdgpu-pro driver would be tremendously appreciated.

Further replies should probably go to the OpenCL forum, as requested, but I'm subscribed to updates in both places in case a solution is found.


In that case, I suggest you install the amdgpu-pro driver again so that users in the OpenCL forum can help you with it.

Otherwise, for ROCm, you will need to post on GitHub.


...The entire point of this thread has been that Radeon VII is not recognized by the amdgpu-pro driver as an OpenCL device...

...I followed your recommendation to try and get ROCm to work...  I'll continue to do that on the ROCm github when I get time...

I can remove ROCm and install amdgpu-pro at any time and that's what I'd prefer to do.


I realize that was your original problem with the Radeon VII and amdgpu-pro driver.

Hopefully ROCm will finally get OpenCL to work on your GPU card.

But if that turns out to be a dead end, you can always install the amdgpu-pro driver again and ask the users on the OpenCL forum whether anyone has been able to get the Radeon VII working with OpenCL.

Anyway, I have nothing else to offer as far as suggestions go.

Hopefully someone at Github or OpenCL forum will eventually solve your problem if you don't find out the answer yourself.

hwh57
Journeyman III

Did you ever get anywhere with this issue? I have the same problem on Ubuntu 20.04 using an RX 570 and an RX 5700 XT in the same system. Individually they work fine, but never both at once; the driver favours the 570 if both are connected.

This is with the newly released (as of July 2020) driver from here: https://www.amd.com/en/support/kb/release-notes/rn-amdgpu-unified-linux-20-20

For what it's worth the same hardware booting into Windows 10 works fine, i.e. both GPUs are detected and run compute jobs.

Installed with: 

$ ./amdgpu-pro-install -y --opencl=pal,legacy --no-32 --headless

$ uname -r
5.4.0-40-generic

$ cat /etc/os-release
NAME="Ubuntu"
VERSION="20.04 LTS (Focal Fossa)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 20.04 LTS"
VERSION_ID="20.04"

$ sudo lshw -c video
*-display
description: VGA compatible controller
product: Ellesmere [Radeon RX 470/480/570/570X/580/580X/590]
vendor: Advanced Micro Devices, Inc. [AMD/ATI]
physical id: 0
bus info: pci@0000:03:00.0
version: ef
width: 64 bits
clock: 33MHz
capabilities: pm pciexpress msi vga_controller bus_master cap_list rom
configuration: driver=amdgpu latency=0
resources: irq:34 memory:b0000000-bfffffff memory:cfc00000-cfdfffff ioport:c000(size=256) memory:fbcc0000-fbcfffff memory:fbca0000-fbcbffff
*-display
description: VGA compatible controller
product: Navi 10 [Radeon RX 5600 OEM/5600 XT / 5700/5700 XT]
vendor: Advanced Micro Devices, Inc. [AMD/ATI]
physical id: 0
bus info: pci@0000:07:00.0
version: c1
width: 64 bits
clock: 33MHz
capabilities: pm pciexpress msi vga_controller bus_master cap_list rom
configuration: driver=amdgpu latency=0
resources: irq:35 memory:d0000000-dfffffff memory:cfe00000-cfffffff ioport:e000(size=256) memory:fbf80000-fbffffff memory:fbf60000-fbf7ffff
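One way to make the mismatch visible is to compare the number of GPUs the kernel reports with the number of devices OpenCL exposes. The sketch below is only an illustration: the function names are hypothetical, and the second one assumes that each device line printed by `clinfo -l` contains the text "Device #".

```shell
#!/bin/sh
# Hypothetical helpers that read command output from stdin.

# Count "*-display" entries in `lshw -c video` output.
count_lshw_gpus() {
    grep -c '^ *\*-display' || true   # grep exits 1 on zero matches; still prints 0
}

# Count device lines in `clinfo -l` output
# (assumption: each device line contains "Device #").
count_cl_devices() {
    grep -c 'Device #' || true
}

# Usage on a live system:
#   gpus=$(sudo lshw -c video | count_lshw_gpus)
#   cldevs=$(clinfo -l | count_cl_devices)
#   echo "lshw sees $gpus GPU(s); OpenCL exposes $cldevs device(s)"
```

On the system above, lshw lists two display controllers, so a lower OpenCL device count would confirm that one card is being skipped by the OpenCL runtime rather than by the kernel driver. Note that a CPU OpenCL platform, if installed, would also be counted as a device.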

I ended up opening another thread for this issue in the OpenCL forum, which did not move any closer to a resolution.  There were some suggestions made.  I have not had time since then to swap out the cards individually on the hardware level, but I suspect that your problem and mine are identical.  I don't think there is a fix for this at this time.

hwh57
Journeyman III

Thanks for the update. Yes, the issue sounds very similar. Judging by the many threads in this and other forums, it seems to be a common problem:

OpenCL PAL & Legacy platforms under Ubuntu

Unable to have two GPUs (RX 5700 XT & RX 580) recognized by Blender simultaneously - User Feedback -... 

The latter thread is the most promising. I attempted to make the suggested fix/hack to the equivalent file in the latest version of the drivers (amdgpu-pro-20.20-1098277-ubuntu-20.04) but the path string looks different:

amdgpu-hack.png

So not sure what I am supposed to change.

I've raised a ticket with AMD support, for what it is worth, and will see if I get a response.

They have implemented the fix.  Update to the latest driver.