Drivers & Software

sowson
Adept II

2 x Radeon RX 6900 XT on the Ubuntu 20.04 GNU/Linux

Hello, I installed the latest drivers on Ubuntu 20.04 and got the following output: my GPUs are not recognised. Can you help me solve this?

~/Downloads/amdgpu-pro-20.45-1188099-ubuntu-20.04$ clinfo 

Number of platforms                               1

  Platform Name                                   AMD Accelerated Parallel Processing

  Platform Vendor                                 Advanced Micro Devices, Inc.

  Platform Version                                OpenCL 2.1 AMD-APP (3188.4)

  Platform Profile                                FULL_PROFILE

  Platform Extensions                             cl_khr_icd cl_amd_event_callback cl_amd_offline_devices 

  Platform Host timer resolution                  1ns

  Platform Extensions function suffix             AMD

 

  Platform Name                                   AMD Accelerated Parallel Processing

Number of devices                                 0

 

NULL platform behavior

  clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...)  No platform

  clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...)   No platform

  clCreateContext(NULL, ...) [default]            No platform

  clCreateContext(NULL, ...) [other]              No platform

  clCreateContextFromType(NULL, CL_DEVICE_TYPE_DEFAULT)  No devices found in platform

  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU)  No devices found in platform

  clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU)  No devices found in platform

  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR)  No devices found in platform

  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM)  No devices found in platform

  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL)  No devices found in platform
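
The output above reports one platform but zero devices, which usually suggests that the OpenCL runtime loads but does not see a GPU it can drive. A minimal sketch for checking this pattern automatically, assuming the `clinfo` text format shown above; the helper name `summarize_clinfo` and the `sample` string are made up for illustration:

```python
import re


def summarize_clinfo(text: str) -> dict:
    """Collect the platform count and every per-platform device count."""
    platforms = None
    device_counts = []
    for line in text.splitlines():
        m = re.match(r"\s*Number of platforms\s+(\d+)\s*$", line)
        if m:
            platforms = int(m.group(1))
            continue
        m = re.match(r"\s*Number of devices\s+(\d+)\s*$", line)
        if m:
            device_counts.append(int(m.group(1)))
    return {"platforms": platforms, "device_counts": device_counts}


# Shortened copy of the output posted above (illustrative only).
sample = """Number of platforms                               1
  Platform Name                                   AMD Accelerated Parallel Processing
Number of devices                                 0
"""

summary = summarize_clinfo(sample)
if summary["platforms"] and not any(summary["device_counts"]):
    print("OpenCL platform is installed, but it exposes no devices")
```

To run it against a live system, the text could come from `subprocess.run(["clinfo"], capture_output=True, text=True).stdout` instead of the `sample` string.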

 

9 Replies
sowson
Adept II

Screen Shot 2021-02-21 at 10.57.35.png

Here are my results after some fighting with the system :-). Now the only error is the following. Can you help? Thanks!

Memory access fault by GPU node-1 (Agent handle: 0x557045d4c970) on address 0x7f670dd88000. Reason: Page not present or supervisor privilege.

Aborted (core dumped)
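
When comparing fault reports from different machines, it can help to pull the fields out of the message. A rough sketch, assuming the message always has the shape quoted above; the regular expression and the helper name `parse_fault` are illustrative, not part of any AMD tool:

```python
import re

# Pattern written against the message quoted in this thread (an assumption,
# other driver versions may word it differently).
FAULT_RE = re.compile(
    r"Memory access fault by GPU node-(?P<node>\d+) "
    r"\(Agent handle: (?P<agent>0x[0-9a-fA-F]+)\) "
    r"on address (?P<address>0x[0-9a-fA-F]+)\. "
    r"Reason: (?P<reason>.+?)\.(?:\s|$)"
)


def parse_fault(line: str):
    """Return the fault fields as a dict, or None if the line does not match."""
    m = FAULT_RE.search(line)
    if m is None:
        return None
    return {
        "node": int(m["node"]),
        "agent": m["agent"],
        "address": int(m["address"], 16),
        "reason": m["reason"],
    }


msg = (
    "Memory access fault by GPU node-1 (Agent handle: 0x557045d4c970) "
    "on address 0x7f670dd88000. Reason: Page not present or supervisor privilege."
)
fault = parse_fault(msg)
print(fault["node"], hex(fault["address"]), fault["reason"])
# The faulting address here is a multiple of 4096, i.e. it sits on a page boundary.
print(fault["address"] % 4096 == 0)
```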


"Memory access fault by GPU node-1 (Agent handle: 0x557045d4c970) on address 0x7f670dd88000. Reason: Page not present or supervisor privilege. Aborted (core dumped)"

I have a similar error with an RX 5700 XT (listed below) when I try to bake textures in Blender. As far as I know, the error is driver-related and should be fixed in the near future (at least the one I see on my machine).

Memory access fault by GPU node-1 (Agent handle: 0x7f9faac09b00) on address 0x7f9e1782c000. Reason: Page not present or supervisor privilege. ./apps.sh: line 34: 84735 Aborted (core dumped) 

I have a few useful commands for this. It took me many hours to figure them out, so maybe they will save someone else some time. Thanks!

Screen Shot 2021-02-21 at 13.31.23.png

P.S. I wanted to avoid special characters getting broken, which is why the commands are in the image. Sorry for the inconvenience ;-). Go, Go, AMD! :D


Btw, I recently spent almost an entire night trying to get a working Azure VM with an Mi25 GPU. I was trying to set up ROCm, and guess what: on the same Ubuntu 20.04 (I updated from 18.04) it was not working at all with the 20.45 drivers for Linux and/or ROCm... AMD, come on! Please at least check your PRO hardware with your drivers. I had no time, so I switched to the "green company" for the moment... Please fix ROCm and AMDGPU as fast as possible... I invested in 4 x Radeon RX 6900 XT from eBay, so it is no fun so far. I am not a gamer and I do not like Windows, sorry to say... I do professional CNN / DNN / AI / ML work... please consider my frustration ;-/.

I read this topic by coincidence - it seems you have built a nice rig.

As I am in a related line of work (though this time round without the GPU work), I would also recommend going for TRX40 and its great PCIe 4.0 support, whose superb throughput has already saved me countless hours of transferring data to and from the CPU.

See the speeds with 980 PRO NVMe on RAID 0 here: https://pcpartpicker.com/b/dT7TwP

Also consider very low latency memory. Mine is far from perfect, an engineering trade-off really, until memory at over 4600 MHz with CAS 17 (or lower) becomes available at a sensible price. For the same reason I chose not to get a decent GPU yet.

 

 

Very interesting, but in my case, I forgot to show what is logged. Maybe that would be helpful for AMD staff to track it down :D.

Screen Shot 2021-02-21 at 23.22.07.png


It almost works! I found a solution thanks to dmesg and a web search! :D I wonder whether open discussions are tracked by AMD?

Screen Shot 2021-02-21 at 23.45.53.png

To solve it, I tried adding the following to /etc/default/grub:

GRUB_CMDLINE_LINUX_DEFAULT="quiet splash drm.rnodes=1 radeon.si_support=0 radeon.cik_support=0 amdgpu.si_support=1 amdgpu.cik_support=1 intel_iommu=off"

sudo update-grub2

sudo reboot
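
After the reboot it is worth confirming that the kernel actually booted with these parameters. A small sketch, assuming the option string above; the helper names `parse_cmdline` and `missing_params` and the `active` sample line are hypothetical:

```python
import shlex


def parse_cmdline(cmdline: str) -> dict:
    """Split a kernel command line into {name: value}; bare flags map to None."""
    params = {}
    for token in shlex.split(cmdline):
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params


def missing_params(wanted: str, active: str) -> list:
    """List the wanted parameters that are absent or have a different value."""
    want = parse_cmdline(wanted)
    have = parse_cmdline(active)
    return [k for k, v in want.items() if have.get(k, object()) != v]


wanted = (
    "quiet splash drm.rnodes=1 radeon.si_support=0 radeon.cik_support=0 "
    "amdgpu.si_support=1 amdgpu.cik_support=1 intel_iommu=off"
)
# Made-up example of an active command line, for illustration only.
active = "BOOT_IMAGE=/boot/vmlinuz root=UUID=abcd ro quiet splash amdgpu.si_support=1"

print(missing_params(wanted, active))
```

On a real machine, `active` would be read from `/proc/cmdline`, for example with `open("/proc/cmdline").read()`. As far as I know, the `si_support` and `cik_support` options relate to the older Southern Islands and Sea Islands GPU generations, so they may have no effect on an RX 6900 XT.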

thanks!

"GRUB_CMDLINE_LINUX_DEFAULT="quiet splash drm.rnodes=1 radeon.si_support=0 radeon.cik_support=0 amdgpu.si_support=1 amdgpu.cik_support=1 intel_iommu=off""

Did you get rid of the "Memory access fault by GPU node-1" with this solution?

Unfortunately NOT... yet, I hope, even if hope is not a strategy!
