I just installed the driver 17.40-492261 on openSUSE Leap 42.3, but I have a problem.
I get multiple messages like this:
[ 900.642188] BUG: sleeping function called from invalid context at ../mm/slab.c:2852
[ 900.642191] in_atomic(): 1, irqs_disabled(): 0, pid: 3100, name: firefox
[ 900.642204] CPU: 3 PID: 3100 Comm: firefox Tainted: G O 4.4.92-31-default #1
[ 900.642205] Hardware name: System manufacturer System Product Name/P5QL/EPU, BIOS 0408 07/20/2009
[ 900.642208] 0000000000000000 ffffffff8133a1b7 00000000014000c0 0000000000000030
[ 900.642209] ffffffff811f1226 0000000000000001 0000000000000001 0000000000000003
[ 900.642211] ffffffff810b9665 ffff8802092abc88 ffffffff014000c0 ffff88021ba42bc0
[ 900.642211] Call Trace:
[ 900.642225] [<ffffffff81019f29>] dump_trace+0x59/0x320
[ 900.642227] [<ffffffff8101a2ea>] show_stack_log_lvl+0xfa/0x180
[ 900.642229] [<ffffffff8101b091>] show_stack+0x21/0x40
[ 900.642232] [<ffffffff8133a1b7>] dump_stack+0x5c/0x85
[ 900.642235] [<ffffffff811f1226>] __kmalloc+0x146/0x4e0
[ 900.642244] [<ffffffffa046d5cc>] _kcl_reservation_object_copy_fences+0x3c/0x1b0 [amdkcl]
[ 900.642261] [<ffffffffa05fd16d>] ttm_bo_release+0x1bd/0x370 [amdttm]
[ 900.642357] [<ffffffffa07c02b5>] amdgpu_bo_unref+0x25/0x40 [amdgpu]
[ 900.642388] [<ffffffffa07d7594>] amdgpu_vm_free_levels+0x74/0xb0 [amdgpu]
[ 900.642419] [<ffffffffa07d75b4>] amdgpu_vm_free_levels+0x94/0xb0 [amdgpu]
[ 900.642448] [<ffffffffa07dbf40>] amdgpu_vm_fini+0x200/0x300 [amdgpu]
[ 900.642475] [<ffffffffa07b1925>] amdgpu_driver_postclose_kms+0x125/0x1f0 [amdgpu]
[ 900.642507] [<ffffffffa03b5e8c>] drm_release+0x24c/0x4e0 [drm]
[ 900.642511] [<ffffffff81211ae0>] __fput+0xe0/0x210
[ 900.642515] [<ffffffff8109dcd2>] task_work_run+0x72/0xa0
[ 900.642518] [<ffffffff8108335f>] do_exit+0x2ef/0xb60
[ 900.642521] [<ffffffff81083c49>] do_group_exit+0x39/0xa0
[ 900.642522] [<ffffffff81083cc0>] SyS_exit_group+0x10/0x10
[ 900.642526] [<ffffffff816314b2>] entry_SYSCALL_64_fastpath+0x16/0x71
[ 900.643940] DWARF2 unwinder stuck at entry_SYSCALL_64_fastpath+0x16/0x71
[ 900.643941] Leftover inexact backtrace:
I have attached the full dmesg output.
Does anybody else have the same problem?
I have modified the driver source code: I changed the second parameter of the kmalloc call in the function
_kcl_reservation_object_copy_fences (file kcl_reservation.c under usr/src/amdgpu-17.40-492261/amd/amdkcl)
from GFP_KERNEL to GFP_ATOMIC.
Now the issue is fixed for me.
I'm a newbie in kernel programming, but I think this was the reason:
GFP_KERNEL isn’t always the right allocation flag to use; sometimes kmalloc
is called from outside a process’s context. This type of call can happen, for instance,
in interrupt handlers, tasklets, and kernel timers. In this case, the current
process should not be put to sleep, and the driver should use a flag of GFP_ATOMIC
instead
(Quote from Linux Device Drivers, Third Edition [LWN.net] Chapter 8 Allocating Memory)
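As far as I understand it, the rule from the quote can be modeled in plain user-space C roughly like this. It is only a sketch: all the `model_*` and `MODEL_*` names are made up for illustration and are neither the kernel API nor the driver's actual code. The idea is that an allocation which is allowed to sleep gets reported when the caller is in atomic context (the log above shows in_atomic(): 1), while an allocation that must not sleep passes.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel's GFP flags (not the real values). */
enum model_gfp {
    MODEL_GFP_KERNEL, /* allocation is allowed to sleep */
    MODEL_GFP_ATOMIC  /* allocation must not sleep */
};

/* Models the in_atomic() state reported in the log above. */
static bool model_in_atomic = false;

/* Number of "sleeping in atomic context" reports so far. */
static int model_bug_count = 0;

/* Toy allocator: reports a bug when a sleeping allocation is
 * requested while the caller is in atomic context. */
static void *model_kmalloc(size_t size, enum model_gfp flags)
{
    if (flags == MODEL_GFP_KERNEL && model_in_atomic) {
        model_bug_count++;
        fprintf(stderr, "BUG: sleeping function called from invalid context\n");
    }
    return malloc(size); /* caller must still check for NULL */
}
```

With model_in_atomic set to true, model_kmalloc(16, MODEL_GFP_KERNEL) increments model_bug_count, while the same call with MODEL_GFP_ATOMIC does not. In the real kernel a GFP_ATOMIC allocation can fail more easily than a GFP_KERNEL one, so the result still has to be checked for NULL.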
I have attached a patch which fixes this issue.
You can apply the patch by following my description in the thread amdgpu-pro 16.60: building kernel module (amdgpu-pro-dkms) fails on openSUSE Leap 42.2.
Can anybody test this and give me some feedback?
Edited on 2017-12-06
Reason: Corrected two typos
Thank you for the patch. I'm planning to use the AMDGPU-PRO drivers for the first time with my laptop dGPU (AMD Radeon 520, GCN 1.0-1.1/SI). Do you know whether SI and CIK (GCN 1.0, 1.1, 1.2) support is better in amdgpu-pro than in the open source amdgpu driver? (I'm using the PADOKA ppa.)
Sorry for the late response.
I cannot answer this question. I currently have an RX580 (Polaris20/GCN 4). I had an R9 270X (GCN 1.0, I think) before, but at that time (2016) there was no support for this card in the amdgpu/amdgpu-pro driver.