System
Issue
Since upgrading to 4.13.0-38-generic a few days ago, I've had two driver lockups with no discernible trigger. The system is still responsive if I ssh in, but the GPU driver has crashed (OOPS attached). Both lockups happened within the last two days, and the previous kernel (I believe 4.13.0-37-generic) was stable.
I'm going to downgrade to the -37 kernel and confirm that it is stable, but I've opened this thread in case anyone else is having similar issues.
But why kernel version 4.13.0-38-generic? I have 4.15.0-041500-generic and it works perfectly.
That's the latest HWE kernel for Ubuntu 16.04, but I'll try the hwe-edge stack, which provides 4.15 kernels.
Actually, the amdgpu DKMS module (17.50-552542) won't even compile against the 4.15 headers:
CC
CC
cc1: some warnings being treated as errors
scripts/Makefile.build:324: recipe for target '/var/lib/dkms/amdgpu/17.50-552542/build/amd/amdkfd/kfd_process.o' failed
make[2]: *** [/var/lib/dkms/amdgpu/17.50-552542/build/amd/amdkfd/kfd_process.o] Error 1
make[2]: *** Waiting for unfinished jobs....
/var/lib/dkms/amdgpu/17.50-552542/build/amd/amdkfd/kfd_peerdirect.c: In function ‘free_callback’:
/var/lib/dkms/amdgpu/17.50-552542/build/amd/amdkfd/kfd_peerdirect.c:177:2: error: implicit declaration of function ‘ACCESS_ONCE’ [-Werror=implicit-function-declaration]
ACCESS_ONCE(mem_context->free_callback_called) = 1;
^
/var/lib/dkms/amdgpu/17.50-552542/build/amd/amdkfd/kfd_peerdirect.c:177:49: error: lvalue required as left operand of assignment
ACCESS_ONCE(mem_context->free_callback_called) = 1;
^
cc1: some warnings being treated as errors
scripts/Makefile.build:324: recipe for target '/var/lib/dkms/amdgpu/17.50-552542/build/amd/amdkfd/kfd_peerdirect.o' failed
make[2]: *** [/var/lib/dkms/amdgpu/17.50-552542/build/amd/amdkfd/kfd_peerdirect.o] Error 1
scripts/Makefile.build:598: recipe for target '/var/lib/dkms/amdgpu/17.50-552542/build/amd/amdkfd' failed
make[1]: *** [/var/lib/dkms/amdgpu/17.50-552542/build/amd/amdkfd] Error 2
Makefile:1543: recipe for target '_module_/var/lib/dkms/amdgpu/17.50-552542/build' failed
make: *** [_module_/var/lib/dkms/amdgpu/17.50-552542/build] Error 2
make: Leaving directory '/usr/src/linux-headers-4.15.0-13-generic'
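For anyone hitting the same error: as far as I know, the ACCESS_ONCE() macro was removed from the kernel headers in 4.15 in favour of READ_ONCE() and WRITE_ONCE(), which would explain both messages. With the macro gone, the compiler treats ACCESS_ONCE(...) as a call to an undeclared function, and a function's return value cannot be assigned to, hence "lvalue required". Below is a minimal userspace sketch of the kind of change involved, not the actual driver fix. WRITE_ONCE_DEMO, mem_context_demo and free_callback_demo are hypothetical names made up for illustration, and the macro is a simplified stand-in for the kernel's real WRITE_ONCE().

```c
/* Userspace sketch; builds with gcc or clang (uses the __typeof__ extension). */

/* Simplified stand-in for the kernel's WRITE_ONCE(). The real macro lives in
 * the kernel headers and handles more cases than this one-liner. */
#define WRITE_ONCE_DEMO(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

/* Hypothetical stand-in for the driver's context structure. */
struct mem_context_demo {
    int free_callback_called;
};

/* The failing line in the log uses the old form:
 *     ACCESS_ONCE(mem_context->free_callback_called) = 1;
 * The same store written in the WRITE_ONCE() style: */
void free_callback_demo(struct mem_context_demo *mem_context)
{
    WRITE_ONCE_DEMO(mem_context->free_callback_called, 1);
}
```

Calling free_callback_demo() on a zero-initialised struct leaves free_callback_called set to 1, which is the same effect the original assignment had on pre-4.15 kernels.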
I just tested a WX4100 on Ubuntu 16.04.03 using the 18.Q1.1 driver and it works flawlessly.
Prior to installing the driver on a clean Ubuntu OS, make sure you run the following commands:
Please make sure you use the following command to install, since you have a pre-Vega 10 board.
This works even on a 4.13.0-38-generic kernel.