Hi,
I've been getting crashes recently that seem to be related to my Radeon RX 6750 graphics card and/or its driver. A line like the following appears in /var/log/syslog every time a crash occurs:
[drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=11827, emitted seq=11830
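In case it helps anyone compare against their own logs, here is a minimal Python sketch for pulling these entries out of syslog. The pattern is modelled only on the single line quoted above, so other kernel versions may word the message differently; the names parse_ring_timeout and scan_log are my own hypothetical helpers, and "gap" is just the difference between the two sequence numbers, not an official metric.

```python
import re

# Pattern based only on the one syslog line quoted in this post (assumption:
# other kernel versions may format this message differently).
TIMEOUT_RE = re.compile(
    r"\*ERROR\* ring (?P<ring>\S+) timeout, "
    r"signaled seq=(?P<signaled>\d+), emitted seq=(?P<emitted>\d+)"
)


def parse_ring_timeout(line):
    """Return the ring name and sequence numbers from one log line, or None."""
    match = TIMEOUT_RE.search(line)
    if match is None:
        return None
    signaled = int(match.group("signaled"))
    emitted = int(match.group("emitted"))
    return {
        "ring": match.group("ring"),
        "signaled": signaled,
        "emitted": emitted,
        # Plain difference between the two numbers, for quick comparison.
        "gap": emitted - signaled,
    }


def scan_log(path="/var/log/syslog"):
    """Collect every ring-timeout entry found in the given log file."""
    with open(path, errors="replace") as handle:
        return [hit for hit in map(parse_ring_timeout, handle) if hit is not None]


if __name__ == "__main__":
    sample = (
        "[drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, "
        "signaled seq=11827, emitted seq=11830"
    )
    print(parse_ring_timeout(sample))
```

Reading /var/log/syslog may need elevated permissions depending on the system.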
I first noticed the issue a few weeks ago when a game I was playing crashed suddenly: the display froze and I was then sent back to the login screen. Since then, that game and other games no longer start properly and crash in the same way. I've also seen the crash in a few niche, non-game situations, such as (sometimes) restoring Spotify after it has been "minimized". I can reproduce it by enabling hardware acceleration in Firefox and clicking the "notifications" button on LinkedIn. A crash report I have from this contains the following line under the "GraphicsCriticalError" crash annotation:
|[0][GFX1-]: Detect DeviceReset DeviceResetReason::RESET DeviceResetDetectPlace::WR_POST_UPDATE in Parent process (t=1284.62) |[1][GFX1-]: Failed to create EGLSurface!: 0x3000 (t=1285.67) |[2][GFX1-]: Failed to create EGLSurface. 1 renderers, 0 active. (t=1285.67)
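Since that annotation is hard to read as one line, here is a rough Python sketch that splits it into separate entries. The pattern is inferred only from the line above (each entry looks like `|[index][tag]: message (t=seconds)`), so it is an assumption about the format rather than anything official, and split_graphics_errors is a hypothetical helper name.

```python
import re

# Entry layout inferred from the single annotation quoted in this post
# (assumption): |[index][tag]: message (t=seconds)
ENTRY_RE = re.compile(
    r"\|\[(?P<index>\d+)\]\[(?P<tag>[^\]]+)\]: "
    r"(?P<message>.*?) \(t=(?P<t>\d+(?:\.\d+)?)\)"
)


def split_graphics_errors(annotation):
    """Return a list of (index, tag, message, timestamp) tuples."""
    return [
        (int(m["index"]), m["tag"], m["message"], float(m["t"]))
        for m in ENTRY_RE.finditer(annotation)
    ]


if __name__ == "__main__":
    annotation = (
        "|[0][GFX1-]: Detect DeviceReset DeviceResetReason::RESET "
        "DeviceResetDetectPlace::WR_POST_UPDATE in Parent process (t=1284.62) "
        "|[1][GFX1-]: Failed to create EGLSurface!: 0x3000 (t=1285.67) "
        "|[2][GFX1-]: Failed to create EGLSurface. 1 renderers, 0 active. (t=1285.67)"
    )
    for entry in split_graphics_errors(annotation):
        print(entry)
```

Split this way, the device reset is logged first and the two EGLSurface failures follow about a second later.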
I'm using Ubuntu and have tried upgrading from 22.04 to 24.04, changing graphics drivers, upgrading my motherboard firmware, and clearing some caches, but nothing has worked. Any ideas? I suspect it's a Linux kernel issue, but I haven't been able to get package/system management to cooperate with booting into an older kernel and recompiling the graphics drivers for that kernel.