PC Drivers & Software

tweakit
Journeyman III

Linux - amdgpu ring gfx timeouts in various applications.

Hi,

 

amdgpu has been crashing on my computer since day 1. Various applications that use the iGPU make it crash, but I can reliably crash it with VS Code: editing code in a certain project always results in an amdgpu ring gfx timeout (see attached dmesg trace). Please let me know what I can do to help you find out what's going on. I'm quite knowledgeable, so please go ahead and suggest anything that would give you more information, because my computer is basically unusable at the moment and I'm pretty frustrated.

 

Things I've tried

  • Adjusting every BIOS setting I can think of.
  • Trying old and new kernels (Linux 6.12-rc6 still exhibits the crash).
  • Trying all BIOS versions for my DeskMini.
  • Trying linux-firmware versions going back to June 2023.
  • Using HDMI instead of DisplayPort (same result. Edit: I stand corrected, I can no longer reproduce the crash with HDMI, only with DisplayPort. That makes my computer usable again, but I can't use the full refresh rate of my screen, so it's not optimal).
  • Lowering the refresh rate from 120 Hz to 60 Hz (same result).

 

Hardware

ASRock DeskMini X600

Ryzen 7700

64 GB DDR5 SODIMM RAM @ 6000 MHz (the crashes are unrelated to memory speed)

Acer CP3271KP 4K 120Hz screen.

 

Software

Linux sophia 6.11.4-301.fc41.x86_64 #1 SMP PREEMPT_DYNAMIC Sun Oct 20 15:02:33 UTC 2024 x86_64 GNU/Linux

 

dmesg (shortened; full dmesg trace available here)

Around 3.204275 there's a DisplayPort warning during boot; it happens on every boot. Near the end are the dreaded ring timeouts: `ring gfx_0.1.0 timeout` (from gnome-shell) and `ring gfx_0.0.0 timeout` (from VS Code). I can reproduce this 100% reliably in VS Code; it crashes every time. Other applications, like GNOME Terminal and the Brave browser, also crash. I don't know whether the DisplayPort setup issue is related.

```

[ 0.000000] Linux version 6.11.4-301.fc41.x86_64 (mockbuild@9b6b61418589428cb880a7020233b56f) (gcc (GCC) 14.2.1 20240912 (Red Hat 14.2.1-3), GNU ld version 2.43.1-2.fc41) #1 SMP PREEMPT_DYNAMIC Sun Oct 20 15:02:33 UTC 2024
** REMOVED ** 
[ 0.342098] AMD-Vi: Extended features (0x246577efa2254afa, 0x0): PPR NX GT [5] IA GA PC GA_vAPIC
[ 0.342104] AMD-Vi: Interrupt remapping enabled
[ 0.391972] AMD-Vi: Virtual APIC enabled
[ 0.391981] PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
[ 0.391983] software IO TLB: mapped [mem 0x0000000079fc7000-0x000000007dfc7000] (64MB)
[ 0.392035] LVT offset 0 assigned for vector 0x400
[ 0.392162] perf: AMD IBS detected (0x00000bff)
[ 0.392227] amd_uncore: 16 amd_df counters detected
[ 0.392233] amd_uncore: 6 amd_l3 counters detected
[ 0.392237] amd_uncore: 4 amd_umc_0 counters detected
[ 0.392239] amd_uncore: 4 amd_umc_1 counters detected
[ 0.392294] perf/amd_iommu: Detected AMD IOMMU #0 (2 banks, 4 counters/bank).
** REMOVED ** 
[ 0.633662] [drm] Initialized simpledrm 1.0.0 for simple-framebuffer.0 on minor 0
[ 0.634047] fbcon: Deferring console take-over
[ 0.634048] simple-framebuffer simple-framebuffer.0: [drm] fb0: simpledrmdrmfb frame buffer device
[ 0.634092] ccp 0000:05:00.2: enabling device (0000 -> 0002)
[ 0.635777] ccp 0000:05:00.2: tee enabled
[ 0.635940] ccp 0000:05:00.2: psp enabled
[ 0.635969] hid: raw HID events driver (C) Jiri Kosina
[ 0.635986] usbcore: registered new interface driver usbhid
[ 0.635987] usbhid: USB HID core driver
[ 0.636054] drop_monitor: Initializing network drop monitor service
[ 0.636113] Initializing XFRM netlink socket
[ 0.636132] NET: Registered PF_INET6 protocol family
[ 0.638321] Segment Routing with IPv6
[ 0.638323] RPL Segment Routing with IPv6
[ 0.638329] In-situ OAM (IOAM) with IPv6
[ 0.638345] mip6: Mobile IPv6
[ 0.638347] NET: Registered PF_PACKET protocol family
[ 0.639135] microcode: Current revision: 0x0a601206
[ 0.639325] resctrl: L3 allocation detected
[ 0.639326] resctrl: MB allocation detected
[ 0.639327] resctrl: SMBA allocation detected
[ 0.639327] resctrl: L3 monitoring detected
[ 0.639348] IPI shorthand broadcast: enabled
[ 0.639404] AES CTR mode by8 optimization enabled
[ 0.641298] sched_clock: Marking stable (639404341, 1366624)->(656332383, -15561418)
[ 0.641380] registered taskstats version 1
[ 0.641642] Loading compiled-in X.509 certificates
[ 0.642320] Loaded X.509 cert 'Fedora kernel signing key: e799dbf590b7c1bd2971a3affe1a4f27baa5c739'
[ 0.647663] Loaded X.509 cert 'Fedora IMA CA: a8a00c31663f853f9c6ff2564872e378af026b28'
[ 0.651931] Demotion targets for Node 0: null
[ 0.651933] page_owner is disabled
[ 0.651966] Key type .fscrypt registered
[ 0.651967] Key type fscrypt-provisioning registered
[ 0.652413] Btrfs loaded, zoned=yes, fsverity=yes
[ 0.652427] Key type big_key registered
[ 0.652432] Key type trusted registered
[ 0.664958] Key type encrypted registered
[ 0.665013] integrity: Loading X.509 certificate: UEFI:db
[ 0.665030] integrity: Loaded X.509 cert 'Microsoft Corporation UEFI CA 2011: 13adbf4309bd82709c8cd54f316ed522988a1bd4'
[ 0.665032] integrity: Loading X.509 certificate: UEFI:db
[ 0.665042] integrity: Loaded X.509 cert 'Microsoft Windows Production PCA 2011: a92902398e16c49778cd90f99e4f9ae17c55af53'
[ 0.665718] Loading compiled-in module X.509 certificates
[ 0.665985] Loaded X.509 cert 'Fedora kernel signing key: e799dbf590b7c1bd2971a3affe1a4f27baa5c739'
[ 0.665987] ima: Allocated hash algorithm: sha256
[ 0.751832] ima: No architecture policies found
[ 0.751846] evm: Initialising EVM extended attributes:
[ 0.751847] evm: security.selinux
[ 0.751848] evm: security.SMACK64 (disabled)
[ 0.751848] evm: security.SMACK64EXEC (disabled)
[ 0.751849] evm: security.SMACK64TRANSMUTE (disabled)
[ 0.751850] evm: security.SMACK64MMAP (disabled)
[ 0.751850] evm: security.apparmor (disabled)
[ 0.751851] evm: security.ima
[ 0.751851] evm: security.capability
[ 0.751852] evm: HMAC attrs: 0x1
[ 0.790750] alg: No test for 842 (842-scomp)
[ 0.790768] alg: No test for 842 (842-generic)
[ 0.856553] PM: Magic number: 0:884:440
[ 0.856575] clockevents broadcast: hash matches
[ 0.859232] RAS: Correctable Errors collector initialized.
[ 0.867039] clk: Disabling unused clocks
[ 0.867041] PM: genpd: Disabling unused power domains
[ 0.867369] usb 3-1: new full-speed USB device number 2 using xhci_hcd
[ 0.867381] usb 5-1: new high-speed USB device number 2 using xhci_hcd
[ 0.935717] ata1: SATA link down (SStatus 0 SControl 300)
[ 0.996648] usb 5-1: New USB device found, idVendor=05e3, idProduct=0610, bcdDevice=60.60
[ 0.996650] usb 5-1: New USB device strings: Mfr=0, Product=1, SerialNumber=0
[ 0.996651] usb 5-1: Product: USB2.0 Hub
[ 1.040203] usb 3-1: New USB device found, idVendor=21b4, idProduct=0081, bcdDevice= 1.20
[ 1.040205] usb 3-1: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[ 1.040206] usb 3-1: Product: AudioQuest DragonFly
[ 1.040207] usb 3-1: Manufacturer: AudioQuest inc.
[ 1.040208] usb 3-1: SerialNumber: (C) 2013 Wavelength Audio, ltd.
[ 1.048421] hub 5-1:1.0: USB hub found
[ 1.049026] hub 5-1:1.0: 4 ports detected
[ 1.112574] usb 4-2: new SuperSpeed USB device number 2 using xhci_hcd
[ 1.132047] usb 4-2: New USB device found, idVendor=0bda, idProduct=0411, bcdDevice= 1.01
[ 1.132049] usb 4-2: New USB device strings: Mfr=1, Product=2, SerialNumber=0
[ 1.132050] usb 4-2: Product: USB3.2 Hub
[ 1.132051] usb 4-2: Manufacturer: Generic
[ 1.157477] hub 4-2:1.0: USB hub found
[ 1.158210] hub 4-2:1.0: 4 ports detected
[ 1.241602] usb 3-2: new high-speed USB device number 3 using xhci_hcd
[ 1.248656] ata2: SATA link down (SStatus 0 SControl 300)
[ 1.249628] Freeing unused decrypted memory: 2028K
[ 1.249916] Freeing unused kernel image (initmem) memory: 4776K
[ 1.249922] Write protecting the kernel read-only data: 36864k
[ 1.250090] Freeing unused kernel image (rodata/data gap) memory: 352K
[ 1.278344] x86/mm: Checked W+X mappings: passed, no W+X pages found.
[ 1.278347] Run /init as init process
[ 1.278349] with arguments:
[ 1.278350] /init
[ 1.278350] rhgb
[ 1.278351] with environment:
[ 1.278352] HOME=/
[ 1.278352] TERM=linux
[ 1.278353] BOOT_IMAGE=(hd0,gpt2)/vmlinuz-6.11.4-301.fc41.x86_64
** REMOVED ** 
[ 3.039336] [drm] amdgpu kernel modesetting enabled.
[ 3.044564] amdgpu: Virtual CRAT table created for CPU
[ 3.044575] amdgpu: Topology: Add CPU node
[ 3.044654] amdgpu 0000:05:00.0: enabling device (0006 -> 0007)
[ 3.044678] [drm] initializing kernel modesetting (IP DISCOVERY 0x1002:0x164E 0x1002:0x164E 0xC5).
[ 3.044685] [drm] register mmio base: 0xF6A00000
[ 3.044686] [drm] register mmio size: 524288
[ 3.046284] [drm] add ip block number 0 <nv_common>
[ 3.046286] [drm] add ip block number 1 <gmc_v10_0>
[ 3.046287] [drm] add ip block number 2 <navi10_ih>
[ 3.046287] [drm] add ip block number 3 <psp>
[ 3.046288] [drm] add ip block number 4 <smu>
[ 3.046289] [drm] add ip block number 5 <dm>
[ 3.046290] [drm] add ip block number 6 <gfx_v10_0>
[ 3.046291] [drm] add ip block number 7 <sdma_v5_2>
[ 3.046292] [drm] add ip block number 8 <vcn_v3_0>
[ 3.046292] [drm] add ip block number 9 <jpeg_v3_0>
[ 3.046300] amdgpu 0000:05:00.0: amdgpu: Fetched VBIOS from VFCT
[ 3.046302] amdgpu: ATOM BIOS: 102-RAPHAEL-008
[ 3.060641] BTRFS: device label fedora_fedora devid 1 transid 38833 /dev/nvme0n1p3 (259:3) scanned by mount (628)
[ 3.060829] BTRFS info (device nvme0n1p3): first mount of filesystem 827e3102-a5fb-4412-8b93-c6f1d4b0d770
[ 3.060836] BTRFS info (device nvme0n1p3): using crc32c (crc32c-intel) checksum algorithm
[ 3.060840] BTRFS info (device nvme0n1p3): using free-space-tree
[ 3.070441] hid-generic 0003:04D9:1919.0001: input,hidraw0: USB HID v1.10 Keyboard [DasKeyboard] on usb-0000:05:00.4-2.2/input0
[ 3.081533] input: DasKeyboard as /devices/pci0000:00/0000:00:08.1/0000:05:00.4/usb3/3-2/3-2.2/3-2.2:1.1/0003:04D9:1919.0002/input/input4
[ 3.104517] amdgpu 0000:05:00.0: vgaarb: deactivate vga console
[ 3.104520] amdgpu 0000:05:00.0: amdgpu: Trusted Memory Zone (TMZ) feature disabled as experimental (default)
[ 3.104619] [drm] vm size is 262144 GB, 4 levels, block size is 9-bit, fragment size is 9-bit
[ 3.104624] amdgpu 0000:05:00.0: amdgpu: VRAM: 512M 0x000000F400000000 - 0x000000F41FFFFFFF (512M used)
[ 3.104626] amdgpu 0000:05:00.0: amdgpu: GART: 1024M 0x0000000000000000 - 0x000000003FFFFFFF
[ 3.104630] [drm] Detected VRAM RAM=512M, BAR=512M
[ 3.104631] [drm] RAM width 128bits DDR5
[ 3.104709] [drm] amdgpu: 512M of VRAM memory ready
[ 3.104711] [drm] amdgpu: 31683M of GTT memory ready.
[ 3.104722] [drm] GART: num cpu pages 262144, num gpu pages 262144
[ 3.104825] [drm] PCIE GART of 1024M enabled (table at 0x000000F41FC00000).
[ 3.105078] [drm] Loading DMUB firmware via PSP: version=0x05001C00
[ 3.105406] [drm] use_doorbell being set to: [true]
[ 3.105417] [drm] Found VCN firmware Version ENC: 1.31 DEC: 3 VEP: 0 Revision: 3
[ 3.127480] amdgpu 0000:05:00.0: amdgpu: reserve 0xa00000 from 0xf41e000000 for PSP TMR
[ 3.133437] hid-generic 0003:04D9:1919.0002: input,hidraw1: USB HID v1.10 Device [DasKeyboard] on usb-0000:05:00.4-2.2/input1
[ 3.189474] amdgpu 0000:05:00.0: amdgpu: RAS: optional ras ta ucode is not available
[ 3.195265] amdgpu 0000:05:00.0: amdgpu: RAP: optional rap ta ucode is not available
[ 3.195267] amdgpu 0000:05:00.0: amdgpu: SECUREDISPLAY: securedisplay ta ucode is not available
[ 3.196450] amdgpu 0000:05:00.0: amdgpu: SMU is initialized successfully!
[ 3.196455] [drm] Seamless boot condition check passed
[ 3.196788] [drm] Display Core v3.2.291 initialized on DCN 3.1.5
[ 3.196790] [drm] DP-HDMI FRL PCON supported
[ 3.197546] [drm] DMUB hardware initialized: version=0x05001C00
[ 3.204275] ------------[ cut here ]------------
[ 3.204277] WARNING: CPU: 1 PID: 519 at drivers/gpu/drm/amd/amdgpu/../display/dc/link/protocols/link_dp_capability.c:1544 dp_retrieve_lttpr_cap+0x121/0x1e0 [amdgpu]
[ 3.204637] Modules linked in: amdgpu(+) amdxcp i2c_algo_bit drm_ttm_helper ttm crct10dif_pclmul drm_exec crc32_pclmul gpu_sched crc32c_intel polyval_clmulni drm_suballoc_helper polyval_generic drm_buddy nvme ghash_clmulni_intel drm_display_helper sha512_ssse3 nvme_core sha256_ssse3 sha1_ssse3 cec sp5100_tco nvme_auth video wmi ip6_tables ip_tables fuse
[ 3.204665] CPU: 1 UID: 0 PID: 519 Comm: (udev-worker) Not tainted 6.11.4-301.fc41.x86_64 #1
[ 3.204669] Hardware name: ASRock X600M-STX/X600M-STX, BIOS 4.03 07/11/2024
[ 3.204670] RIP: 0010:dp_retrieve_lttpr_cap+0x121/0x1e0 [amdgpu]
[ 3.204988] Code: 48 21 c8 48 c1 e2 38 48 09 d0 48 89 85 98 02 00 00 f6 85 c4 02 00 00 02 74 42 e8 2a eb ff ff 84 c0 75 39 48 8b 85 d8 01 00 00 <0f> 0b c6 85 9c 02 00 00 80 48 8b 40 10 48 8b 30 48 85 f6 74 04 48
[ 3.204989] RSP: 0018:ffffa11ac0d7f3d0 EFLAGS: 00010246
[ 3.204991] RAX: ffff8d9f475f0500 RBX: 0000000000000001 RCX: 00ffffffffffffff
[ 3.204993] RDX: 0000000000000007 RSI: ffffa11ac0d7f3d0 RDI: 0000000000000000
[ 3.204994] RBP: ffff8d9f58407000 R08: ffff8d9f51ea1280 R09: 00000000000f0000
[ 3.204995] R10: 0000000000000154 R11: 0000000000000001 R12: 0000000000000001
[ 3.204996] R13: ffff8d9f58407000 R14: 0000000000000020 R15: 0000000000000000
[ 3.204997] FS: 00007f5279b5a040(0000) GS:ffff8dae3e080000(0000) knlGS:0000000000000000
[ 3.204999] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 3.205000] CR2: 00007f700a7f4120 CR3: 000000010fb64000 CR4: 0000000000f50ef0
[ 3.205001] PKRU: 55555554
[ 3.205002] Call Trace:
[ 3.205005] <TASK>
[ 3.205006] ? dp_retrieve_lttpr_cap+0x121/0x1e0 [amdgpu]
[ 3.205222] ? __warn.cold+0x8e/0xe8
[ 3.205225] ? dp_retrieve_lttpr_cap+0x121/0x1e0 [amdgpu]
[ 3.205453] ? report_bug+0xff/0x140
[ 3.205456] ? handle_bug+0x3c/0x80
[ 3.205458] ? exc_invalid_op+0x17/0x70
[ 3.205459] ? asm_exc_invalid_op+0x1a/0x20
[ 3.205463] ? dp_retrieve_lttpr_cap+0x121/0x1e0 [amdgpu]
[ 3.205652] retrieve_link_cap+0x79/0xd90 [amdgpu]
[ 3.205830] ? srso_alias_return_thunk+0x5/0xfbef5
[ 3.205832] detect_link_and_local_sink+0xc40/0x1210 [amdgpu]
[ 3.206092] ? wq_update_node_max_active+0x12b/0x260
[ 3.206100] link_detect+0x37/0x540 [amdgpu]
[ 3.206331] ? srso_alias_return_thunk+0x5/0xfbef5
[ 3.206333] ? dal_gpio_destroy_irq+0x25/0x40 [amdgpu]
[ 3.206598] ? srso_alias_return_thunk+0x5/0xfbef5
[ 3.206599] ? query_hpd_status+0x6e/0xa0 [amdgpu]
[ 3.206794] amdgpu_dm_init+0x107d/0x2580 [amdgpu]
[ 3.206994] ? __pfx_enable_assr+0x10/0x10 [amdgpu]
[ 3.207180] ? __pfx_update_config+0x10/0x10 [amdgpu]
[ 3.207360] dm_hw_init+0x13/0x30 [amdgpu]
[ 3.207537] amdgpu_device_init.cold+0x1c4d/0x1f9a [amdgpu]
[ 3.207736] ? pci_bus_read_config_word+0x4a/0x90
[ 3.207740] amdgpu_driver_load_kms+0x19/0x70 [amdgpu]
[ 3.207890] amdgpu_pci_probe+0x1ae/0x4b0 [amdgpu]
[ 3.208033] local_pci_probe+0x42/0x90
[ 3.208036] pci_device_probe+0xc1/0x2a0
[ 3.208038] really_probe+0xdb/0x340
[ 3.208040] ? pm_runtime_barrier+0x54/0x90
[ 3.208042] ? __pfx___driver_attach+0x10/0x10
[ 3.208044] __driver_probe_device+0x78/0x110
[ 3.208045] driver_probe_device+0x1f/0xa0
[ 3.208047] __driver_attach+0xba/0x1c0
[ 3.208049] bus_for_each_dev+0x8c/0xe0
[ 3.208052] bus_add_driver+0x142/0x220
[ 3.208053] driver_register+0x72/0xd0
[ 3.208055] ? __pfx_amdgpu_init+0x10/0x10 [amdgpu]
[ 3.208196] do_one_initcall+0x58/0x310
[ 3.208200] do_init_module+0x90/0x260
[ 3.208202] __do_sys_init_module+0x17a/0x1b0
[ 3.208206] do_syscall_64+0x82/0x160
[ 3.208209] ? srso_alias_return_thunk+0x5/0xfbef5
[ 3.208211] ? __mod_memcg_lruvec_state+0xdf/0x1e0
[ 3.208213] ? srso_alias_return_thunk+0x5/0xfbef5
[ 3.208214] ? __lruvec_stat_mod_folio+0x83/0xd0
[ 3.208216] ? srso_alias_return_thunk+0x5/0xfbef5
[ 3.208217] ? set_ptes.isra.0+0x41/0x90
[ 3.208219] ? srso_alias_return_thunk+0x5/0xfbef5
[ 3.208220] ? do_anonymous_page+0xfc/0x8e0
[ 3.208222] ? __pte_offset_map+0x1b/0x180
[ 3.208224] ? srso_alias_return_thunk+0x5/0xfbef5
[ 3.208225] ? __handle_mm_fault+0xc02/0x1040
[ 3.208227] ? srso_alias_return_thunk+0x5/0xfbef5
[ 3.208228] ? __do_sys_mremap+0x30c/0x980
[ 3.208231] ? srso_alias_return_thunk+0x5/0xfbef5
[ 3.208232] ? __count_memcg_events+0x75/0x130
[ 3.208234] ? srso_alias_return_thunk+0x5/0xfbef5
[ 3.208235] ? count_memcg_events.constprop.0+0x1a/0x30
[ 3.208236] ? srso_alias_return_thunk+0x5/0xfbef5
[ 3.208237] ? handle_mm_fault+0x21b/0x330
[ 3.208239] ? srso_alias_return_thunk+0x5/0xfbef5
[ 3.208240] ? do_user_addr_fault+0x55a/0x7b0
[ 3.208243] ? srso_alias_return_thunk+0x5/0xfbef5
[ 3.208244] ? exc_page_fault+0x7e/0x180
[ 3.208247] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 3.208249] RIP: 0033:0x7f527a3fd92e
[ 3.208251] Code: 48 8b 0d e5 24 0f 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 49 89 ca b8 af 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d b2 24 0f 00 f7 d8 64 89 01 48
[ 3.208252] RSP: 002b:00007ffe34f9ba08 EFLAGS: 00000246 ORIG_RAX: 00000000000000af
[ 3.208253] RAX: ffffffffffffffda RBX: 000055ddae6b0ab0 RCX: 00007f527a3fd92e
[ 3.208254] RDX: 00007f5278ecb3bd RSI: 0000000002965356 RDI: 00007f5274e00010
[ 3.208255] RBP: 00007ffe34f9bac0 R08: 000055ddae66c010 R09: 0000000000000007
[ 3.208256] R10: 0000000000000004 R11: 0000000000000246 R12: 00007f5278ecb3bd
[ 3.208257] R13: 0000000000020000 R14: 000055ddae6afd40 R15: 000055ddae6b4e20
[ 3.208259] </TASK>
[ 3.208260] ---[ end trace 0000000000000000 ]---
[ 3.210387] usb 3-2.3: new high-speed USB device number 5 using xhci_hcd
[ 3.302347] usb 3-2.3: New USB device found, idVendor=05e3, idProduct=0608, bcdDevice=32.98
[ 3.302350] usb 3-2.3: New USB device strings: Mfr=0, Product=1, SerialNumber=0
[ 3.302352] usb 3-2.3: Product: USB2.0 Hub
[ 3.318234] [drm] kiq ring mec 2 pipe 1 q 0
[ 3.321124] kfd kfd: amdgpu: Allocated 3969056 bytes on gart
[ 3.321132] kfd kfd: amdgpu: Total number of KFD nodes to be created: 1
[ 3.321672] amdgpu: Virtual CRAT table created for GPU
[ 3.321767] amdgpu: Topology: Add dGPU node [0x164e:0x1002]
[ 3.321769] kfd kfd: amdgpu: added device 1002:164e
[ 3.321776] amdgpu 0000:05:00.0: amdgpu: SE 1, SH per SE 1, CU per SH 2, active_cu_number 2
[ 3.321779] amdgpu 0000:05:00.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
[ 3.321781] amdgpu 0000:05:00.0: amdgpu: ring gfx_0.1.0 uses VM inv eng 1 on hub 0
[ 3.321782] amdgpu 0000:05:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 4 on hub 0
[ 3.321783] amdgpu 0000:05:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 5 on hub 0
[ 3.321784] amdgpu 0000:05:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 6 on hub 0
[ 3.321785] amdgpu 0000:05:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 7 on hub 0
[ 3.321786] amdgpu 0000:05:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 8 on hub 0
[ 3.321787] amdgpu 0000:05:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 9 on hub 0
[ 3.321787] amdgpu 0000:05:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 10 on hub 0
[ 3.321788] amdgpu 0000:05:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 11 on hub 0
[ 3.321789] amdgpu 0000:05:00.0: amdgpu: ring kiq_0.2.1.0 uses VM inv eng 12 on hub 0
[ 3.321790] amdgpu 0000:05:00.0: amdgpu: ring sdma0 uses VM inv eng 13 on hub 0
[ 3.321791] amdgpu 0000:05:00.0: amdgpu: ring vcn_dec_0 uses VM inv eng 0 on hub 8
[ 3.321792] amdgpu 0000:05:00.0: amdgpu: ring vcn_enc_0.0 uses VM inv eng 1 on hub 8
[ 3.321793] amdgpu 0000:05:00.0: amdgpu: ring vcn_enc_0.1 uses VM inv eng 4 on hub 8
[ 3.321794] amdgpu 0000:05:00.0: amdgpu: ring jpeg_dec uses VM inv eng 5 on hub 8
[ 3.323964] amdgpu 0000:05:00.0: amdgpu: Runtime PM not available
[ 3.324305] [drm] Initialized amdgpu 3.59.0 for 0000:05:00.0 on minor 1
[ 3.330349] fbcon: amdgpudrmfb (fb0) is primary device
[ 3.330351] fbcon: Deferring console take-over
[ 3.330352] amdgpu 0000:05:00.0: [drm] fb0: amdgpudrmfb frame buffer device
** REMOVED **
[ 645.008871] amdgpu 0000:05:00.0: amdgpu: ring gfx_0.1.0 timeout, signaled seq=9386, emitted seq=9388
[ 645.008878] amdgpu 0000:05:00.0: amdgpu: Process information: process gnome-shell pid 1592 thread gnome-shel:cs0 pid 1625
[ 645.008880] amdgpu 0000:05:00.0: amdgpu: GPU reset begin!
[ 645.123713] amdgpu 0000:05:00.0: amdgpu: Dumping IP State
[ 645.124245] amdgpu 0000:05:00.0: amdgpu: Dumping IP State Completed
[ 645.124247] amdgpu 0000:05:00.0: amdgpu: MODE2 reset
[ 645.131194] amdgpu 0000:05:00.0: amdgpu: GPU reset succeeded, trying to resume
[ 645.131296] [drm] PCIE GART of 1024M enabled (table at 0x000000F41FC00000).
[ 645.131355] [drm] VRAM is lost due to GPU reset!
[ 645.131357] amdgpu 0000:05:00.0: amdgpu: PSP is resuming...
[ 645.152861] amdgpu 0000:05:00.0: amdgpu: reserve 0xa00000 from 0xf41e000000 for PSP TMR
[ 645.336569] amdgpu 0000:05:00.0: amdgpu: RAS: optional ras ta ucode is not available
[ 645.342161] amdgpu 0000:05:00.0: amdgpu: RAP: optional rap ta ucode is not available
[ 645.342163] amdgpu 0000:05:00.0: amdgpu: SECUREDISPLAY: securedisplay ta ucode is not available
[ 645.342165] amdgpu 0000:05:00.0: amdgpu: SMU is resuming...
[ 645.342537] amdgpu 0000:05:00.0: amdgpu: SMU is resumed successfully!
[ 645.343292] [drm] DMUB hardware initialized: version=0x05001C00
[ 645.416909] [drm] kiq ring mec 2 pipe 1 q 0
[ 645.419093] amdgpu 0000:05:00.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
[ 645.419095] amdgpu 0000:05:00.0: amdgpu: ring gfx_0.1.0 uses VM inv eng 1 on hub 0
[ 645.419096] amdgpu 0000:05:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 4 on hub 0
[ 645.419097] amdgpu 0000:05:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 5 on hub 0
[ 645.419099] amdgpu 0000:05:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 6 on hub 0
[ 645.419100] amdgpu 0000:05:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 7 on hub 0
[ 645.419101] amdgpu 0000:05:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 8 on hub 0
[ 645.419102] amdgpu 0000:05:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 9 on hub 0
[ 645.419103] amdgpu 0000:05:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 10 on hub 0
[ 645.419104] amdgpu 0000:05:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 11 on hub 0
[ 645.419105] amdgpu 0000:05:00.0: amdgpu: ring kiq_0.2.1.0 uses VM inv eng 12 on hub 0
[ 645.419106] amdgpu 0000:05:00.0: amdgpu: ring sdma0 uses VM inv eng 13 on hub 0
[ 645.419107] amdgpu 0000:05:00.0: amdgpu: ring vcn_dec_0 uses VM inv eng 0 on hub 8
[ 645.419109] amdgpu 0000:05:00.0: amdgpu: ring vcn_enc_0.0 uses VM inv eng 1 on hub 8
[ 645.419110] amdgpu 0000:05:00.0: amdgpu: ring vcn_enc_0.1 uses VM inv eng 4 on hub 8
[ 645.419111] amdgpu 0000:05:00.0: amdgpu: ring jpeg_dec uses VM inv eng 5 on hub 8
[ 645.424812] amdgpu 0000:05:00.0: amdgpu: recover vram bo from shadow start
[ 645.424814] amdgpu 0000:05:00.0: amdgpu: recover vram bo from shadow done
[ 645.424828] amdgpu 0000:05:00.0: amdgpu: GPU reset(2) succeeded!
[ 645.436995] [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
[ 646.399037] rfkill: input handler enabled
[ 647.256045] rfkill: input handler disabled
[ 658.030618] rfkill: input handler enabled
[ 658.721289] rfkill: input handler disabled
[ 6236.435990] amdgpu 0000:05:00.0: amdgpu: ring gfx_0.0.0 timeout, signaled seq=150626, emitted seq=150628
[ 6236.435996] amdgpu 0000:05:00.0: amdgpu: Process information: process code pid 31269 thread code:cs0 pid 31282
[ 6236.435999] amdgpu 0000:05:00.0: amdgpu: GPU reset begin!
[ 6236.552128] amdgpu 0000:05:00.0: amdgpu: Dumping IP State
[ 6236.552661] amdgpu 0000:05:00.0: amdgpu: Dumping IP State Completed
[ 6236.552663] amdgpu 0000:05:00.0: amdgpu: MODE2 reset
[ 6236.559990] amdgpu 0000:05:00.0: amdgpu: GPU reset succeeded, trying to resume
[ 6236.560102] [drm] PCIE GART of 1024M enabled (table at 0x000000F41FC00000).
[ 6236.560168] [drm] VRAM is lost due to GPU reset!
[ 6236.560169] amdgpu 0000:05:00.0: amdgpu: PSP is resuming...
[ 6236.581687] amdgpu 0000:05:00.0: amdgpu: reserve 0xa00000 from 0xf41e000000 for PSP TMR
[ 6236.763171] amdgpu 0000:05:00.0: amdgpu: RAS: optional ras ta ucode is not available
[ 6236.768866] amdgpu 0000:05:00.0: amdgpu: RAP: optional rap ta ucode is not available
[ 6236.768868] amdgpu 0000:05:00.0: amdgpu: SECUREDISPLAY: securedisplay ta ucode is not available
[ 6236.768870] amdgpu 0000:05:00.0: amdgpu: SMU is resuming...
[ 6236.769601] amdgpu 0000:05:00.0: amdgpu: SMU is resumed successfully!
[ 6236.770349] [drm] DMUB hardware initialized: version=0x05001C00
[ 6236.843950] [drm] kiq ring mec 2 pipe 1 q 0
[ 6236.846000] amdgpu 0000:05:00.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
[ 6236.846003] amdgpu 0000:05:00.0: amdgpu: ring gfx_0.1.0 uses VM inv eng 1 on hub 0
[ 6236.846004] amdgpu 0000:05:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 4 on hub 0
[ 6236.846005] amdgpu 0000:05:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 5 on hub 0
[ 6236.846006] amdgpu 0000:05:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 6 on hub 0
[ 6236.846007] amdgpu 0000:05:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 7 on hub 0
[ 6236.846008] amdgpu 0000:05:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 8 on hub 0
[ 6236.846009] amdgpu 0000:05:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 9 on hub 0
[ 6236.846010] amdgpu 0000:05:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 10 on hub 0
[ 6236.846012] amdgpu 0000:05:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 11 on hub 0
[ 6236.846013] amdgpu 0000:05:00.0: amdgpu: ring kiq_0.2.1.0 uses VM inv eng 12 on hub 0
[ 6236.846014] amdgpu 0000:05:00.0: amdgpu: ring sdma0 uses VM inv eng 13 on hub 0
[ 6236.846015] amdgpu 0000:05:00.0: amdgpu: ring vcn_dec_0 uses VM inv eng 0 on hub 8
[ 6236.846016] amdgpu 0000:05:00.0: amdgpu: ring vcn_enc_0.0 uses VM inv eng 1 on hub 8
[ 6236.846017] amdgpu 0000:05:00.0: amdgpu: ring vcn_enc_0.1 uses VM inv eng 4 on hub 8
[ 6236.846018] amdgpu 0000:05:00.0: amdgpu: ring jpeg_dec uses VM inv eng 5 on hub 8
[ 6236.851044] amdgpu 0000:05:00.0: amdgpu: recover vram bo from shadow start
[ 6236.851046] amdgpu 0000:05:00.0: amdgpu: recover vram bo from shadow done
[ 6236.851060] amdgpu 0000:05:00.0: amdgpu: GPU reset(4) succeeded!
[ 6237.844703] rfkill: input handler enabled
[ 6238.654849] rfkill: input handler disabled
[ 6247.738400] rfkill: input handler enabled
[ 6248.407572] rfkill: input handler disabled

```
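
In case it helps anyone triaging a long trace like this one, here is a minimal sketch in Python that pulls the ring timeouts and the blamed process out of saved dmesg output. It is not part of amdgpu or any official tool. The names `RingTimeout` and `find_ring_timeouts` are made up for illustration, and the regexes assume the message wording seen in the trace above; other kernel versions may word these lines differently.

```python
import re
from typing import NamedTuple, Optional


class RingTimeout(NamedTuple):
    """One 'ring ... timeout' event (hypothetical helper type)."""
    timestamp: float        # seconds since boot, from the dmesg prefix
    ring: str               # e.g. "gfx_0.1.0"
    signaled: int           # last fence sequence number the GPU completed
    emitted: int            # last fence sequence number the driver submitted
    process: Optional[str]  # process named on the following log line, if any


# Assumed format, copied from the trace in this post:
# "[ 645.008871] amdgpu 0000:05:00.0: amdgpu: ring gfx_0.1.0 timeout, signaled seq=9386, emitted seq=9388"
TIMEOUT_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s.*amdgpu: ring (?P<ring>\S+) timeout, "
    r"signaled seq=(?P<signaled>\d+), emitted seq=(?P<emitted>\d+)"
)
# "... amdgpu: Process information: process gnome-shell pid 1592 thread ..."
PROCESS_RE = re.compile(
    r"amdgpu: Process information: process (?P<name>\S+) pid \d+"
)


def find_ring_timeouts(log_text: str) -> list[RingTimeout]:
    """Return every ring timeout in log_text, in order of appearance."""
    events: list[RingTimeout] = []
    for line in log_text.splitlines():
        timeout = TIMEOUT_RE.search(line)
        if timeout:
            events.append(RingTimeout(
                timestamp=float(timeout["ts"]),
                ring=timeout["ring"],
                signaled=int(timeout["signaled"]),
                emitted=int(timeout["emitted"]),
                process=None,
            ))
            continue
        # Attach the process name to the most recent timeout that lacks one.
        proc = PROCESS_RE.search(line)
        if proc and events and events[-1].process is None:
            events[-1] = events[-1]._replace(process=proc["name"])
    return events
```

Feeding it the text of the trace above (for example, saved to a file and read with `open(...).read()`) should yield two events: `gfx_0.1.0` attributed to gnome-shell and `gfx_0.0.0` attributed to code.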
