
GUI hangs while an OpenCL app is running under heavy load.

Question asked by matszpk on Dec 2, 2015
Latest reply on Mar 21, 2016 by matszpk

Hi. I have run into a severe problem whenever an OpenCL application runs non-stop: the GUI (X server) hangs and stops responding. I am using the KDE Plasma 5 desktop on the openSUSE Leap 42.1 distro. The KDE desktop runs with the XRender or OpenGL compositor (desktop effects are enabled). I have a Radeon HD 7850, and it is running on the latest Radeon Crimson 15.11 drivers.

Thanks to remote access, I was able to extract an excerpt of the logs (using journalctl):

 

Nov 26 16:17:24 gigas sshd[2498]: pam_unix(sshd:session): session opened for user root by (uid=0)

Nov 26 16:19:12 gigas kernel: <6>[fglrx] ASIC hang happened

Nov 26 16:19:12 gigas kernel: CPU: 1 PID: 1439 Comm: X Tainted: P           O    4.1.12-1-default #1

Nov 26 16:19:12 gigas kernel: Hardware name: Gigabyte Technology Co., Ltd. To be filled by O.E.M./Z77-DS3H, BIOS F8 08/21/2012

Nov 26 16:19:12 gigas kernel:  0000000000000000 00000001000129a2 ffffffff81658898 0000000000000000

Nov 26 16:19:12 gigas kernel:  ffffffffa048437c 0000000000000000 ffffffffa05523f7 ffff88040aab7c28

Nov 26 16:19:12 gigas kernel:  ffffffffa0552356 ffffc90003141620 0000000000000001 ffffc90003140020

Nov 26 16:19:12 gigas kernel: Call Trace:

Nov 26 16:19:12 gigas kernel:  [<ffffffff8100559c>] dump_trace+0x8c/0x340

Nov 26 16:19:12 gigas kernel:  [<ffffffff8100594c>] show_stack_log_lvl+0xfc/0x1a0

Nov 26 16:19:12 gigas kernel:  [<ffffffff81006ea1>] show_stack+0x21/0x50

Nov 26 16:19:12 gigas kernel:  [<ffffffff81658898>] dump_stack+0x47/0x67

Nov 26 16:19:12 gigas kernel:  [<ffffffffa048437c>] firegl_hardwareHangRecovery+0x1c/0x30 [fglrx]

Nov 26 16:19:12 gigas kernel:  [<ffffffffa05523f7>] _ZN4Asic9WaitUntil15ResetASICIfHungEv+0x37/0x40 [fglrx]

Nov 26 16:19:12 gigas kernel:  [<ffffffffa0552356>] _ZN4Asic9WaitUntil15WaitForCompleteEv+0xc6/0x130 [fglrx]

Nov 26 16:19:12 gigas kernel:  [<ffffffffa054ec95>] _ZN4Asic19PM4ElapsedTimeStampEj14_LARGE_INTEGER12_QS_CP_RING_+0xd5/0x160 [fglrx]

Nov 26 16:19:12 gigas kernel:  [<ffffffffa05575d9>] _ZN15ExecutableUnits35flush_all_and_invalidate_HDP_cachesE12_QS_CP_RING_+0xc9/0xf

Nov 26 16:19:12 gigas kernel:  [<ffffffffa05574be>] _ZN15ExecutableUnits8ringIdleE12_QS_CP_RING_+0x5e/0xb0 [fglrx]

Nov 26 16:19:12 gigas kernel:  [<ffffffffa052cacd>] _Z17uQSPm4SynchronizemP18_QS_SYNC_PACKET_IN+0x4d/0x50 [fglrx]

Nov 26 16:19:12 gigas kernel:  [<ffffffffa0527492>] _Z8uCWDDEQCmjjPvjS_+0x652/0x12c0 [fglrx]

Nov 26 16:19:12 gigas kernel:  [<ffffffffa051d5ea>] CMMQS_uCWDDEQC+0xa/0x10 [fglrx]

Nov 26 16:19:12 gigas kernel:  [<ffffffffa04b225f>] firegl_cmmqs_CWDDE_32+0x36f/0x480 [fglrx]

Nov 26 16:19:12 gigas kernel:  [<ffffffffa04b0ace>] firegl_cmmqs_CWDDE32+0x8e/0x140 [fglrx]

Nov 26 16:19:12 gigas kernel:  [<ffffffffa047e8d4>] firegl_ioctl+0x1f4/0x260 [fglrx]

Nov 26 16:19:12 gigas kernel:  [<ffffffffa046c1ae>] ip_firegl_unlocked_ioctl+0xe/0x20 [fglrx]

Nov 26 16:19:12 gigas kernel:  [<ffffffff811f0f4f>] do_vfs_ioctl+0x2ff/0x510

Nov 26 16:19:12 gigas kernel:  [<ffffffff811f11e1>] SyS_ioctl+0x81/0xa0

Nov 26 16:19:12 gigas kernel:  [<ffffffff8165f032>] system_call_fastpath+0x16/0x75

Nov 26 16:19:12 gigas kernel:  [<00007f302d171be7>] 0x7f302d171be7

Nov 26 16:19:12 gigas kernel: pubdev:0xffffffffa123d440, num of device:1 , name:fglrx, major 15, minor 30.

Nov 26 16:19:12 gigas kernel: device 0 : 0xffff880036c6c000 .

Nov 26 16:19:12 gigas kernel: Asic ID:0x6819, revision:0x15, MMIOReg:0xffffc90003080000.

Nov 26 16:19:12 gigas kernel: FB phys addr: 0xe0000000, MC :0xf400000000, Total FB size :0x40000000.

Nov 26 16:19:12 gigas kernel: gart table MC:0xf40f7b8000, Physical:0xef7b8000, size:0x547000.

Nov 26 16:19:12 gigas kernel: mc_node :FB, total 1 zones

Nov 26 16:19:12 gigas kernel:     MC start:0xf400000000, Physical:0xe0000000, size:0xfd00000.

Nov 26 16:19:12 gigas kernel:     Mapped heap -- Offset:0x0, size:0xf7b4000, reference count:40, mapping count:0,

Nov 26 16:19:12 gigas kernel:     Mapped heap -- Offset:0x0, size:0x1000000, reference count:1, mapping count:0,

Nov 26 16:19:12 gigas kernel:     Mapped heap -- Offset:0xf7b4000, size:0x4000, reference count:1, mapping count:0,

Nov 26 16:19:12 gigas kernel:     Mapped heap -- Offset:0xf7b8000, size:0x548000, reference count:1, mapping count:0,

Nov 26 16:19:12 gigas kernel: mc_node :INV_FB, total 1 zones

Nov 26 16:19:12 gigas kernel:     MC start:0xf40fd00000, Physical:0xefd00000, size:0x30300000.

Nov 26 16:19:12 gigas kernel:     Mapped heap -- Offset:0x302ee000, size:0x12000, reference count:1, mapping count:0,

Nov 26 16:19:12 gigas kernel: mc_node :GART_USWC, total 4 zones

Nov 26 16:19:12 gigas kernel:     MC start:0xff80900000, Physical:0x0, size:0x78000000.

Nov 26 16:19:12 gigas kernel:     Mapped heap -- Offset:0x5000000, size:0x1800000, reference count:2, mapping count:0,

Nov 26 16:19:12 gigas kernel:     Mapped heap -- Offset:0x3800000, size:0x1800000, reference count:2, mapping count:0,

Nov 26 16:19:12 gigas kernel:     Mapped heap -- Offset:0x2000000, size:0x1800000, reference count:17, mapping count:0,

Nov 26 16:19:12 gigas kernel:     Mapped heap -- Offset:0x0, size:0x2000000, reference count:27, mapping count:0,

Nov 26 16:19:12 gigas kernel: mc_node :GART_CACHEABLE, total 4 zones

Nov 26 16:19:12 gigas kernel:     MC start:0xff50400000, Physical:0x0, size:0x30500000.

Nov 26 16:19:12 gigas kernel:     Mapped heap -- Offset:0x6100000, size:0x600000, reference count:2, mapping count:0,

Nov 26 16:19:12 gigas kernel:     Mapped heap -- Offset:0x5b00000, size:0x600000, reference count:2, mapping count:0,

Nov 26 16:19:12 gigas kernel:     Mapped heap -- Offset:0x3200000, size:0x400000, reference count:2, mapping count:0,

Nov 26 16:19:12 gigas kernel:     Mapped heap -- Offset:0x4d00000, size:0x600000, reference count:2, mapping count:0,

Nov 26 16:19:12 gigas kernel:     Mapped heap -- Offset:0x4700000, size:0x600000, reference count:2, mapping count:0,

Nov 26 16:19:12 gigas kernel:     Mapped heap -- Offset:0x3e00000, size:0x900000, reference count:2, mapping count:0,

Nov 26 16:19:12 gigas kernel:     Mapped heap -- Offset:0x3800000, size:0x600000, reference count:2, mapping count:0,

Nov 26 16:19:12 gigas kernel:     Mapped heap -- Offset:0x2300000, size:0x900000, reference count:2, mapping count:0,

Nov 26 16:19:12 gigas kernel:     Mapped heap -- Offset:0x1d00000, size:0x600000, reference count:2, mapping count:0,

Nov 26 16:19:12 gigas kernel:     Mapped heap -- Offset:0x1400000, size:0x900000, reference count:2, mapping count:0,

Nov 26 16:19:12 gigas kernel:     Mapped heap -- Offset:0xb00000, size:0x900000, reference count:11, mapping count:0,

Nov 26 16:19:12 gigas kernel:     Mapped heap -- Offset:0x200000, size:0x900000, reference count:4, mapping count:0,

Nov 26 16:19:12 gigas kernel:     Mapped heap -- Offset:0x0, size:0x200000, reference count:41, mapping count:0,

Nov 26 16:19:12 gigas kernel:     Mapped heap -- Offset:0xef000, size:0x11000, reference count:1, mapping count:0,

Nov 26 16:19:12 gigas kernel: mc_node :PEER_FB_GART, total 1 zones

Nov 26 16:19:12 gigas kernel:     MC start:0xfff8900000, Physical:0x0, size:0x1000.

Nov 26 16:19:12 gigas kernel: GRBM : 0xa0003028, SRBM : 0x20004ec0 .

Nov 26 16:19:12 gigas kernel: CP_RB_BASE : 0xff809000, CP_RB_RPTR : 0x1a6e0 , CP_RB_WPTR :0x1a780.

Nov 26 16:19:12 gigas kernel: CP_IB1_BUFSZ:0xe0, CP_IB1_BASE_HI:0xff, CP_IB1_BASE_LO:0x80d9d000.

Nov 26 16:19:12 gigas kernel: last submit IB buffer -- MC :0xff80d9d000,phys:0x4058bc000.

Nov 26 16:19:12 gigas kernel: Dump the trace queue.

Nov 26 16:19:12 gigas kernel: End of dump


 

This problem has occurred many times while I was crunching a BOINC project that uses the GPU (an OpenCL app). Can anybody help solve this severe problem?
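For anyone who wants to check how often the hang recurs, here is a minimal Python sketch that scans journal lines for the "[fglrx] ASIC hang happened" marker seen in the excerpt above. The function names `find_asic_hangs` and `kernel_journal_lines` are made up for illustration, and the pattern assumes the default journalctl short timestamp format shown in the log; a different output format would need a different pattern.

```python
import re
import subprocess

# Matches the hang marker as it appears in the journalctl excerpt above, e.g.
# "Nov 26 16:19:12 gigas kernel: <6>[fglrx] ASIC hang happened".
# Assumption: journalctl's default "short" output format.
HANG_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+(?P<host>\S+)\s+kernel:"
    r".*\[fglrx\] ASIC hang happened"
)


def find_asic_hangs(lines):
    """Return a (timestamp, host) pair for every line reporting an ASIC hang."""
    hangs = []
    for line in lines:
        match = HANG_RE.match(line)
        if match:
            hangs.append((match.group("stamp"), match.group("host")))
    return hangs


def kernel_journal_lines():
    """Read kernel messages of the current boot via journalctl (hypothetical helper)."""
    result = subprocess.run(
        ["journalctl", "-k", "--no-pager"],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout.splitlines()
```

Calling `find_asic_hangs(kernel_journal_lines())` lists the hangs recorded since the last boot. Since `-k` only covers the current boot, a hang that forced a reboot would need the previous boot's log instead (for example by adding `-b -1` to the journalctl arguments).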
