frankas

Deadlock? (hang) when reading from pinned memory

Discussion created by frankas on Nov 16, 2009
Latest reply on Nov 20, 2009 by gaurav.garg

I am trying to improve the performance of a working stream application by moving to pinned-memory streams. After a short while, however, the thread that handles the Brook calls hangs forever in a mutex lock, like this:

Thread 2 (Thread 0xb7ab6b90 (LWP 21941)):
#0  0xb7f4642e in __kernel_vsyscall ()
#1  0xb7f22cf9 in __lll_lock_wait () from /lib/tls/i686/cmov/libpthread.so.0
#2  0xb7f1e129 in _L_lock_89 () from /lib/tls/i686/cmov/libpthread.so.0
#3  0xb7f1da32 in pthread_mutex_lock () from /lib/tls/i686/cmov/libpthread.so.0
#4  0xb7268d2b in brook::ThreadLock::lock () from /usr/lib/libbrook.so
#5  0xb72a80c6 in CALBuffer::initializePinnedBuffer () from /usr/lib/libbrook_cal.so
#6  0xb729ac64 in CALBufferMgr::_createPinnedBuffer () from /usr/lib/libbrook_cal.so
#7  0xb729bf07 in CALBufferMgr::setBufferData () from /usr/lib/libbrook_cal.so
#8  0xb725a093 in StreamImpl::read () from /usr/lib/libbrook.so
#9  0xb7c0b20c in brook::StreamData::read () from /usr/lib/libbrook_d.so
#10 0xb7c5dce9 in brook::Stream<uint4>::read (this=0x9e43960, ptr=0x9e54900, flags=0xb7c71c99 "nocopy")
    at /usr/local/atibrook/sdk/include/brook/StreamDef.h:160
#11 0xb7c5b49c in A5Slice::tick (this=0x9b223c8) at A5Slice.cpp:366
#12 0xb7c4b5c2 in BrookA5::process (this=0x9b25870) at A5Brook.cpp:139
#13 0xb7c4b637 in BrookA5::thread_stub (arg=0x9b25870) at A5Brook.cpp:52
#14 0xb7f1c4ff in start_thread () from /lib/tls/i686/cmov/libpthread.so.0
#15 0xb7e5249e in clone () from /lib/tls/i686/cmov/libc.so.6

When this first happened I was issuing 18 async read calls. I tried serializing the read operations with isSync calls, but the result is the same. It also does not appear to be a general race condition, since the hang occurs after exactly the same number of kernel invocations every time.

Since this behaviour is highly reproducible, I managed to set a breakpoint in pthread_mutex_lock just prior to the read call that I know will fail (to see who else takes the lock). However, what I observe is a large number of buffer destructors being called, like this:

#11 0xb7303da7 in calResFree () from /usr/lib/libaticalrt.so
#12 0xb7344c01 in CALBuffer::~CALBuffer () from /usr/lib/libbrook_cal.so
#13 0xb7337c21 in CALBufferMgr::_createPinnedBuffer () from /usr/lib/libbrook_cal.so
#14 0xb7338f07 in CALBufferMgr::setBufferData () from /usr/lib/libbrook_cal.so
#15 0xb7c97093 in StreamImpl::read () from /usr/lib/libbrook.so
#16 0xb7ca820c in brook::StreamData::read () from /usr/lib/libbrook.so
#17 0xb7cfa929 in brook::Stream<uint4>::read (this=0x9349368, ptr=0x935a300, flags=0xb7d0e8d8 "nocopy")
    at /usr/local/atibrook/sdk/include/brook/StreamDef.h:160
#18 0xb7cf819e in A5Slice::tick (this=0x90283c8) at A5Slice.cpp:369

This seems to indicate that the pinned buffers accumulate in GFX memory and are only occasionally flushed. When this flushing occurs, it looks as though someone forgets to release the mutex, so the next create call hangs indefinitely.
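To make the suspected failure mode concrete, here is a minimal sketch of it in plain C++. It is not libbrook code: PinnedPool, createLeaky, createSafe, wouldBlock and firstStuckCall are hypothetical names, and the fixed buffer limit is an assumption. It only shows how a flush branch that returns without unlocking would leave the lock stuck after a fixed number of calls, and how a scoped lock avoids that.

```cpp
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for a buffer manager. This is NOT libbrook code.
class PinnedPool {
public:
    explicit PinnedPool(std::size_t limit) : limit_(limit) {}

    // Suspected pattern: the flush branch returns without unlocking.
    void createLeaky(int id) {
        mutex_.lock();
        if (buffers_.size() >= limit_) {
            buffers_.clear();        // flush the accumulated buffers
            buffers_.push_back(id);
            return;                  // BUG: mutex_ is still locked here
        }
        buffers_.push_back(id);
        mutex_.unlock();
    }

    // RAII variant: lock_guard unlocks on every exit path.
    void createSafe(int id) {
        std::lock_guard<std::mutex> guard(mutex_);
        if (buffers_.size() >= limit_) buffers_.clear();
        buffers_.push_back(id);
    }

    // Probe from a second thread whether a new caller would block.
    bool wouldBlock() {
        bool blocked = true;
        std::thread probe([&] {
            if (mutex_.try_lock()) { blocked = false; mutex_.unlock(); }
        });
        probe.join();
        return blocked;
    }

    // Demo cleanup only: drop a lock leaked by createLeaky (same thread).
    void releaseLeakedLock() { mutex_.unlock(); }

private:
    std::mutex mutex_;
    std::vector<int> buffers_;
    std::size_t limit_;
};

// Returns the 1-based index of the call that leaves the lock stuck,
// or 0 if the lock is still free after all calls.
int firstStuckCall(bool leaky, std::size_t limit, int calls) {
    PinnedPool pool(limit);
    for (int i = 1; i <= calls; ++i) {
        if (leaky) pool.createLeaky(i); else pool.createSafe(i);
        if (pool.wouldBlock()) {
            pool.releaseLeakedLock();  // avoid destroying a locked mutex
            return i;
        }
    }
    return 0;
}
```

With an assumed limit of 3, firstStuckCall(true, 3, 10) reports call 4 on every run, while firstStuckCall(false, 3, 10) never reports a stuck lock. A deterministic trigger like this would fit a hang that appears after the same number of invocations each time, but whether libbrook actually behaves this way is only a guess without the sources.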

Where can I find the libbrook sources? I tried installing 1.4.1, but it fails on Ubuntu (one of the legacy samples has a dependency on an old libpthread). The shared library, however, is the same as the one in 1.4.0 (I checked the md5 sum).

Frank

 

 
