[lttng-dev] Deadlock in liblttng-ust
Chang, Zheng
Zheng.Chang at emc.com
Thu Sep 13 03:27:11 EDT 2012
Hi,
I built a trace.so as a wrapper around lttng-ust; it predefines some
events and APIs based on lttng-ust.
A demo application is linked against this shared library.
Sometimes the demo hangs at launch time. I tested with both my demo and
easy-ust from lttng-ust, on both IA32 and X64, and got the same result.
The lttng-ust version is 2.02.
I collected some debugging info with gdb:
Parent process:
(gdb) thread 3 (constructor of liblttng-ust.so)
[Switching to thread 3 (Thread 0x7f91d5487950 (LWP 21901))]#0
0x00007f91d82ad400 in wait () from /lib64/libpthread.so.0
(gdb) bt
#0 0x00007f91d82ad400 in wait () from /lib64/libpthread.so.0
#1 0x00007f91d64a8c4b in wait_for_sessiond (sock_info=0x7f91d66cc640)
at lttng-ust-comm.c:481
#2 0x00007f91d64a9545 in ust_listener_thread (arg=<value optimized
out>) at lttng-ust-comm.c:669
#3 0x00007f91d82a5650 in start_thread () from /lib64/libpthread.so.0
#4 0x00007f91d7b0315d in clone () from /lib64/libc.so.6
(gdb) thread 5 (constructor of demo)
[Switching to thread 5 (Thread 0x7f91d690e950 (LWP 21899))]#0
0x00007f91d82ac344 in __lll_lock_wait () from /lib64/libpthread.so.0
(gdb) bt
#0 0x00007f91d82ac344 in __lll_lock_wait () from /lib64/libpthread.so.0
#1 0x00007f91d82a72d0 in _L_lock_102 () from /lib64/libpthread.so.0
#2 0x00007f91d82a6bbe in pthread_mutex_lock () from
/lib64/libpthread.so.0
#3 0x00007f91d64abf8b in ltt_probe_register (desc=0x7f91d66d0ce0) at
ltt-probes.c:77
#4 0x00007f91d66e0cfb in __lttng_events_init__sample_component () at
/usr/include/lttng/ust-tracepoint-event.h:550
#5 0x00007f91d66e71b6 in __do_global_ctors_aux () from mytrace.so
#6 0x00007f91d66e03c3 in _init () from mytrace.so
#7 0x00007f91d66df3d4 in ?? () from mytrace.so
#8 0x00007f91d94f78d8 in ?? () from /lib64/ld-linux-x86-64.so.2
#9 0x00007f91d94f7a07 in ?? () from /lib64/ld-linux-x86-64.so.2
#10 0x00007f91d94fbbde in ?? () from /lib64/ld-linux-x86-64.so.2
#11 0x00007f91d94f7566 in ?? () from /lib64/ld-linux-x86-64.so.2
#12 0x00007f91d94fb38b in ?? () from /lib64/ld-linux-x86-64.so.2
#13 0x00007f91d84bbf9b in ?? () from /lib64/libdl.so.2
#14 0x00007f91d94f7566 in ?? () from /lib64/ld-linux-x86-64.so.2
#15 0x00007f91d84bc34c in ?? () from /lib64/libdl.so.2
#16 0x00007f91d84bbf01 in dlopen () from /lib64/libdl.so.2
..............................................
#24 0x00007f91d82a5650 in start_thread () from /lib64/libpthread.so.0
#25 0x00007f91d7b0315d in clone () from /lib64/libc.so.6
(gdb) p sessions_mutex
$4 = {__data = {__lock = 2, __count = 0, __owner = 21901, __nusers = 1,
__kind = 0, __spins = 0, __list = {__prev = 0x0, __next = 0x0}},
__size = "\002\000\000\000\000\000\000\000\215U\000\000\001", '\0'
<repeats 26 times>, __align = 2}
Child process:
(gdb) info thread (forked from 21901)
* 1 Thread 0x7f91d5487950 (LWP 21902) 0x00007f91d82ac344 in
__lll_lock_wait () from /lib64/libpthread.so.0
(gdb) bt
#0 0x00007f91d82ac344 in __lll_lock_wait () from /lib64/libpthread.so.0
#1 0x00007f91d82a72d0 in _L_lock_102 () from /lib64/libpthread.so.0
#2 0x00007f91d82a6bbe in pthread_mutex_lock () from
/lib64/libpthread.so.0
#3 0x00007f91d64abe58 in ltt_probe_unregister (desc=0x7f91d66d0ce0) at
ltt-probes.c:129
#4 0x00007f91d66e0d10 in __lttng_events_exit__sample_component () at
/usr/include/lttng/ust-tracepoint-event.h:557
#5 0x00007f91d66e07cf in __do_global_dtors_aux () from mytrace.so
#6 0x00007f91d66cd664 in global_apps () from /usr/lib/liblttng-ust.so.0
#7 0x00007f91d5479df0 in ?? ()
#8 0x00007f91d66e71dd in _real_fini () from mytrace.so
#9 0x00007f91d66e71d2 in _fini () from mytrace.so
#10 0x00007f91d94f7f54 in ?? () from /lib64/ld-linux-x86-64.so.2
#11 0x00007f91d7a652ed in exit () from /lib64/libc.so.6
#12 0x00007f91d64a9207 in wait_for_sessiond (sock_info=0x7f91d66cc640)
at lttng-ust-comm.c:542
#13 0x00007f91d64a9545 in ust_listener_thread (arg=<value optimized
out>) at lttng-ust-comm.c:669
#14 0x00007f91d82a5650 in start_thread () from /lib64/libpthread.so.0
#15 0x00007f91d7b0315d in clone () from /lib64/libc.so.6
#16 0x0000000000000000 in ?? ()
(gdb) p sessions_mutex
$1 = {__data = {__lock = 2, __count = 0, __owner = 21901, __nusers = 1,
__kind = 0, __spins = 0, __list = {__prev = 0x0, __next = 0x0}},
__size = "\002\000\000\000\000\000\000\000\215U\000\000\001", '\0'
<repeats 26 times>, __align = 2}
What seems to have happened is this:
lttng_ust_init
  |
  |-> ust_listener_thread
        |
        |-> wait_for_sessiond
              |
              |-> ust_lock                                   (1)
              |-> get_map_shm
              |     |
              |     |-> get_wait_shm
              |           |
              |           |-> fork
              |                 |
              |                 parent -> wait               (2)
              |                 |
              |                 child -> exit -> _fini -> __do_global_dtors_aux -> ...... ->
              |                          ltt_probe_unregister -> ust_lock (3)
              |-> ust_unlock                                 (4)
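The deadlock involves points (1), (2) and (3). The parent takes the lock
at (1) and then waits at (2) for the child's termination. The child
blocks at (3) on its own copy of sessions_mutex, which was already
locked when it was forked (owner 21901 in the child's gdb output) and
which nobody in the child will ever unlock.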
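To make the mechanism easier to see outside of lttng-ust, here is a minimal sketch of the same pattern. It is not lttng-ust code: demo_mutex and child_inherits_locked_mutex() are hypothetical names standing in for sessions_mutex and the lock/fork/wait sequence above. The child uses pthread_mutex_trylock() instead of pthread_mutex_lock(), so the demo reports the condition rather than hanging.

```c
#include <errno.h>
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for the sessions_mutex taken by ust_lock(). */
static pthread_mutex_t demo_mutex = PTHREAD_MUTEX_INITIALIZER;

/*
 * Lock a mutex, fork, and let the child probe its copy of the mutex.
 * Returns 1 if the child found the mutex already locked, 0 if the
 * child could take it, and -1 on error.
 */
int child_inherits_locked_mutex(void)
{
    int status;
    pid_t pid;

    if (pthread_mutex_lock(&demo_mutex) != 0)   /* like point (1) */
        return -1;

    pid = fork();
    if (pid < 0) {
        pthread_mutex_unlock(&demo_mutex);
        return -1;
    }
    if (pid == 0) {
        /*
         * Child: a blocking pthread_mutex_lock() here would be
         * point (3) and would never return. trylock only reports
         * whether the inherited copy is held.
         */
        int busy = (pthread_mutex_trylock(&demo_mutex) == EBUSY);
        _exit(busy ? 1 : 0);
    }

    /* Parent: like point (2), wait for the child with the lock held. */
    if (waitpid(pid, &status, 0) < 0) {
        pthread_mutex_unlock(&demo_mutex);
        return -1;
    }
    pthread_mutex_unlock(&demo_mutex);          /* like point (4) */

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling child_inherits_locked_mutex() from a small test program (linked with -pthread) should show that the child sees the mutex as held, even though no thread in the child can release it.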
Reproduction conditions:
- The shared memory is being created for the first time
(/dev/shm/lttng-ust-apps-wait* do not exist).
- The child process gets delayed (I'm not quite sure about this; I
used gdb to hold the child process for a while and the hang happened
then as well).
In the normal case, the child process does not call _fini when it exits,
so no deadlock happens.
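Since the hang depends on whether exit-time handlers run in the forked child, the following sketch shows the general difference between exit() and _exit() in a child process. It is only an illustration under my own assumptions, not a proposed patch: on_exit_handler(), notify_fd and exit_handler_ran_in_child() are hypothetical names, and an atexit() handler is used here as a simple stand-in for the library destructor path seen in the child backtrace.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Write end of the pipe; only set in the forked child. */
static int notify_fd = -1;

/* Stand-in for exit-time cleanup: reports that it ran by writing a byte. */
static void on_exit_handler(void)
{
    if (notify_fd >= 0) {
        char byte = 'x';
        ssize_t rc = write(notify_fd, &byte, 1);
        (void) rc;
    }
}

/*
 * Fork a child that terminates with exit() (use_plain_exit != 0) or
 * with _exit() (use_plain_exit == 0).
 * Returns 1 if the exit handler ran in the child, 0 if not, -1 on error.
 */
int exit_handler_ran_in_child(int use_plain_exit)
{
    static int registered;
    int fds[2];
    char byte;
    ssize_t n;
    pid_t pid;

    if (!registered) {
        if (atexit(on_exit_handler) != 0)
            return -1;
        registered = 1;
    }
    if (pipe(fds) < 0)
        return -1;

    pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        close(fds[0]);
        notify_fd = fds[1];
        if (use_plain_exit)
            exit(0);    /* runs atexit handlers and destructors */
        _exit(0);       /* terminates immediately, no handlers */
    }

    /* Parent: one byte means the handler ran, EOF means it did not. */
    close(fds[1]);
    n = read(fds[0], &byte, 1);
    close(fds[0]);
    waitpid(pid, NULL, 0);

    return n == 1 ? 1 : (n == 0 ? 0 : -1);
}
```

With exit() the handler registered before the fork runs in the child; with _exit() it does not. Whether terminating the child with _exit() would be appropriate for lttng-ust is a question for the maintainers.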
Is this a known issue?
Thanks
-Zheng