[lttng-dev] Deadlock in liblttng-ust

Chang, Zheng Zheng.Chang at emc.com
Thu Sep 13 03:27:11 EDT 2012


Hi,

 

I built a trace.so as a wrapper around lttng-ust; it predefines some
events and APIs based on lttng-ust.

A demo application is linked against this shared library.

Sometimes the demo hangs at launch time. I tested both the demo and
easy-ust from lttng-ust on IA32 and x64 and got the same result.

The lttng-ust version is 2.02.

I collected some debugging info with gdb:

 

Parent process:

 

(gdb) thread 3  (constructor of liblttng-ust.so)

[Switching to thread 3 (Thread 0x7f91d5487950 (LWP 21901))]#0
0x00007f91d82ad400 in wait () from /lib64/libpthread.so.0

(gdb) bt

#0  0x00007f91d82ad400 in wait () from /lib64/libpthread.so.0

#1  0x00007f91d64a8c4b in wait_for_sessiond (sock_info=0x7f91d66cc640)
at lttng-ust-comm.c:481

#2  0x00007f91d64a9545 in ust_listener_thread (arg=<value optimized
out>) at lttng-ust-comm.c:669

#3  0x00007f91d82a5650 in start_thread () from /lib64/libpthread.so.0

#4  0x00007f91d7b0315d in clone () from /lib64/libc.so.6

 

(gdb) thread 5    (constructor of demo)

[Switching to thread 5 (Thread 0x7f91d690e950 (LWP 21899))]#0
0x00007f91d82ac344 in __lll_lock_wait () from /lib64/libpthread.so.0

(gdb) bt

#0  0x00007f91d82ac344 in __lll_lock_wait () from /lib64/libpthread.so.0

#1  0x00007f91d82a72d0 in _L_lock_102 () from /lib64/libpthread.so.0

#2  0x00007f91d82a6bbe in pthread_mutex_lock () from
/lib64/libpthread.so.0

#3  0x00007f91d64abf8b in ltt_probe_register (desc=0x7f91d66d0ce0) at
ltt-probes.c:77

#4  0x00007f91d66e0cfb in __lttng_events_init__sample_component () at
/usr/include/lttng/ust-tracepoint-event.h:550

#5  0x00007f91d66e71b6 in __do_global_ctors_aux () from mytrace.so

#6  0x00007f91d66e03c3 in _init () from mytrace.so

#7  0x00007f91d66df3d4 in ?? () from mytrace.so

#8  0x00007f91d94f78d8 in ?? () from /lib64/ld-linux-x86-64.so.2

#9  0x00007f91d94f7a07 in ?? () from /lib64/ld-linux-x86-64.so.2

#10 0x00007f91d94fbbde in ?? () from /lib64/ld-linux-x86-64.so.2

#11 0x00007f91d94f7566 in ?? () from /lib64/ld-linux-x86-64.so.2

#12 0x00007f91d94fb38b in ?? () from /lib64/ld-linux-x86-64.so.2

#13 0x00007f91d84bbf9b in ?? () from /lib64/libdl.so.2

#14 0x00007f91d94f7566 in ?? () from /lib64/ld-linux-x86-64.so.2

#15 0x00007f91d84bc34c in ?? () from /lib64/libdl.so.2

#16 0x00007f91d84bbf01 in dlopen () from /lib64/libdl.so.2

..............................................

#24 0x00007f91d82a5650 in start_thread () from /lib64/libpthread.so.0

#25 0x00007f91d7b0315d in clone () from /lib64/libc.so.6

 

(gdb) p sessions_mutex

$4 = {__data = {__lock = 2, __count = 0, __owner = 21901, __nusers = 1,
__kind = 0, __spins = 0, __list = {__prev = 0x0, __next = 0x0}}, 

  __size = "\002\000\000\000\000\000\000\000\215U\000\000\001", '\0'
<repeats 26 times>, __align = 2}

 

Child process:

 

(gdb) info thread    (forked from 21901)

* 1 Thread 0x7f91d5487950 (LWP 21902)  0x00007f91d82ac344 in
__lll_lock_wait () from /lib64/libpthread.so.0

(gdb) bt

#0  0x00007f91d82ac344 in __lll_lock_wait () from /lib64/libpthread.so.0

#1  0x00007f91d82a72d0 in _L_lock_102 () from /lib64/libpthread.so.0

#2  0x00007f91d82a6bbe in pthread_mutex_lock () from
/lib64/libpthread.so.0

#3  0x00007f91d64abe58 in ltt_probe_unregister (desc=0x7f91d66d0ce0) at
ltt-probes.c:129

#4  0x00007f91d66e0d10 in __lttng_events_exit__sample_component () at
/usr/include/lttng/ust-tracepoint-event.h:557

#5  0x00007f91d66e07cf in __do_global_dtors_aux () from mytrace.so

#6  0x00007f91d66cd664 in global_apps () from /usr/lib/liblttng-ust.so.0

#7  0x00007f91d5479df0 in ?? ()

#8  0x00007f91d66e71dd in _real_fini () from mytrace.so

#9  0x00007f91d66e71d2 in _fini () from mytrace.so

#10 0x00007f91d94f7f54 in ?? () from /lib64/ld-linux-x86-64.so.2

#11 0x00007f91d7a652ed in exit () from /lib64/libc.so.6

#12 0x00007f91d64a9207 in wait_for_sessiond (sock_info=0x7f91d66cc640)
at lttng-ust-comm.c:542

#13 0x00007f91d64a9545 in ust_listener_thread (arg=<value optimized
out>) at lttng-ust-comm.c:669

#14 0x00007f91d82a5650 in start_thread () from /lib64/libpthread.so.0

#15 0x00007f91d7b0315d in clone () from /lib64/libc.so.6

#16 0x0000000000000000 in ?? ()

(gdb) p sessions_mutex

$1 = {__data = {__lock = 2, __count = 0, __owner = 21901, __nusers = 1,
__kind = 0, __spins = 0, __list = {__prev = 0x0, __next = 0x0}}, 

  __size = "\002\000\000\000\000\000\000\000\215U\000\000\001", '\0'
<repeats 26 times>, __align = 2}

 

What seems to have happened is this:

 

lttng_ust_init
  |
  |-> ust_listener_thread
        |
        |-> wait_for_sessiond
              |
              |-> ust_lock                                  (1)
              |-> get_map_shm
              |     |
              |     |-> get_wait_shm
              |     |
              |     |-> fork
              |           parent -> wait                    (2)
              |           child  -> exit -> _fini
              |                     -> __do_global_dtors_aux -> ......
              |                     -> ltt_probe_unregister -> ust_lock  (3)
              |-> ust_unlock                                (4)

 

The deadlock happened between points (1) and (3). The parent, still
holding the lock taken at (1), waited at (2) for the child's
termination, and the child waited at (3) for that lock to be released.
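
To illustrate the lock state the child inherits, here is a minimal sketch
that is independent of lttng-ust. The names demo_mutex and
child_sees_lock_held are hypothetical; demo_mutex only stands in for the
lock taken by ust_lock. The child uses pthread_mutex_trylock instead of
pthread_mutex_lock so that the sketch reports the problem rather than
hanging on it.

```c
#include <errno.h>
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for the lock taken by ust_lock(). */
static pthread_mutex_t demo_mutex = PTHREAD_MUTEX_INITIALIZER;

/*
 * Fork while holding demo_mutex and report what the child observes.
 * Returns 1 if the child found the mutex busy, 0 if the child could
 * take it, and -1 on error.
 */
int child_sees_lock_held(void)
{
    int status;
    pid_t pid;

    pthread_mutex_lock(&demo_mutex);            /* like point (1) */

    pid = fork();
    if (pid < 0) {
        pthread_mutex_unlock(&demo_mutex);
        return -1;
    }
    if (pid == 0) {
        /*
         * Child: its copy of the mutex is still marked as locked.
         * trylock is used so that this sketch cannot block forever;
         * a plain pthread_mutex_lock() here is what point (3) does.
         */
        int rc = pthread_mutex_trylock(&demo_mutex);
        _exit(rc == EBUSY ? 1 : 0);
    }

    /* Parent: wait for the child while still holding the lock, like (2). */
    if (waitpid(pid, &status, 0) < 0) {
        pthread_mutex_unlock(&demo_mutex);
        return -1;
    }
    pthread_mutex_unlock(&demo_mutex);          /* like point (4) */

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling child_sees_lock_held() from a small main() and printing the
result should show whether the child found the mutex busy. If the child
called pthread_mutex_lock() instead, it would block and the parent's
waitpid() would never return, which matches the two backtraces above.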

 

Reproduction conditions:

- The shared memory is created for the first time
(/dev/shm/lttng-ust-apps-wait* does not exist)

- The child process is delayed (I'm not quite sure about this; I
used gdb to hold the child process for a while and the hang happened
then as well)

 

In the normal case, the child process does not call _fini when it exits,
so no deadlock happens.
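
Whether exit-time code runs in a forked child depends on how the child
terminates: exit() runs functions registered with atexit() as well as
library destructors, while _exit() skips them. The sketch below shows
that difference with a plain atexit() handler. All names in it
(report_fd, demo_exit_handler, handler_runs_in_child) are hypothetical,
and it is not a claim about what lttng-ust itself does.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Write end of a pipe; only set inside the child. */
static int report_fd = -1;

/* Hypothetical exit handler standing in for a library destructor. */
static void demo_exit_handler(void)
{
    if (report_fd >= 0) {
        ssize_t n = write(report_fd, "x", 1);
        (void)n;
    }
}

/*
 * Fork a child that terminates with exit() (use_exit != 0) or _exit().
 * Returns 1 if the handler ran in the child, 0 if it did not,
 * and -1 on error.
 */
int handler_runs_in_child(int use_exit)
{
    int fds[2];
    char byte;
    ssize_t n;
    pid_t pid;

    if (pipe(fds) < 0)
        return -1;

    pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        close(fds[0]);
        report_fd = fds[1];
        atexit(demo_exit_handler);   /* registered in the child only */
        if (use_exit)
            exit(0);                 /* runs exit handlers */
        _exit(0);                    /* skips exit handlers */
    }

    close(fds[1]);
    n = read(fds[0], &byte, 1);      /* 0 bytes (EOF): handler never wrote */
    close(fds[0]);
    if (waitpid(pid, NULL, 0) < 0)
        return -1;

    return n == 1 ? 1 : (n == 0 ? 0 : -1);
}
```

Calling handler_runs_in_child(1) and handler_runs_in_child(0) compares
the two termination paths. A handler that takes a lock the child
inherited in the locked state can only block on the exit() path.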

 

Is this a known issue?

 

Thanks

-Zheng

 
