[lttng-dev] Application freeze at startup with LTTng-UST and Perl dlopen() - TRACEPOINT_PROBE_DYNAMIC_LINKAGE?

David OShea David.OShea at quantum.com
Sun Feb 24 20:17:03 EST 2013


Hi all,

I'm getting freezes at application startup with LTTng-UST which I assume are due to my lack of understanding of TRACEPOINT_PROBE_DYNAMIC_LINKAGE.

I'm running:
- lttng-tools-2.0.4
- lttng-ust-2.0.5 patched with 009745db "Cache the procname per-thread rather than per-process to take into account that prctl() can be used to set thread names." and e699eda "don't spawn per-user thread if HOME is not set".
- liburcu0-0.7.4-0

This is being run on CentOS 5.4.

The lttng-ust man page says:

"""
BUILDING/LINKING THE TRACEPOINT PROVIDER
[...]
         1.1) Compile the Tracepoint provider with the application, either
              directly or through a static library (.a):
[...]
         2) Compile the Tracepoint Provider separately from the application,
            using dynamic linking:
           - Into exactly one object of your application: define
             "TRACEPOINT_DEFINE" _and_ also define
             "TRACEPOINT_PROBE_DYNAMIC_LINKAGE", then include the tracepoint
             provider header.	
[...]
"""

I have a dynamic library that includes code with tracepoint() calls, plus the relevant tracepoint definitions and trace providers - it is sort of "self-contained".  My Perl script dlopen()s this dynamic library.  Am I correct in thinking that this falls into option 2) above, because whilst the trace provider is statically linked with the code that uses the tracepoints, the trace provider is being dynamically linked into the application, and that is the part that matters to LTTng? 
 
When I build this dynamic library without TRACEPOINT_PROBE_DYNAMIC_LINKAGE #defined in the tracepoint definitions, I get a freeze that looks like https://bugs.lttng.org/issues/142 from the strace output:

"""
open("/usr/lib64/liburcu-common.so.1", O_RDONLY) = 4
read(4, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0p\7`\0310\0\0\0"..., 832) = 832
fstat(4, {st_mode=S_IFREG|0755, st_size=9464, ...}) = 0
mmap(0x3019600000, 2102552, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 4, 0) = 0x3019600000
mprotect(0x3019602000, 2093056, PROT_NONE) = 0
mmap(0x3019801000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 4, 0x1000) = 0x3019801000
close(4)                                = 0
mprotect(0x3fd3607000, 4096, PROT_READ) = 0
munmap(0x2ae2a2766000, 49028)           = 0
open("/dev/urandom", O_RDONLY)          = 4
fcntl(4, F_GETFD)                       = 0
fcntl(4, F_SETFD, FD_CLOEXEC)           = 0
getuid()                                = 0
getppid()                               = 8809
read(4, "\266\310we\347u\31\336\33\266 \262C\3267\32", 16) = 16
gettid()                                = 8810
clock_gettime(CLOCK_REALTIME, {1361753274, 392351000}) = 0
getuid()                                = 0
geteuid()                               = 0
rt_sigprocmask(SIG_SETMASK, ~[RTMIN RT_1], [], 8) = 0
mmap(NULL, 10489856, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_32BIT, -1, 0) = 0x4023d000
mprotect(0x4023d000, 4096, PROT_NONE)   = 0
clone(child_stack=0x40c3d250, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tidptr=0x40c3d9d0, tls=0x40c3d940, child_tidptr=0x40c3d9d0) = 8811
mmap(NULL, 10489856, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_32BIT, -1, 0) = 0x41e43000
mprotect(0x41e43000, 4096, PROT_NONE)   = 0
clone(child_stack=0x42843250, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tidptr=0x428439d0, tls=0x42843940, child_tidptr=0x428439d0) = 8812
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
futex(0x2ae2a29a5a80, FUTEX_WAIT_PRIVATE, 0, {2, 999506000}) = -1 ETIMEDOUT (Connection timed out)
gettid()                                = 8810
write(2, "libust[8810/8810]: Error: Timed "..., 107libust[8810/8810]: Error: Timed out waiting for ltt-sessiond (in lttng_ust_init() at lttng-ust-comm.c:952)
) = 107
futex(0x2ae2a29a5b00, FUTEX_WAIT_PRIVATE, 2, NULL
"""
(freeze at this point)

I realise that bug 142 is fixed in the version of the code I'm running.

I also found that, even though the Perl script does not fork() (unless I'm very much mistaken), if I 'export LD_PRELOAD=/usr/lib64/liblttng-ust-fork.so.0' and then run a shell script that invokes 'perl ...' to start the script, the problem goes away.  Is this because, when the shell fork()s to start Perl, liblttng-ust-fork.so.0 will cause LTTng to be initialised in the new process much earlier than is the case with just the dlopen()? 

I played around with another script and dynamic library combination and found a similar problem occurred on CentOS 5.4, but when I used the same LTTng-UST and related libraries on Fedora 16, I didn't get the freeze.

Finally, I noticed that commit ID b834deadbfa8a78ae1d00440fd91c41dfd351eba changed README to remove a note suggesting that dlopen() not be used.  It looks like the same change should be made to doc/man/lttng-ust.3, which still suggests that you not use dlopen().

Thanks in advance,
David

----------------------------------------------------------------------
The information contained in this transmission may be confidential. Any disclosure, copying, or further distribution of confidential information is not permitted unless such privilege is explicitly granted in writing by Quantum. Quantum reserves the right to have electronic communications, including email and attachments, sent across its networks filtered through anti virus and spam software programs and retain such messages in order to comply with applicable data security and retention requirements. Quantum is not responsible for the proper and complete transmission of the substance of this communication or for any delay in its receipt.



More information about the lttng-dev mailing list