[lttng-dev] Setenv/getenv are not thread-safe; whose bug is it?

Douglas Graham douglas.graham at ericsson.com
Fri Mar 10 22:21:59 UTC 2017


Hi,

We have an application that uses lttng-ust for logging.  We are seeing a crash in getenv here:

#0  __GI_getenv (name=0xb6eb06de "TNG_UST_WITHOUT_BADDR_STATEDUMP") at getenv.c:85
#1  0xb6e7b350 in do_baddr_statedump (owner=0xb6ecf300 <global_apps>) at lttng-ust-statedump.c:315
#2  do_lttng_ust_statedump (owner=owner@entry=0xb6ecf300 <global_apps>)  at lttng-ust-statedump.c:341
#3  0xb6e71ef4 in lttng_handle_pending_statedump (owner=owner@entry=0xb6ecf300 <global_apps>)  at lttng-events.c:856
#4  0xb6e690ac in handle_pending_statedump (sock_info=0xb6ecf300 <global_apps>) at lttng-ust-comm.c:581
#5  handle_message (lum=0xb48fe66c, sock=<optimized out>, sock_info=<optimized out>)  at lttng-ust-comm.c:966
#6  ust_listener_thread (arg=0xb6ecf300 <global_apps>)  at lttng-ust-comm.c:1490
#7  0xb6e33f6c in start_thread (arg=0xb48ff220) at pthread_create.c:339

The core shows that this thread is one of three threads in the child process just after a fork().  After the fork(), the one application thread in the child calls setenv() to set up the environment, and then execs another program.  The problem is that setenv() is not thread-safe, especially when it has to resize the environment vector.  If the application thread calls setenv() to add a new environment variable while this lttng listener thread is calling getenv(), bad things can happen: setenv() can resize the environment vector while getenv() is still searching it, which sends getenv() off into the weeds.
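One way the application side could sidestep the race is to stop calling setenv() in the child altogether and hand execve() a privately built environment vector.  The following is only a minimal sketch of that idea, not something taken from our application or from lttng-ust; env_with() and spawn_with_env() are hypothetical names made up for illustration.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: return a new NULL-terminated vector holding the
 * entries of `base` plus "name=value" (replacing any existing entry for
 * `name`).  `base` itself is never modified; old entries are shared, only
 * the vector and the new entry are allocated.
 */
char **env_with(char *const base[], const char *name, const char *value)
{
    size_t n = 0, name_len = strlen(name);

    while (base && base[n])
        n++;

    char **out = malloc((n + 2) * sizeof(*out));
    size_t len = name_len + strlen(value) + 2;   /* '=' and '\0' */
    char *entry = malloc(len);
    if (!out || !entry) {
        free(out);
        free(entry);
        return NULL;
    }
    snprintf(entry, len, "%s=%s", name, value);

    size_t j = 0;
    for (size_t i = 0; i < n; i++) {
        /* Skip an existing definition of the same variable. */
        if (strncmp(base[i], name, name_len) == 0 && base[i][name_len] == '=')
            continue;
        out[j++] = base[i];
    }
    out[j++] = entry;
    out[j] = NULL;
    return out;
}

/*
 * Hypothetical wrapper: the child only calls execve() and _exit(), so it
 * never touches the global environment that getenv() searches.
 */
pid_t spawn_with_env(const char *path, char *const argv[], char *const envp[])
{
    pid_t pid = fork();

    if (pid == 0) {
        execve(path, argv, envp);
        _exit(127);              /* only reached if execve() failed */
    }
    return pid;                  /* child pid in the parent, -1 on error */
}
```

The vector would be built in the parent before the fork(), e.g. envp = env_with(environ, "SOME_VAR", "1"), since malloc() is not async-signal-safe and POSIX only guarantees async-signal-safe calls in the child of a multi-threaded process until it execs.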

I assume that this listener thread is created because we have preloaded liblttng-ust-fork, and I see no reason that this particular process really needs to preload that library, so one workaround is probably to just remove it.  The problem is that this process inherits LD_PRELOAD from a parent process (one similar to init) that launches many other daemons, some of which might actually require liblttng-ust-fork, so removing this library from the process that is crashing is not entirely trivial.  This crash also raises the question of whether we could encounter similar crashes in other processes that use liblttng-ust.  We only see this crash after many hours of intensive testing.

Would it be safe to say that it is probably a bug for an lttng thread to call a non-thread-safe function like getenv()?  What's the best way to fix this?
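For the sake of discussion, one possible shape of a library-side fix would be to read the variable once, early, and let the listener thread consult a private copy instead of calling getenv() later.  This is only a sketch under my own assumptions, not how lttng-ust is actually written; env_cache_init() and env_cache_get() are hypothetical names, and the sketch assumes the init call runs before any other thread exists (for example from a library constructor).

```c
#include <stdlib.h>
#include <string.h>

/* Private copy of the variable's value; NULL if it was unset at init time. */
static char *cached_value;
static int cache_ready;

/*
 * Hypothetical init hook: call once while the process is still
 * single-threaded.  It snapshots the value so later lookups never walk
 * the global environment vector.
 */
void env_cache_init(const char *name)
{
    const char *v = getenv(name);

    free(cached_value);
    cached_value = NULL;
    if (v) {
        size_t len = strlen(v) + 1;
        cached_value = malloc(len);
        if (cached_value)
            memcpy(cached_value, v, len);
    }
    cache_ready = 1;
}

/*
 * Hypothetical lookup for use from other threads: returns the snapshot,
 * or NULL if the variable was unset or env_cache_init() has not run.
 */
const char *env_cache_get(void)
{
    return cache_ready ? cached_value : NULL;
}
```

With something like this, a concurrent setenv() in the application thread could no longer disturb the lookup, at the cost that changes made to the variable after initialization are not seen.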

Thanks,
Doug




