[lttng-dev] New TLS usage in libgcc_s.so.1, compatibility impact
Mathieu Desnoyers
mathieu.desnoyers at efficios.com
Mon Jan 15 14:05:32 EST 2024
On 2024-01-13 07:49, Florian Weimer via lttng-dev wrote:
> This commit
>
> commit 8abddb187b33480d8827f44ec655f45734a1749d
> Author: Andrew Burgess <andrew.burgess at embecosm.com>
> Date: Sat Aug 5 14:31:06 2023 +0200
>
> libgcc: support heap-based trampolines
>
> Add support for heap-based trampolines on x86_64-linux, aarch64-linux,
> and x86_64-darwin. Implement the __builtin_nested_func_ptr_created and
> __builtin_nested_func_ptr_deleted functions for these targets.
>
> Co-Authored-By: Maxim Blinov <maxim.blinov at embecosm.com>
> Co-Authored-By: Iain Sandoe <iain at sandoe.co.uk>
> Co-Authored-By: Francois-Xavier Coudert <fxcoudert at gcc.gnu.org>
>
> added TLS usage to libgcc_s.so.1. The way that libgcc_s is currently
> built, it ends up using a dynamic TLS variant on the Linux targets.
> This means that there is no up-front TLS allocation with glibc (but
> there would be one with musl).
Trying to wrap my head around this:
If I get this right, the previous behavior was that glibc allocated
global-dynamic variables from libraries which are preloaded or loaded
at program startup as if they were initial-exec, but now that libgcc_s.so.1
has a dynamic TLS variable, the libraries loaded at startup that
have global-dynamic TLS no longer get that initial-allocation special
treatment. Is that more or less correct?
(Note: it's entirely possible that my understanding is wrong; please
correct me if that's the case.)
>
> There is still a compatibility impact because glibc assigns a TLS module
> ID upfront. This seems to be what causes the
> ust/libc-wrapper/test_libc-wrapper test in lttng-tools to fail. We end
> up with an infinite regress during process termination because
> libgcc_s.so.1 has been loaded, resulting in a DTV update. When this
> happens, the bottom of the stack looks like this:
>
> #4447 0x00007ffff7f288f0 in free () from /lib64/liblttng-ust-libc-wrapper.so.1
> #4448 0x00007ffff7fdb142 in free (ptr=<optimized out>)
> at ../include/rtld-malloc.h:50
> #4449 _dl_update_slotinfo (req_modid=3, new_gen=2) at ../elf/dl-tls.c:822
> #4450 0x00007ffff7fdb214 in update_get_addr (ti=0x7ffff7f2bfc0,
> gen=<optimized out>) at ../elf/dl-tls.c:916
> #4451 0x00007ffff7fddccc in __tls_get_addr ()
> at ../sysdeps/x86_64/tls_get_addr.S:55
> #4452 0x00007ffff7f288f0 in free () from /lib64/liblttng-ust-libc-wrapper.so.1
> #4453 0x00007ffff7fdb142 in free (ptr=<optimized out>)
> at ../include/rtld-malloc.h:50
> #4454 _dl_update_slotinfo (req_modid=2, new_gen=2) at ../elf/dl-tls.c:822
> #4455 0x00007ffff7fdb214 in update_get_addr (ti=0x7ffff7f39fa0,
> gen=<optimized out>) at ../elf/dl-tls.c:916
> #4456 0x00007ffff7fddccc in __tls_get_addr ()
> at ../sysdeps/x86_64/tls_get_addr.S:55
> #4457 0x00007ffff7f36113 in lttng_ust_cancelstate_disable_push ()
> from /lib64/liblttng-ust-common.so.1
> #4458 0x00007ffff7f4c2e8 in ust_lock_nocheck () from /lib64/liblttng-ust.so.1
> #4459 0x00007ffff7f5175a in lttng_ust_cleanup () from /lib64/liblttng-ust.so.1
> #4460 0x00007ffff7fca0f2 in _dl_call_fini (
> closure_map=closure_map at entry=0x7ffff7fbe000) at dl-call_fini.c:43
> #4461 0x00007ffff7fce06e in _dl_fini () at dl-fini.c:114
> #4462 0x00007ffff7d82fe6 in __run_exit_handlers () from /lib64/libc.so.6
>
> Cc:ing <lttng-dev at lists.lttng.org> for awareness.
I've prepared a change for lttng-ust to move the lttng-ust libc wrapper
"malloc nesting" guard variable from global-dynamic to initial-exec:
https://review.lttng.org/c/lttng-ust/+/11677 Fix: libc wrapper: use initial-exec for malloc_nesting TLS
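For illustration, here is a minimal sketch of what such a guard looks like
with the initial-exec model. The names nesting_guard, wrapped_free and
traced_calls are made up for this example and are not the actual lttng-ust
code; it assumes GCC or Clang on Linux:

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Hypothetical nesting guard (not the actual lttng-ust declaration).
 * With the initial-exec model the variable lives at a fixed offset from
 * the thread pointer, so accessing it does not go through __tls_get_addr().
 */
static __thread int nesting_guard __attribute__((tls_model("initial-exec")));

static int traced_calls;	/* stand-in for emitting a tracepoint event */

/* Hypothetical wrapper: only the outermost call on a thread is traced. */
static void wrapped_free(void *ptr)
{
	nesting_guard++;
	free(ptr);
	if (nesting_guard == 1)
		traced_calls++;
	nesting_guard--;
	assert(nesting_guard >= 0);
}
```

Because the guard is read without any call into the dynamic loader, a free()
issued by the loader during a DTV update cannot re-enter the update through
this variable.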
This should help with the infinite recursion issue, but if my understanding
is correct that this libgcc change effectively alters how global-dynamic
variables are handled in preloaded and startup-loaded libraries, I suspect
we have other new issues here, such as problems with the async-signal
safety of other global-dynamic variables within LTTng-UST.
But moving all TLS variables used by lttng-ust from global-dynamic to
initial-exec is tricky, because a prior attempt to do so introduced
regressions in use cases where lttng-ust was dlopen'd by Java or Python.
AFAIU, in those situations the runtimes were already using most of the
extra memory pool reserved for the initial-exec variables of dlopen'd
libraries, causing the dlopen of lttng-ust to fail.
Thanks Florian for letting us know about this,
Mathieu
>
> The issue also requires a recent glibc with changes to DTV management:
> commit d2123d68275acc0f061e73d5f86ca504e0d5a344 ("elf: Fix slow tls
> access after dlopen [BZ #19924]"). If I understand things correctly,
> before this glibc change, we didn't deallocate the old DTV, so there was
> no call to the free function.
>
> On the glibc side, we should recommend that intercepting mallocs and their
> dependencies use initial-exec TLS because that kind of TLS does not use
> malloc. If intercepting mallocs using dynamic TLS work at all, that's
> totally by accident, and was in the past helped by glibc bug 19924. (I
> don't think there is anything special about libgcc_s.so.1 that triggers
> the test failure above, it is just an object with dynamic TLS that is
> implicitly loaded via dlopen at the right stage of the test.) In this
> particular case, we can also paper over the test failure in glibc by not
> calling free at all because the argument is a null pointer:
>
> diff --git a/elf/dl-tls.c b/elf/dl-tls.c
> index 7b3dd9ab60..14c71cbd06 100644
> --- a/elf/dl-tls.c
> +++ b/elf/dl-tls.c
> @@ -819,7 +819,8 @@ _dl_update_slotinfo (unsigned long int req_modid, size_t new_gen)
> dtv entry free it. Note: this is not AS-safe. */
> /* XXX Ideally we will at some point create a memory
> pool. */
> - free (dtv[modid].pointer.to_free);
> + if (dtv[modid].pointer.to_free != NULL)
> + free (dtv[modid].pointer.to_free);
> dtv[modid].pointer.val = TLS_DTV_UNALLOCATED;
> dtv[modid].pointer.to_free = NULL;
>
> As the comment hints, we shouldn't be using malloc for TLS memory at all
> because it is not AS-safe, but that's a long-term change. This change
> seems rather specific to this particular test case failure because it
> relies on libgcc_s.so.1 never using TLS before it gets unloaded.
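A toy model of the cycle in the backtrace above may help show why the NULL
check only hides the problem. Nothing here is glibc code:
toy_tls_get_addr(), toy_free(), simulate() and DEPTH_CAP are hypothetical
stand-ins, and the model assumes the DTV is marked current only once the
slot update finishes, which is what the repeating frames suggest:

```c
#include <assert.h>
#include <stddef.h>

#define DEPTH_CAP 32		/* arbitrary cap so the toy model terminates */

static int dtv_stale;		/* 1 while the thread's DTV is out of date */
static void *slot_to_free;	/* models dtv[modid].pointer.to_free */
static int null_check;		/* 1 = model the "!= NULL" workaround */
static int depth, deepest;
static char dummy_block;	/* stands for a TLS block that was allocated */

static void toy_free(void *ptr);

/* Models __tls_get_addr() -> update_get_addr() -> _dl_update_slotinfo(). */
static void toy_tls_get_addr(void)
{
	if (!dtv_stale)
		return;			/* fast path: DTV is current */
	if (!null_check || slot_to_free != NULL)
		toy_free(slot_to_free);	/* resolves to the interposed free() */
	slot_to_free = NULL;
	dtv_stale = 0;			/* marked current only after the update */
}

/* Models the wrapper's free(), which reads a global-dynamic TLS guard. */
static void toy_free(void *ptr)
{
	(void) ptr;
	if (++depth > deepest)
		deepest = depth;
	if (depth < DEPTH_CAP)
		toy_tls_get_addr();	/* TLS access re-enters the DTV update */
	depth--;
	assert(depth >= 0);
}

/* Returns the deepest free() nesting reached for a single TLS access. */
static int simulate(int use_null_check, int slot_in_use)
{
	dtv_stale = 1;
	slot_to_free = slot_in_use ? &dummy_block : NULL;
	null_check = use_null_check;
	depth = deepest = 0;
	toy_tls_get_addr();
	return deepest;
}
```

In this model simulate(0, 0) runs into DEPTH_CAP (the unbounded recursion),
simulate(1, 0) returns 0 (the workaround), and simulate(1, 1) runs into the
cap again, since the check no longer helps once the module has actually
used its TLS.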
>
> Regarding the libgcc_s side, I'm not sure if the TLS usage there should
> be considered a real problem, although I'm a bit nervous about it.
> However, the current implementation caches one page of trampolines past
> the outermost nested function pointer deallocation (otherwise creating
> one function pointer per thread in a loop would be really expensive).
> It looks to me like that page is never freed, so if the thread exits even with
> proper unwinding (e.g., on glibc with code compiled with -fexceptions),
> there is a memory leak. Integration with glibc could avoid this issue,
> and also help with the longjmp problem, and fix setcontext/swapcontext,
> too.
>
> Thanks,
> Florian
--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com