[lttng-dev] [PATCH lttng-ust] Fix: destructors do not run on probe provider dlclose
Mathieu Desnoyers
mathieu.desnoyers at efficios.com
Thu Feb 22 08:34:00 EST 2018
----- On Feb 21, 2018, at 5:10 PM, Francis Deslauriers francis.deslauriers at efficios.com wrote:
> Calling dlclose on a probe provider library does not unregister the
> probes from the callsites as the destructors are not executed.
please specify that this is only true for the first shared object loading
the __tracepoints__disable_destructors into the symbol table. Otherwise,
the sentence above is unclear, and we can be led to think that the destructor
is _never_ executed.
>
> The __tracepoints__disable_destructors weak symbol was exposed by probe
please use current tense: "is exposed"
> providers, liblttng-ust.so and liblttng-ust-tracepoint.so libraries. If
> a probe provider was loaded first into the address space, its definition
"is loaded"
> would be binded to the symbol. All the subsequent libraries using the
"is bound"
> symbol would use the existing definition of the symbol. Thus creating a
", thus..."
> a situation where liblttng-ust.so or liblttng-ust-tracepoint.so would
> have a dependency on the probe provider library.
would have a dependency -> depend
>
> This was preventing the dynamic loader from unloading the library as it
was preventing -> prevents
> was still in use by other libraries. Because of this, the execution of
is still
> its destructors and the unregistration of the probes was postponed.
is postponed
>
> To overcome this issue, we no longer expose this symbol in the
> tracepoint.h file to remove the explicit dependency of the probe
> provider on the symbol. We instead use the existing dlopen handle on
> liblttng-ust-tracepoint.so to get an handle on a getter and setter
> functions for this value.
>
> Signed-off-by: Francis Deslauriers <francis.deslauriers at efficios.com>
> ---
> include/lttng/tracepoint.h | 31 ++++++++++++++-----------------
> liblttng-ust/tracepoint.c | 26 ++++++++++++++++++++++++++
> 2 files changed, 40 insertions(+), 17 deletions(-)
>
> diff --git a/include/lttng/tracepoint.h b/include/lttng/tracepoint.h
> index 39f2c4d..95f5de9 100644
> --- a/include/lttng/tracepoint.h
> +++ b/include/lttng/tracepoint.h
> @@ -221,26 +221,13 @@ struct lttng_ust_tracepoint_dlopen {
> void (*rcu_read_lock_sym_bp)(void);
> void (*rcu_read_unlock_sym_bp)(void);
> void *(*rcu_dereference_sym_bp)(void *p);
> + void (*tracepoint_set_destructors_disabled)(int is_disabled);
If we expose a "is_disabled" boolean as argument here, it means we want
to allow application to disable and eventually re-enable destructors.
It makes sense, but then we need to keep a reference count within
liblttng-ust-tracepoint.so rather than a simple 1/0 state. Can you
change that ?
This would then be exposed by 3 functions:
Initial state: destructors are enabled, refcount = 1.
void tracepoint_disable_destructors(void)
void tracepoint_enable_destructors(void)
bool tracepoint_get_destructors_state(void)
> + int (*tracepoint_get_destructors_disabled)(void);
> };
Those added fields to:
struct lttng_ust_tracepoint_dlopen tracepoint_dlopen
__attribute__((weak, visibility("hidden")));
mean that if someone has an old .o or .a laying around compiled with the
prior tracepoint_dlopen definition, and links it against a new definition,
there is a mismatch. Arguably, it's limited to within a .so or executable
(module), so typically people recompile, but it won't work if an application
is statically linked against a library: we will have a layout mismatch for
that symbol.
A safer alternative would be to leave this structure unchanged, and have
a new structure that contains those new fields.
>
> extern struct lttng_ust_tracepoint_dlopen tracepoint_dlopen;
> extern struct lttng_ust_tracepoint_dlopen *tracepoint_dlopen_ptr;
>
> -/* Disable tracepoint destructors. */
> -int __tracepoints__disable_destructors __attribute__((weak));
> -
> -/*
> - * Programs that have threads that survive after they exit, and
> - * therefore call library destructors, should disable the tracepoint
> - * destructors by calling tracepoint_disable_destructors(). This will
> - * leak the tracepoint instrumentation library shared object, leaving
> - * its teardown to the operating system process teardown.
> - */
> -static inline void tracepoint_disable_destructors(void)
> -{
> - __tracepoints__disable_destructors = 1;
> -}
> -
> /*
> * These weak symbols, the constructor, and destructor take care of
> * registering only _one_ instance of the tracepoints per shared-ojbect
> @@ -335,7 +322,8 @@ __tracepoints__destroy(void)
> return;
> if (!tracepoint_dlopen_ptr)
> tracepoint_dlopen_ptr = &tracepoint_dlopen;
> - if (!__tracepoints__disable_destructors
> + if (tracepoint_dlopen_ptr->tracepoint_get_destructors_disabled
> + && !tracepoint_dlopen_ptr->tracepoint_get_destructors_disabled()
> && tracepoint_dlopen_ptr->liblttngust_handle
you can move the check for liblttngust_handle as first check in the if().
> && !__tracepoint_ptrs_registered) {
> ret = dlclose(tracepoint_dlopen_ptr->liblttngust_handle);
> @@ -423,6 +411,14 @@ __tracepoints__ptrs_init(void)
> URCU_FORCE_CAST(int (*)(struct lttng_ust_tracepoint * const *),
> dlsym(tracepoint_dlopen_ptr->liblttngust_handle,
> "tracepoint_unregister_lib"));
> + tracepoint_dlopen_ptr->tracepoint_set_destructors_disabled =
> + URCU_FORCE_CAST(void (*)(int),
> + dlsym(tracepoint_dlopen_ptr->liblttngust_handle,
> + "tracepoint_set_destructors_disabled"));
> + tracepoint_dlopen_ptr->tracepoint_get_destructors_disabled =
> + URCU_FORCE_CAST(int (*)(void),
> + dlsym(tracepoint_dlopen_ptr->liblttngust_handle,
> + "tracepoint_get_destructors_disabled"));
> __tracepoint__init_urcu_sym();
> if (tracepoint_dlopen_ptr->tracepoint_register_lib) {
> tracepoint_dlopen_ptr->tracepoint_register_lib(__start___tracepoints_ptrs,
> @@ -444,7 +440,8 @@ __tracepoints__ptrs_destroy(void)
> tracepoint_dlopen_ptr = &tracepoint_dlopen;
> if (tracepoint_dlopen_ptr->tracepoint_unregister_lib)
> tracepoint_dlopen_ptr->tracepoint_unregister_lib(__start___tracepoints_ptrs);
> - if (!__tracepoints__disable_destructors
> + if (tracepoint_dlopen_ptr->tracepoint_get_destructors_disabled
> + && !tracepoint_dlopen_ptr->tracepoint_get_destructors_disabled()
> && tracepoint_dlopen_ptr->liblttngust_handle
same here.
> && !__tracepoint_ptrs_registered) {
> ret = dlclose(tracepoint_dlopen_ptr->liblttngust_handle);
> diff --git a/liblttng-ust/tracepoint.c b/liblttng-ust/tracepoint.c
> index 2c25d46..5d9e10e 100644
> --- a/liblttng-ust/tracepoint.c
> +++ b/liblttng-ust/tracepoint.c
> @@ -53,6 +53,22 @@ struct {
> /* Set to 1 to enable tracepoint debug output */
> static const int tracepoint_debug;
> static int initialized;
> +
> +/*
> + * Programs that have threads that survive after they exit, and
> + * therefore call library destructors, should disable the tracepoint
> + * destructors by calling tracepoints_set_destructors_disabled(). This will
> + * leak the tracepoint instrumentation library shared object, leaving its
> + * teardown to the operating system process teardown.
> + *
> + * Set to 1 to disable the execution of tracepoint library destructors.
> + *
> + * To access and/or modify this value, users need to use a combination of
> + * dlopen(3) and dlsym(3) to get an handle on the
> + * tracepoint_set_destructors_disabled and tracepoint_get_destructors_disabled
> + * symbols below.
> + */
> +static int tracepoint_destructors_disabled;
This would become:
static int tracepoint_destructors_refcount = 1;
And accessed through uatomic_*() APIs. I would use uatomic
rather than liburcu refcount API because we don't want to run
anything when the refcount reaches 0.
Thanks,
Mathieu
> static void (*new_tracepoint_cb)(struct lttng_ust_tracepoint *);
>
> /*
> @@ -166,6 +182,16 @@ static void debug_print_probes(struct tracepoint_entry
> *entry)
> DBG("Probe %d : %p", i, entry->probes[i].func);
> }
>
> +void tracepoint_set_destructors_disabled(int is_disabled)
> +{
> + tracepoint_destructors_disabled = is_disabled;
> +}
> +
> +int tracepoint_get_destructors_disabled(void)
> +{
> + return tracepoint_destructors_disabled;
> +}
> +
> static void *
> tracepoint_entry_add_probe(struct tracepoint_entry *entry,
> void (*probe)(void), void *data)
> --
> 2.7.4
--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
More information about the lttng-dev
mailing list