[lttng-dev] [PATCH lttng-ust] Fix: destructors do not run on probe provider dlclose

Mathieu Desnoyers mathieu.desnoyers at efficios.com
Thu Feb 22 08:34:00 EST 2018


----- On Feb 21, 2018, at 5:10 PM, Francis Deslauriers francis.deslauriers at efficios.com wrote:

> Calling dlclose on a probe provider library does not unregister the
> probes from the callsites as the destructors are not executed.

please specify that this is only true for the first shared object loading
the __tracepoints__disable_destructors into the symbol table. Otherwise,
the sentence above is unclear, and we can be led to think that the destructor
is _never_ executed.


> 
> The __tracepoints__disable_destructors weak symbol was exposed by probe

please use current tense: "is exposed"

> providers, liblttng-ust.so and liblttng-ust-tracepoint.so libraries. If
> a probe provider was loaded first into the address space, its definition

"is loaded"

> would be binded to the symbol. All the subsequent libraries using the

"is bound"

> symbol would use the existing definition of the symbol. Thus creating a

", thus..."

> a situation where liblttng-ust.so or liblttng-ust-tracepoint.so would
> have a dependency on the probe provider library.

would have a dependency -> depend

> 
> This was preventing the dynamic loader from unloading the library as it

was preventing -> prevents

> was still in use by other libraries. Because of this, the execution of

is still

> its destructors and the unregistration of the probes was postponed.

is postponed

> 
> To overcome this issue, we no longer expose this symbol in the
> tracepoint.h file to remove the explicit dependency of the probe
> provider on the symbol. We instead use the existing dlopen handle on
> liblttng-ust-tracepoint.so to get an handle on a getter and setter
> functions for this value.
> 
> Signed-off-by: Francis Deslauriers <francis.deslauriers at efficios.com>
> ---
> include/lttng/tracepoint.h | 31 ++++++++++++++-----------------
> liblttng-ust/tracepoint.c  | 26 ++++++++++++++++++++++++++
> 2 files changed, 40 insertions(+), 17 deletions(-)
> 
> diff --git a/include/lttng/tracepoint.h b/include/lttng/tracepoint.h
> index 39f2c4d..95f5de9 100644
> --- a/include/lttng/tracepoint.h
> +++ b/include/lttng/tracepoint.h
> @@ -221,26 +221,13 @@ struct lttng_ust_tracepoint_dlopen {
> 	void (*rcu_read_lock_sym_bp)(void);
> 	void (*rcu_read_unlock_sym_bp)(void);
> 	void *(*rcu_dereference_sym_bp)(void *p);
> +	void (*tracepoint_set_destructors_disabled)(int is_disabled);

If we expose a "is_disabled" boolean as argument here, it means we want
to allow application to disable and eventually re-enable destructors.
It makes sense, but then we need to keep a reference count within
liblttng-ust-tracepoint.so rather than a simple 1/0 state. Can you
change that ?

This would then be exposed by 3 functions:

Initial state: destructors are enabled, refcount = 1.

void tracepoint_disable_destructors(void)
void tracepoint_enable_destructors(void)
bool tracepoint_get_destructors_state(void)

> +	int (*tracepoint_get_destructors_disabled)(void);
> };

Those added fields to:

struct lttng_ust_tracepoint_dlopen tracepoint_dlopen
        __attribute__((weak, visibility("hidden")));

mean that if someone has an old .o or .a laying around compiled with the
prior tracepoint_dlopen definition, and links it against a new definition,
there is a mismatch. Arguably, it's limited to within a .so or executable
(module), so typically people recompile, but it won't work if an application
is statically linked against a library: we will have a layout mismatch for
that symbol.

A safer alternative would be to leave this structure unchanged, and have
a new structure that contains those new fields.

> 
> extern struct lttng_ust_tracepoint_dlopen tracepoint_dlopen;
> extern struct lttng_ust_tracepoint_dlopen *tracepoint_dlopen_ptr;
> 
> -/* Disable tracepoint destructors. */
> -int __tracepoints__disable_destructors __attribute__((weak));
> -
> -/*
> - * Programs that have threads that survive after they exit, and
> - * therefore call library destructors, should disable the tracepoint
> - * destructors by calling tracepoint_disable_destructors(). This will
> - * leak the tracepoint instrumentation library shared object, leaving
> - * its teardown to the operating system process teardown.
> - */
> -static inline void tracepoint_disable_destructors(void)
> -{
> -	__tracepoints__disable_destructors = 1;
> -}
> -
> /*
>  * These weak symbols, the constructor, and destructor take care of
>  * registering only _one_ instance of the tracepoints per shared-ojbect
> @@ -335,7 +322,8 @@ __tracepoints__destroy(void)
> 		return;
> 	if (!tracepoint_dlopen_ptr)
> 		tracepoint_dlopen_ptr = &tracepoint_dlopen;
> -	if (!__tracepoints__disable_destructors
> +	if (tracepoint_dlopen_ptr->tracepoint_get_destructors_disabled
> +			&& !tracepoint_dlopen_ptr->tracepoint_get_destructors_disabled()
> 			&& tracepoint_dlopen_ptr->liblttngust_handle

you can move the check for liblttngust_handle as first check in the if().

> 			&& !__tracepoint_ptrs_registered) {
> 		ret = dlclose(tracepoint_dlopen_ptr->liblttngust_handle);
> @@ -423,6 +411,14 @@ __tracepoints__ptrs_init(void)
> 		URCU_FORCE_CAST(int (*)(struct lttng_ust_tracepoint * const *),
> 				dlsym(tracepoint_dlopen_ptr->liblttngust_handle,
> 					"tracepoint_unregister_lib"));
> +	tracepoint_dlopen_ptr->tracepoint_set_destructors_disabled =
> +		URCU_FORCE_CAST(void (*)(int),
> +				dlsym(tracepoint_dlopen_ptr->liblttngust_handle,
> +					"tracepoint_set_destructors_disabled"));
> +	tracepoint_dlopen_ptr->tracepoint_get_destructors_disabled =
> +		URCU_FORCE_CAST(int (*)(void),
> +				dlsym(tracepoint_dlopen_ptr->liblttngust_handle,
> +					"tracepoint_get_destructors_disabled"));
> 	__tracepoint__init_urcu_sym();
> 	if (tracepoint_dlopen_ptr->tracepoint_register_lib) {
> 		tracepoint_dlopen_ptr->tracepoint_register_lib(__start___tracepoints_ptrs,
> @@ -444,7 +440,8 @@ __tracepoints__ptrs_destroy(void)
> 		tracepoint_dlopen_ptr = &tracepoint_dlopen;
> 	if (tracepoint_dlopen_ptr->tracepoint_unregister_lib)
> 		tracepoint_dlopen_ptr->tracepoint_unregister_lib(__start___tracepoints_ptrs);
> -	if (!__tracepoints__disable_destructors
> +	if (tracepoint_dlopen_ptr->tracepoint_get_destructors_disabled
> +			&& !tracepoint_dlopen_ptr->tracepoint_get_destructors_disabled()
> 			&& tracepoint_dlopen_ptr->liblttngust_handle

same here.

> 			&& !__tracepoint_ptrs_registered) {
> 		ret = dlclose(tracepoint_dlopen_ptr->liblttngust_handle);
> diff --git a/liblttng-ust/tracepoint.c b/liblttng-ust/tracepoint.c
> index 2c25d46..5d9e10e 100644
> --- a/liblttng-ust/tracepoint.c
> +++ b/liblttng-ust/tracepoint.c
> @@ -53,6 +53,22 @@ struct {
> /* Set to 1 to enable tracepoint debug output */
> static const int tracepoint_debug;
> static int initialized;
> +
> +/*
> + * Programs that have threads that survive after they exit, and
> + * therefore call library destructors, should disable the tracepoint
> + * destructors by calling tracepoints_set_destructors_disabled(). This will
> + * leak the tracepoint instrumentation library shared object, leaving its
> + * teardown to the operating system process teardown.
> + *
> + * Set to 1 to disable the execution of tracepoint library destructors.
> + *
> + * To access and/or modify this value, users need to use a combination of
> + * dlopen(3) and dlsym(3) to get an handle on the
> + * tracepoint_set_destructors_disabled and tracepoint_get_destructors_disabled
> + * symbols below.
> + */
> +static int tracepoint_destructors_disabled;

This would become:

static int tracepoint_destructors_refcount = 1;

And accessed through uatomic_*() APIs. I would use uatomic
rather than liburcu refcount API because we don't want to run
anything when the refcount reaches 0.

Thanks,

Mathieu

> static void (*new_tracepoint_cb)(struct lttng_ust_tracepoint *);
> 
> /*
> @@ -166,6 +182,16 @@ static void debug_print_probes(struct tracepoint_entry
> *entry)
> 		DBG("Probe %d : %p", i, entry->probes[i].func);
> }
> 
> +void tracepoint_set_destructors_disabled(int is_disabled)
> +{
> +	tracepoint_destructors_disabled = is_disabled;
> +}
> +
> +int tracepoint_get_destructors_disabled(void)
> +{
> +	return tracepoint_destructors_disabled;
> +}
> +
> static void *
> tracepoint_entry_add_probe(struct tracepoint_entry *entry,
> 			   void (*probe)(void), void *data)
> --
> 2.7.4

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com


More information about the lttng-dev mailing list