[lttng-dev] [RFC PATCH v2 06/13] Fix: unregister app notify socket on sessiond tear down

Jonathan Rajotte jonathan.rajotte-julien at efficios.com
Mon Sep 18 22:51:59 UTC 2017


A race between the sessiond tear down and applications initialization
can lead to a deadlock.

Applications try to communicate via the notify sockets while sessiond
no longer listens on these sockets, since the thread responsible for
reception/response (ust_thread_manage_notify) has terminated. These
sockets are never closed, hence an application could hang on
communication.

The sessiond hang happens in the call to cmd_destroy_session during
sessiond_cleanup. Sessiond is trying to communicate with the app while
the app is waiting for a response on the app notification socket.

To prevent this situation, a call to ust_app_notify_sock_unregister is
performed on every entry of the ust_app_ht_by_notify_sock hash table at
the time of termination. This ensures that any pending communication
initiated by the application will be terminated, since all sockets will
be closed at the end of the grace period via call_rcu inside
ust_app_notify_sock_unregister. Using ust_app_ht_by_notify_sock
instead of ust_app_ht prevents a double call_rcu, since entries are
removed from ust_app_ht_by_notify_sock during
ust_app_notify_sock_unregister.

This can be reproduced using the sessiond_teardown_active_session
scenario provided by [1].

[1] https://github.com/PSRCode/lttng-stress

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien at efficios.com>
---
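Note (not part of the commit message): the "no double call_rcu"
argument above can be illustrated with a small stand-alone toy model.
This is only a sketch under stated assumptions, not sessiond code:
every toy_* name is hypothetical, a flat array stands in for
ust_app_ht_by_notify_sock, and a counter stands in for the callbacks
queued with call_rcu. The toy also ignores unknown sockets, which is a
simplification of this sketch and not a statement about the real
ust_app_notify_sock_unregister.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_APPS 8

/* Hypothetical stand-in for struct ust_app: only the notify socket matters. */
struct toy_app {
	int notify_sock;
	bool registered;	/* true while the entry is in the by-socket table */
};

/* Toy replacement for ust_app_ht_by_notify_sock (flat array, no RCU). */
static struct toy_app toy_table[TOY_MAX_APPS];

/* Stands in for the number of close callbacks queued with call_rcu. */
static int toy_deferred_closes;

static bool toy_register(int sock)
{
	for (size_t i = 0; i < TOY_MAX_APPS; i++) {
		if (!toy_table[i].registered) {
			toy_table[i].notify_sock = sock;
			toy_table[i].registered = true;
			return true;
		}
	}
	return false;	/* table full */
}

/* Models the property relied upon: remove the entry, then defer the close. */
static void toy_unregister(int sock)
{
	for (size_t i = 0; i < TOY_MAX_APPS; i++) {
		if (toy_table[i].registered && toy_table[i].notify_sock == sock) {
			toy_table[i].registered = false;
			toy_deferred_closes++;
			return;
		}
	}
	/* Toy assumption: an unknown socket queues nothing. */
}

/* Walk the table the unregister path removes from, as the patch does. */
static void toy_unregister_all(void)
{
	for (size_t i = 0; i < TOY_MAX_APPS; i++) {
		if (toy_table[i].registered) {
			toy_unregister(toy_table[i].notify_sock);
		}
	}
}

/* Two passes still queue exactly one deferred close per socket. */
static int toy_demo(void)
{
	toy_register(3);
	toy_register(4);
	toy_unregister_all();
	toy_unregister_all();	/* finds no entries, queues nothing more */
	assert(toy_deferred_closes == 2);
	return toy_deferred_closes;
}
```

Calling toy_demo() from a main() shows the point: because the walk
uses the same table that the unregister step removes entries from, an
entry that was already unregistered is never visited again.
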
 src/bin/lttng-sessiond/ust-thread.c | 17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/src/bin/lttng-sessiond/ust-thread.c b/src/bin/lttng-sessiond/ust-thread.c
index 1e7a8229..8f11133a 100644
--- a/src/bin/lttng-sessiond/ust-thread.c
+++ b/src/bin/lttng-sessiond/ust-thread.c
@@ -27,6 +27,19 @@
 #include "health-sessiond.h"
 #include "testpoint.h"
 
+
+static
+void notify_sock_unregister_all(void)
+{
+	struct lttng_ht_iter iter;
+	struct ust_app *app;
+	rcu_read_lock();
+	cds_lfht_for_each_entry(ust_app_ht_by_notify_sock->ht, &iter.iter, app, notify_sock_n.node) {
+		ust_app_notify_sock_unregister(app->notify_sock);
+	}
+	rcu_read_unlock();
+}
+
 /*
  * This thread manage application notify communication.
  */
@@ -53,7 +66,7 @@ void *ust_thread_manage_notify(void *data)
 
 	ret = lttng_poll_create(&events, 2, LTTNG_CLOEXEC);
 	if (ret < 0) {
-		goto error;
+		goto error_poll_create;
 	}
 
 	/* Add quit pipe */
@@ -197,6 +210,8 @@ error_poll_create:
 error_testpoint:
 	utils_close_pipe(apps_cmd_notify_pipe);
 	apps_cmd_notify_pipe[0] = apps_cmd_notify_pipe[1] = -1;
+	notify_sock_unregister_all();
+
 	DBG("Application notify communication apps thread cleanup complete");
 	if (err) {
 		health_error();
-- 
2.11.0