[ltt-dev] Cannot destroy trace

Mathieu Desnoyers compudj at krystal.dyndns.org
Tue May 11 08:00:11 EDT 2010


* jerome zh (jeromezhr at gmail.com) wrote:
> Hi all,
> 
> I have modified the original lttng kernel patch to fit my RT-linux patched
> 2.6.30 kernel. And IMHO the kernel now works fine.
> Everything seems OK until I run "lttctl -D trace1"; the process then seems to
> block (I am not sure whether it is really blocked).
> The last message printed on the screen is "lttctl: Destroying trace".
> Then I added some debug messages to the liblttctl.c file. They show that the
> process was "blocked" while executing
> *write(fd, op, strlen(op))* in function *lttctl_sendop()*.
> 
> Any advice? Thanks in advance.

Can you try the following patch?


lttng fix rt kernel teardown deadlock

LTTng has a teardown bug on RT (deadlock):

Teardown deletes a timer synchronously while holding the traces mutex, and the
timer handler takes the same mutex, which leads to a deadlock.

Fix this by taking an RCU read lock in the timer handler (instead of the
RT-specific fix using the mutex), and by doing synchronize_rcu() in addition to
synchronize_sched() upon updates.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers at efficios.com>
---
 ltt/ltt-tracer.c |   28 ++++++++++++++++++----------
 1 file changed, 18 insertions(+), 10 deletions(-)

Index: linux-2.6-lttng/ltt/ltt-tracer.c
===================================================================
--- linux-2.6-lttng.orig/ltt/ltt-tracer.c	2010-05-11 07:50:46.000000000 -0400
+++ linux-2.6-lttng/ltt/ltt-tracer.c	2010-05-11 07:55:46.000000000 -0400
@@ -41,6 +41,14 @@
 #include <linux/vmalloc.h>
 #include <asm/atomic.h>
 
+static void synchronize_trace(void)
+{
+	synchronize_sched();
+#ifdef CONFIG_PREEMPT_RT
+	synchronize_rcu();
+#endif
+}
+
 static void async_wakeup(unsigned long data);
 
 static DEFINE_TIMER(ltt_async_wakeup_timer, async_wakeup, 0, 0);
@@ -321,7 +329,7 @@ void ltt_module_unregister(enum ltt_modu
 		ltt_filter_unregister();
 		ltt_run_filter_owner = NULL;
 		/* Wait for preempt sections to finish */
-		synchronize_sched();
+		synchronize_trace();
 		break;
 	case LTT_FUNCTION_FILTER_CONTROL:
 		ltt_filter_control_functor = ltt_filter_control_default;
@@ -429,13 +437,13 @@ static void async_wakeup(unsigned long d
 	 * PREEMPT_RT does not allow spinlocks to be taken within preempt
 	 * disable sections (spinlock taken in wake_up). However, mainline won't
 	 * allow mutex to be taken in interrupt context. Ugly.
-	 * A proper way to do this would be to turn the timer into a
-	 * periodically woken up thread, but it adds to the footprint.
+	 * Take a standard RCU read lock for RT kernels, which implies that we
+	 * also have to synchronize_rcu() upon updates.
 	 */
 #ifndef CONFIG_PREEMPT_RT
 	rcu_read_lock_sched();
 #else
-	ltt_lock_traces();
+	rcu_read_lock();
 #endif
 	list_for_each_entry_rcu(trace, &ltt_traces.head, list) {
 		trace_async_wakeup(trace);
@@ -443,7 +451,7 @@ static void async_wakeup(unsigned long d
 #ifndef CONFIG_PREEMPT_RT
 	rcu_read_unlock_sched();
 #else
-	ltt_unlock_traces();
+	rcu_read_unlock();
 #endif
 
 	mod_timer(&ltt_async_wakeup_timer, jiffies + LTT_PERCPU_TIMER_INTERVAL);
@@ -901,7 +909,7 @@ int ltt_trace_alloc(const char *trace_na
 		set_kernel_trace_flag_all_tasks();
 	}
 	list_add_rcu(&trace->list, &ltt_traces.head);
-	synchronize_sched();
+	synchronize_trace();
 
 	ltt_unlock_traces();
 
@@ -974,7 +982,7 @@ static int _ltt_trace_destroy(struct ltt
 	}
 	/* Everything went fine */
 	list_del_rcu(&trace->list);
-	synchronize_sched();
+	synchronize_trace();
 	if (list_empty(&ltt_traces.head)) {
 		clear_kernel_trace_flag_all_tasks();
 		/*
@@ -1195,7 +1203,7 @@ static int _ltt_trace_stop(struct ltt_tr
 			trace->nr_channels);
 		trace->active = 0;
 		ltt_traces.num_active_traces--;
-		synchronize_sched(); /* Wait for each tracing to be finished */
+		synchronize_trace(); /* Wait for each tracing to be finished */
 	}
 	module_put(ltt_run_filter_owner);
 	/* Everything went fine */
@@ -1327,12 +1335,12 @@ static void __exit ltt_exit(void)
 	list_for_each_entry_rcu(trace, &ltt_traces.head, list)
 		_ltt_trace_stop(trace);
 	/* Wait for quiescent state. Readers have preemption disabled. */
-	synchronize_sched();
+	synchronize_trace();
 	/* Safe iteration is now permitted. It does not have to be RCU-safe
 	 * because no readers are left. */
 	list_for_each_safe(pos, n, &ltt_traces.head) {
 		trace = container_of(pos, struct ltt_trace, list);
-		/* _ltt_trace_destroy does a synchronize_sched() */
+		/* _ltt_trace_destroy does a synchronize_trace() */
 		_ltt_trace_destroy(trace);
 		__ltt_trace_destroy(trace);
 	}
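
For readers who want to see the shape of the deadlock outside the kernel, here
is a minimal user-space sketch, not taken from the LTTng sources. It assumes
that a pthread mutex can stand in for the traces mutex and that pthread_join()
can stand in for a synchronous timer deletion such as del_timer_sync(). The
names traces_mutex, handler() and teardown_holding_mutex() are hypothetical.
The handler uses a trylock so that the program reports the problem instead of
hanging.

```c
#include <pthread.h>

/* Hypothetical stand-in for the traces mutex. */
static pthread_mutex_t traces_mutex = PTHREAD_MUTEX_INITIALIZER;
static int handler_would_block;

/* Stand-in for the timer handler, which needs the mutex to walk the list. */
static void *handler(void *arg)
{
	(void)arg;
	if (pthread_mutex_trylock(&traces_mutex) != 0) {
		/*
		 * A blocking lock here would never return: the mutex owner
		 * is itself waiting for this handler to finish.
		 */
		handler_would_block = 1;
		return NULL;
	}
	pthread_mutex_unlock(&traces_mutex);
	return NULL;
}

/*
 * Stand-in for the teardown path: wait for the handler to complete while
 * still holding the mutex. Returns 1 if the handler found the mutex held.
 */
static int teardown_holding_mutex(void)
{
	pthread_t t;

	handler_would_block = 0;
	pthread_mutex_lock(&traces_mutex);
	if (pthread_create(&t, NULL, handler, NULL) != 0) {
		pthread_mutex_unlock(&traces_mutex);
		return -1;
	}
	pthread_join(t, NULL);	/* plays the role of the synchronous delete */
	pthread_mutex_unlock(&traces_mutex);
	return handler_would_block;
}
```

Calling teardown_holding_mutex() returns 1, because the handler always runs
while the mutex is held. With a blocking pthread_mutex_lock() in the handler,
the join would never complete, which is the situation the patch avoids by
having the handler take an RCU read lock instead of the mutex.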


> 
> -- 
> *regards,
> Jerome*

> _______________________________________________
> ltt-dev mailing list
> ltt-dev at lists.casi.polymtl.ca
> http://lists.casi.polymtl.ca/cgi-bin/mailman/listinfo/ltt-dev


-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com



