[lttng-dev] [tip:timers/urgent] timekeeping: Fix HRTICK related deadlock from ntp lock changes
Mathieu Desnoyers
mathieu.desnoyers at efficios.com
Tue Sep 17 12:33:03 EDT 2013
* Ingo Molnar (mingo at kernel.org) wrote:
>
> * Mathieu Desnoyers <mathieu.desnoyers at efficios.com> wrote:
>
> > * Ingo Molnar (mingo at kernel.org) wrote:
> > >
> > > * Mathieu Desnoyers <mathieu.desnoyers at efficios.com> wrote:
> > >
> > > > Hi Ingo,
> > > >
> > > > Do you have an estimate of the time it will take for this fix to hit
> > > > mainline, stable-3.10 and stable-3.11? Meanwhile, I'm marking 3.10 and
> > > > 3.11 as broken for LTTng with a kernel version check at compile-time,
> > > > since this kernel regression currently triggers a hard system lockup
> > > > when people use LTTng on those kernels, and this is certainly something
> > > > nobody wants.
> > >
> > > So, at least as per the description of John, this should only trigger if
> > > SCHED_HRTICK is enabled in sched_features - which is disabled by default,
> > > it's a debug-only development feature. Does the bug trigger on more
> > > regular kernels as well?
> >
> > Unfortunately, it does happen on a pretty standard kernel config (giving
> > my x230 config as example below). Pasting relevant bug description from
> > http://bugs.lttng.org/issues/631 :
> >
> > "Starting from Linux kernel commit
> > 06c017fdd4dc48451a29ac37fc1db4a3f86b7f40 "timekeeping: Hold
> > timekeepering locks in do_adjtimex and hardpps" (3.10 kernels), the
> > xtime write seqlock is held across calls to __do_adjtimex(), which
> > includes a call to notify_cmos_timer(), and hence
> > schedule_delayed_work().
> >
> > This introduces a side-effect for a set of tracepoints, including mainly
> > the workqueue tracepoints: a tracer hooking on those tracepoints and
> > reading current time with ktime_get() will cause hard system LOCKUP"
>
> It's the LTTng tracepoint 'hooking' in something that does something
> invalid in that context that is causing the hang, not the vanilla kernel
> itself, right?
Yes, that's correct. In order to ensure this kind of problem is entirely
taken care of, I've started working on a synchronization scheme proposed
by Peter Zijlstra that would allow ktime_get() to be called from any
execution context (see:
http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg504089.html).
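
For readers following the thread, the failure mode described in the bug
report can be illustrated with a small userspace sketch. It is not kernel
code: every toy_* name below is hypothetical, the sequence counter only
imitates the general seqlock idea (odd value means a write is in progress),
and the reader gives up after a fixed number of spins, which is an assumption
made purely so the example terminates. A real seqlock reader would keep
spinning, which is the lockup.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a sequence counter; not the kernel API. */
struct toy_seqcount {
	unsigned int sequence;	/* odd while a write is in progress */
};

static void toy_write_begin(struct toy_seqcount *s)
{
	s->sequence++;		/* becomes odd: readers must wait */
}

static void toy_write_end(struct toy_seqcount *s)
{
	assert(s->sequence & 1);	/* must pair with toy_write_begin() */
	s->sequence++;		/* becomes even: readers may proceed */
}

/*
 * Reader side: wait for an even sequence value. A real reader spins without
 * bound; this sketch stops after max_spins (an assumption, for illustration)
 * and reports failure instead of hanging.
 */
static bool toy_read_begin(const struct toy_seqcount *s,
			   unsigned int max_spins, unsigned int *seq)
{
	for (unsigned int i = 0; i < max_spins; i++) {
		unsigned int v = s->sequence;

		if (!(v & 1)) {
			*seq = v;
			return true;
		}
	}
	return false;
}

/* Stand-in for a tracer probe that reads the clock under the counter. */
static bool toy_probe_reads_clock(const struct toy_seqcount *s)
{
	unsigned int seq;

	return toy_read_begin(s, 1000, &seq);
}

/*
 * Probe fires while the same thread is inside the write section: the writer
 * cannot finish until the probe returns, so the reader never sees an even
 * value. Returns whether the probe managed to read.
 */
static bool toy_update_with_probe_inside(struct toy_seqcount *s)
{
	bool ok;

	toy_write_begin(s);
	ok = toy_probe_reads_clock(s);
	toy_write_end(s);
	return ok;
}

/* Probe fires only after the write section has been closed. */
static bool toy_update_with_probe_after(struct toy_seqcount *s)
{
	toy_write_begin(s);
	toy_write_end(s);
	return toy_probe_reads_clock(s);
}
```

Calling toy_update_with_probe_inside() on a zero-initialized counter reports
that the probe could not read, while toy_update_with_probe_after() succeeds.
Moving the notification outside the locked region, or making the read side
safe in any context, removes the self-wait in the same way.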
>
> In that case the 'you get to keep both pieces' policy of out of tree code
> applies - but the HRTICK fix should solve your problem as well,
> incidentally.
Thanks,
Mathieu
>
> Thanks,
>
> Ingo
--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com