[lttng-dev] Alternative to signals/sys_membarrier() in liburcu
mathieu.desnoyers at efficios.com
Thu Mar 12 18:30:35 EDT 2015
----- Original Message -----
> From: "Linus Torvalds" <torvalds at linux-foundation.org>
> To: "Mathieu Desnoyers" <mathieu.desnoyers at efficios.com>
> Cc: "Michael Sullivan" <sully at msully.net>, lttng-dev at lists.lttng.org, "LKML" <linux-kernel at vger.kernel.org>, "Paul E.
> McKenney" <paulmck at linux.vnet.ibm.com>, "Peter Zijlstra" <peterz at infradead.org>, "Ingo Molnar" <mingo at kernel.org>,
> "Thomas Gleixner" <tglx at linutronix.de>, "Steven Rostedt" <rostedt at goodmis.org>
> Sent: Thursday, March 12, 2015 5:47:05 PM
> Subject: Re: Alternative to signals/sys_membarrier() in liburcu
> On Thu, Mar 12, 2015 at 1:53 PM, Mathieu Desnoyers
> <mathieu.desnoyers at efficios.com> wrote:
> > So the question as it stands appears to be: would you be comfortable
> > having users abuse mprotect(), relying on its side-effect of issuing
> > a smp_mb() on each targeted CPU for the TLB shootdown, as
> > an effective implementation of process-wide memory barrier ?
> Be *very* careful.
> Just yesterday, in another thread (discussing the auto-numa TLB
> performance regression), we were discussing skipping the TLB
> invalidates entirely if the mprotect relaxes the protections.
> Because if you *used* to be read-only, and then mprotect() something
> so that it is read-write, there really is no need to send a TLB
> invalidate, at least on x86. You can just change the page tables, and
> *if* any entries are stale in the TLB they'll take a microfault on
> access and then just reload the TLB.
> So mprotect() to a more permissive mode is not necessarily serializing.
The idea here is to always mprotect() to a more restrictive mode,
which should trigger the TLB shootdown.
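A minimal sketch of what I have in mind, in C. The helper names
barrier_page_init() and mprotect_barrier() are hypothetical and not part
of liburcu. Whether the restricting mprotect() really implies a
smp_mb() on every CPU running threads of the process is exactly the
assumption under discussion here, so take this as an illustration only:

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1	/* for MAP_ANONYMOUS */
#endif
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical names, for illustration only; not liburcu API. */
static void *barrier_page;
static size_t barrier_page_len;

/* Map one private anonymous page dedicated to the mprotect() trick. */
static int barrier_page_init(void)
{
	long sz = sysconf(_SC_PAGESIZE);

	if (sz <= 0)
		return -1;
	barrier_page_len = (size_t) sz;
	barrier_page = mmap(NULL, barrier_page_len, PROT_READ | PROT_WRITE,
			    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (barrier_page == MAP_FAILED) {
		barrier_page = NULL;
		return -1;
	}
	return 0;
}

/*
 * Assumption (not a guarantee): going from read-write to read-only on
 * a present page makes the kernel send a TLB shootdown to the other
 * CPUs running this process.
 */
static int mprotect_barrier(void)
{
	/* Relax first. This step may well be performed without any flush. */
	if (mprotect(barrier_page, barrier_page_len, PROT_READ | PROT_WRITE))
		return -1;
	/* Write to the page so a present, writable mapping exists. */
	*(volatile char *) barrier_page = 1;
	/* Restrict: this is the transition expected to cause the shootdown. */
	return mprotect(barrier_page, barrier_page_len, PROT_READ);
}
```

The write between the two mprotect() calls is there so that the
restricting call has a present, writable mapping to downgrade.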
> Also, you need to make sure that your page is actually in memory,
> because otherwise the kernel may end up seeing "oh, it's not even
> present", and never flush the TLB at all.
> So now you need to mlock that page. Which can be problematic for non-root.
I'm aware the default amount of locked memory is usually quite low
(64kB here). So we'd need to handle cases where we run out of locked
memory. We could fall back to a slower userspace RCU scheme if this
happens.
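The fallback decision could look roughly like the sketch below. The
enum and the function name choose_barrier_scheme() are made up for
illustration, and what the "slower scheme" actually is remains outside
this sketch. The only real call used is mlock(), which typically fails
with ENOMEM or EPERM when an unprivileged process exceeds
RLIMIT_MEMLOCK:

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1	/* for MAP_ANONYMOUS in callers */
#endif
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical names, for illustration only; not liburcu API. */
enum barrier_scheme {
	BARRIER_MPROTECT,	/* page is locked, mprotect() trick usable */
	BARRIER_FALLBACK,	/* could not lock, use a slower scheme */
};

/*
 * Try to pin the barrier page in memory. If the lock fails (for
 * instance because the locked-memory limit is exhausted), report that
 * the caller should fall back to another scheme.
 */
static enum barrier_scheme choose_barrier_scheme(void *page, size_t len)
{
	if (mlock(page, len) == 0)
		return BARRIER_MPROTECT;
	return BARRIER_FALLBACK;
}
```

The choice would be made once at initialization, so running out of
locked memory only costs performance rather than correctness.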
> In other words, I'd be a bit leery about it. There may be other
> gotchas about it.
Looking again at this old proposed patch (https://lkml.org/lkml/2010/4/18/15),
which adds a few memory barriers around updates to mm_cpumask
for sys_membarrier, I wonder whether mprotect() might, in very narrow
race scenarios, skip some CPUs from the mask that would actually need
to be taken care of.