[lttng-dev] User-space RCU: call rcu_barrier() before dissociating helper thread?

Wed May 5 14:07:38 EDT 2021

On Wed, May 05, 2021 at 10:46:58AM -0400, Mathieu Desnoyers wrote:
> ----- On May 5, 2021, at 3:54 AM, Martin Wilck mwilck at suse.com wrote:
> 
> > On Fri, 2021-04-30 at 14:41 -0400, Mathieu Desnoyers wrote:
> >> ----- On Apr 29, 2021, at 9:49 AM, lttng-dev
> >> lttng-dev at lists.lttng.org wrote:
> >> 
> >> > In multipath-tools, we are using a custom RCU helper thread, which
> >> > is cleaned out on exit:
> >> > 
> >> > https://github.com/opensvc/multipath-tools/blob/23a01fa679481ff1144139222fbd2c4c863b78f8/multipathd/main.c#L3058
> >> > 
> >> > I put a call to rcu_barrier() there in order to make sure all
> >> > callbacks had finished before detaching the helper thread.
> >> > 
> >> > Now we got a report that rcu_barrier() isn't available before
> >> > user-space RCU 0.8
> >> > (https://github.com/opensvc/multipath-tools/issues/5) (and RHEL7 /
> >> > CentOS7 still has 0.7.16).
> >> > 
> >> > Question: was it over-cautious or otherwise wrong to call
> >> > rcu_barrier() before set_thread_call_rcu_data(NULL)? Can we maybe
> >> > just skip this call? If not, what would be the recommended way for
> >> > liburcu < 0.8 to dissociate a helper thread?
> >> > 
> >> > (Note: I'm not currently subscribed to lttng-dev).
> >> 
> >> First of all, there is a significant reason why liburcu does not free
> >> the "default" call_rcu worker thread data structures at process exit:
> >> a call_rcu callback may very well invoke call_rcu() to re-enqueue
> >> more work.
> >> 
> >> AFAIU this is somewhat similar to what happens in the Linux kernel
> >> RCU implementation when the machine needs to be shut down or
> >> rebooted: there may never be any point in time where it is safe to
> >> free the call_rcu worker thread data structures without leaks,
> >> because a call_rcu callback may re-enqueue further work indefinitely.
> >> 
> >> So my understanding is that you implement your own call_rcu worker
> >> thread because the one provided by liburcu leaks data structures on
> >> process exit, and you expect that calling rcu_barrier() once will
> >> suffice to ensure quiescence of the call_rcu worker thread data
> >> structures. Unfortunately, this does not cover the scenario where a
> >> call_rcu callback re-enqueues additional work.
> > 
> > I understand. In multipath-tools, we only have one callback, which
> > doesn't re-enqueue any work. Our callback really just calls free() on a
> > data structure. And it's unlikely that we'll get more RCU callbacks any
> > time soon.
> > 
> > So, to clarify my question: Does it make sense to call rcu_barrier()
> > before set_thread_call_rcu_data(NULL) in this case?
> 
> Yes, it would ensure that all pending callbacks are executed prior to
> removing the worker thread. And considering that you don't have chained
> callbacks, it makes sense to invoke rcu_barrier() only once.

If you do have chained callbacks, one trick is to:

1.	Prevent your application from doing any more new invocations
	of call_rcu().

2.	Set a flag that prevents any future callbacks from chaining.

3.	Do two calls to rcu_barrier(), one to wait for pre-existing
	callbacks and another to wait for any additional chained
	callbacks that happened concurrently with #2 above.

							Thanx, Paul
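
The three steps above can be sketched with a toy model. This is not
liburcu code: toy_call_rcu() and toy_rcu_barrier() are hypothetical
single-threaded stand-ins for call_rcu() and rcu_barrier(), and the
flag and helper names are made up for illustration. The one property
the model assumes is that a barrier waits only for callbacks that were
already queued when it was called, so a callback that re-enqueues
itself survives a single barrier unless chaining has been disabled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy single-threaded stand-in for a call_rcu queue; NOT liburcu. */
#define TOY_QUEUE_CAP 64

typedef void (*toy_cb_fn)(void);

toy_cb_fn toy_queue[TOY_QUEUE_CAP];
size_t toy_head, toy_tail;      /* pending callbacks: [toy_head, toy_tail) */
bool accept_new_work = true;    /* cleared in step 1 */
bool stop_chaining;             /* set in step 2 */
int toy_runs;                   /* number of callback invocations */

/* Stand-in for call_rcu(): queue a callback for later execution. */
bool toy_call_rcu(toy_cb_fn fn)
{
    if (toy_tail == TOY_QUEUE_CAP)
        return false;
    toy_queue[toy_tail++] = fn;
    return true;
}

/* Stand-in for rcu_barrier(): run only what was queued before the call. */
void toy_rcu_barrier(void)
{
    size_t mark = toy_tail;     /* later arrivals are not waited for */
    while (toy_head < mark)
        toy_queue[toy_head++]();
}

size_t toy_pending(void)
{
    return toy_tail - toy_head;
}

/* A chained callback: re-enqueues itself unless chaining is disabled. */
void chained_cb(void)
{
    toy_runs++;
    if (!stop_chaining)
        toy_call_rcu(chained_cb);
}

/* Application entry point for new work; refuses once shutdown began. */
bool app_submit(void)
{
    if (!accept_new_work)
        return false;
    return toy_call_rcu(chained_cb);
}

/* The three-step shutdown; returns the number of callbacks left over. */
size_t toy_shutdown(void)
{
    accept_new_work = false;    /* 1. no new call_rcu() invocations */
    stop_chaining = true;       /* 2. callbacks stop chaining */
    toy_rcu_barrier();          /* 3a. wait for pre-existing callbacks */
    toy_rcu_barrier();          /* 3b. wait for late chained callbacks */
    return toy_pending();
}
```

In this single-threaded model the first barrier in toy_shutdown()
already drains everything, so the second is a no-op. With a real worker
thread, a callback may have tested the flag just before step 2 set it
and then chained after the first barrier started; the second barrier is
there to wait for that straggler, which sees the flag and does not
chain again.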