[ltt-dev] [PATCH RFC urcu] add pthread_atfork interfaces for call_rcu

Mon Mar 14 21:30:20 EDT 2011

On Mon, Mar 14, 2011 at 02:35:03PM -0400, Mathieu Desnoyers wrote:
> * Paul E. McKenney (paulmck at linux.vnet.ibm.com) wrote:
> > Hello!
> > 
> > Please see below for an RFC patch for creating pthread_atfork()-friendly
> > interfaces for call_rcu().
> > 
> > However, I ran into a problem when trying to test this.  I cannot use
> > this except with rcu-bp, since only rcu-bp can be made fork()-safe.
> > OK, easy enough to create the corresponding interfaces for the other
> > flavors of RCU.  But this led to a couple of issues:
> > 
> > 1.	Do we intend to support mixing multiple flavors of RCU in the
> > 	same program?  Trying this with the current implementations
> > 	results in symbol collisions (multiple different versions
> > 	of synchronize_rcu(), for example).  It is possible to use
> > 	preprocessor magic to handle this, but thought I should check
> > 	on the intent before attempting this.
> 
> In the case of URCU bp, I indend to statically link UST to it. But I
> agree it's a hack. Would you rather prefix all symbols with names
> specific to their own flavor ? It makes it slightly harder to change
> flavors in an application, although it lessens the risk of bugs due to
> linking with the from urcu flavor (mismatch between headers and urcu
> library).

I would rather that (for example) synchronize_rcu() be a cpp macro that
expands into a specific name based on the .h file.  This way the different
flavors can be linked together into a single application without symbol
clashes, but the only change required to move from one RCU flavor to
another is to change the #include directive.

Seem reasonable?

> > 2.	Do we intend to support use of RCU from library functions?
> > 	If the answer is "yes", then we are pretty much stuck supporting
> > 	multiple flavors of RCU in the same program, because different
> > 	library functions might reasonably choose different of the
> > 	RCU implementations.
> 
> Indeed, although we might want to wonder if we want to allow only the
> static linking case from libraries or do we want to support dynamic
> linking too ?

If we can support both easily, it will make userspace-rcu significantly
easier to use.

> > 3.	Do we intend to support use of UST on programs that use
> > 	userspace-rcu?  If so, we again might need to tolerate mixing
> > 	of RCU flavors, given that UST uses rcu-bp and the programs
> > 	in questions might reasonably make a different choice.
> 
> Yes, although the static vs dynamic linking question remains.

Ditto.

> > 4.	Do we intend to support use of a given flavor of RCU from
> > 	multiple library modules, such that a given program might
> > 	link any number of the RCU-using library modules?  If so,
> > 	we need initialization to be idempotent.
> 
> Actually, we'd need to refcount the library here, right ?

I don't yet see why.  What use case are you thinking of?

> > 5.	Do we intend to support use of a given flavor of RCU from
> > 	multiple library modules, some of which might fork() and others
> > 	of which might not?  If so, we need to pull the handling of
> > 	pthread_atfork() into the userspace-rcu library -- otherwise,
> > 	multiple users of RCU might call pthread_atfork(), resulting
> > 	in self-deadlock at fork() time.
> > 
> > 	Perhaps a flag on rcu_register_thread() indicating that the
> > 	thread registering itself might fork()?  Or, for compatibility,
> > 	a new rcu_register_thread_flags() that has a bitfield with
> > 	options.  The first thread that indicated that it might fork
> > 	would then cause userspace-rcu to do the required calls to
> > 	pthread_atfork().
> 
> Looks like a good idea. In the case of urcu-bp (which might eventually
> use call rcu), I would like to leave the pre/post fork symbols available
> too, so the register thread could specify, for these, that the pre/post
> fork hooks should *not* be automatically invoked, leaving this to the
> caller.
> 
> So the standard behavior for all urcu flavors, urcu defer and urcu
> call_rcu would be not to call any of these callbacks.

Right, unless someone asks for the ability to fork() with the child
able to use userspace RCU, none of the callback stuff happens.
The callbacks are available for sufficiently masochistic experts
to use them manually.  ;-)

> If the application/user library needs to support fork()s, it would call
> rcu_register_thread_flags() (and the similar functions for urcu defer
> and call_rcu()) to indicate that forks must be supported, which will
> register pthread atfork handlers (this should be refcounted).
> 
> In the case of urcu-bp, it would provide the callbacks that can be used
> to deal with forks. We can keep track of which callbacks have been
> invoked during execution with a counter incremented as pre-fork,
> decremented post-fork, so if we have two "clients" calling pre fork
> hooks, only the first one will be executed, the second will be skipped.
> The post fork hooks would decrement the counter.

This urcu-bp processing is due to the fact that the first call to
the API does the initialization, thus not having a way to specify
fork safety, correct?

> call_rcu should provide both the symbols of pre/post fork hooks (with
> the counter scheme, like urcu-bp), and also the ability to register with
> thread_flags, so pthread_atfork would be used to handle forks(). Note
> that in a combination where we have some "clients" using the
> pthread_atfork() scheme, and others using direct calls to the pre/post
> fork symbols, the counter scheme would ensure that the hooks only get
> called once. The execution order of the UST fork symbol override is
> that the UST code executes first before the fork, and last after the
> fork, and the original libc symbol is called in the middle (which will
> trigger atfork).

With the possible exception of rcu-bp, I do not believe that we should
support mix-and-match pthread_atfork() and direct callback invocation.
Use of direct callback invocation should be limited to experts who
do whatever is required to avoid the issues -- for example, statically
linking, full control of the application, and so on.

If we don't support mix-and-match initially, we can always add formal
support later, right?  That would allow experimentation in the meantime.

> Does what I'm proposing here make sense ?

With the exception of mix-and-match, yep!

							Thanx, Paul

> > I suspect that the answer is "yes" for all of the above questions.
> > 
> > Does this all make sense, or am I confused?
> > 
> > 							Thanx, Paul
> > 
> > ------------------------------------------------------------------------
> > 
> > Provide pthread_atfork-friendly interfaces
> > 
> > Provides call_rcu_before_fork() and call_rcu_after_fork_parent() to
> > go with the existing call_rcu_after_fork_child().
> 
> For the patch below, it might be better to also implement the flag
> scheme using atfork in addition to exporting the symbol, and to add the
> counter scheme to keep track of multiple calls.
> 
> > 
> > Signed-off-by: Paul E. McKenney <paulmck at linux.vnet.ibm.com>
> > 
> > diff --git a/README b/README
> > index f7f0dec..56e98d7 100644
> > --- a/README
> > +++ b/README
> > @@ -204,3 +204,7 @@ Interaction with fork()
> >  	liburcu-bp, which is designed to handle fork() by calling
> >  	rcu_bp_before_fork, rcu_bp_after_fork_parent and
> >  	rcu_bp_after_fork_child.
> > +
> > +	Applications that use call_rcu() are required to invoke
> > +	call_rcu_after_fork_child() from the child process after a
> > +	successful fork() system call that is not followed by exec().
> 
> Not true for the atfork scheme.
> 
> > diff --git a/urcu-call-rcu.c b/urcu-call-rcu.c
> > index bb56dbb..665f20c 100644
> > --- a/urcu-call-rcu.c
> > +++ b/urcu-call-rcu.c
> > @@ -566,13 +566,40 @@ void free_all_cpu_call_rcu_data(void)
> >  }
> >  
> >  /*
> > + * Acquire the call_rcu_mutex in order to ensure that the child sees
> > + * all of the call_rcu() data structures in a consistent state.
> > + * Suitable for pthread_atfork() and friends.
> > + */
> > +void call_rcu_before_fork(void)
> > +{
> > +	call_rcu_lock(&call_rcu_mutex);
> > +}
> > +
> > +/*
> > + * Clean up call_rcu data structures in the parent of a successful fork()
> > + * that is not followed by exec() in the child.  Suitable for
> > + * pthread_atfork() and friends.
> > + */
> > +void call_rcu_after_fork_parent(void)
> > +{
> > +	call_rcu_unlock(&call_rcu_mutex);
> > +}
> > +
> > +/*
> >   * Clean up call_rcu data structures in the child of a successful fork()
> > - * that is not followed by exec().
> > + * that is not followed by exec().  Suitable for pthread_atfork() and
> > + * friends.
> >   */
> >  void call_rcu_after_fork_child(void)
> >  {
> >  	struct call_rcu_data *crdp;
> >  
> > +	/* Re-initialize the mutex. */
> > +	if (pthread_mutex_init(&call_rcu_mutex, NULL) != 0) {
> 
> Silly question: why reinitialize the mutex rather than unlock it ?
> 
> Thanks,
> 
> Mathieu
> 
> > +		perror("pthread_mutex_init");
> > +		exit(-1);
> > +	}
> > +
> >  	/*
> >  	 * Allocate a new default call_rcu_data structure in order
> >  	 * to get a working call_rcu thread to go with it.
> > 
> > _______________________________________________
> > ltt-dev mailing list
> > ltt-dev at lists.casi.polymtl.ca
> > http://lists.casi.polymtl.ca/cgi-bin/mailman/listinfo/ltt-dev
> > 
> 
> -- 
> Mathieu Desnoyers
> Operating System Efficiency R&D Consultant
> EfficiOS Inc.
> http://www.efficios.com