[lttng-dev] [PATCH liburcu] Fix pthread_atfork() behaviour
Mathieu Desnoyers
mathieu.desnoyers at efficios.com
Thu Apr 17 08:43:19 EDT 2014
----- Original Message -----
> From: "Mathieu Desnoyers" <mathieu.desnoyers at efficios.com>
> To: "Keir Fraser" <keir at cohodata.com>
> Cc: lttng-dev at lists.lttng.org, "Paul E. McKenney" <paulmck at linux.vnet.ibm.com>
> Sent: Thursday, April 17, 2014 8:20:00 AM
> Subject: Re: [lttng-dev] [PATCH liburcu] Fix pthread_atfork() behaviour
>
> ----- Original Message -----
> > From: "Keir Fraser" <keir at cohodata.com>
> > To: lttng-dev at lists.lttng.org
> > Sent: Monday, April 14, 2014 9:31:57 AM
> > Subject: [lttng-dev] [PATCH liburcu] Fix pthread_atfork() behaviour
> >
> > In the process of integrating liburcu into a multi-threaded codebase
> > with fork()s I found a couple of problems with liburcu that I could not
> > work around without fixing the library. Hence I present the two required
> > fixes here (as attachments, sorry!) with some background info about them.
>
> This is interesting!
>
> An initial question: Which URCU flavor are you using ?
>
> >
> > After fork() the child process has no pthreads but the one that called
> > fork(). Unfortunately call_rcu_after_fork_child() does not update URCU's
> > thread registry to reflect this -- if fork() is called with any threads
> > registered with URCU then the child process will inherit a corrupted
> > registry containing a linked list through per-thread TLS state which is
> > no longer valid allocated memory.
>
> You appear to be using the mb/memb URCU flavor.
>
> If we look at URCU README file:
>
> Interaction with fork()
>
> Special care must be taken for applications performing fork() without
> any following exec(). This is caused by the fact that Linux only
> clones
> the thread calling fork(), and thus never replicates any of the other
> parent thread into the child process. Most liburcu implementations
> require that all registrations (as reader, defer_rcu and call_rcu
> threads) should be released before a fork() is performed, except for
> the
> rather common scenario where fork() is immediately followed by exec()
> in
> the child process. The only implementation not subject to that rule
> is
> liburcu-bp, which is designed to handle fork() by calling
> rcu_bp_before_fork, rcu_bp_after_fork_parent and
> rcu_bp_after_fork_child.
>
> Applications that use call_rcu() and that fork() without
> doing an immediate exec() must take special action. The parent
> must invoke call_rcu_before_fork() before the fork() and
> call_rcu_after_fork_parent() after the fork(). The child
> process must invoke call_rcu_after_fork_child().
> Even though these three APIs are suitable for passing to
> pthread_atfork(), use of pthread_atfork() is *STRONGLY
> DISCOURAGED* for programs calling the glibc memory allocator
> (malloc(), calloc(), free(), ...) within call_rcu callbacks.
> This is due to limitations in the way glibc memory allocator
> handles calls to the memory allocator from concurrent threads
> while the pthread_atfork() handlers are executing.
> Combining e.g.:
> * call to free() from callbacks executed within call_rcu worker
> threads,
> * executing call_rcu atfork handlers within the glibc pthread
> atfork mechanism,
> will sometimes trigger interesting process hangs. This usually
> hangs on a memory allocator lock within glibc.
>
>
> > Crash or hang soon after is the
> > result. Patch 1 therefore simply clears the registry list in the child
> > process. Caveats here are that (a) the calling thread cannot be
> > registered (it must unregister/re-register itself in the atfork
> > handlers); and (b) some flavours of URCU may have more complex
> > registries than a simple linked list and so this patch may not be
> > sufficient for those. I only tested the memb/mb flavour myself.
>
> Firstly, your use-case seems to be only supported by the urcu-bp
> flavor, and explicitly unsupported by other flavors. If you want
> to do this with other urcu flavors, you need to explicitly
> unregister all your other RCU threads before doing the fork().
>
> Also, if we want to eventually support this in the other flavors,
> we would need to implement the flavor-specific teardown within each
> flavor implementation rather than in the call_rcu implementation.
>
> Moreover, as stated in the README file, call_rcu() pre/post fork
> handlers don't behave well with glibc's pthread_atfork(), due to
> assumptions done within the glibc memory allocator.
>
> >
> > A second problem is that although call_rcu threads are paused across
> > fork(), the handshaking PAUSED flag is not cleared when their execution
> > resumes. Hence a second fork() invocation in the original parent process
> > will not spin-wait for call_rcu threads to quiesce (as the atfork
> > handler will observe all PAUSED flags already set). Patch 2 fixes this
> > with the appropriate clearing handshake on resume, post-fork.
>
> I'd be very interested to see a test-case for this problem against
> the urcu-bp flavor. I think you have indeed caught a real bug here.
>
> >
> > Please feel free to modify or rewrite these patches, or solve the
> > described problems in a different way, as you see fit! Cc me on replies
> > as I am not a subscriber to this list.
>
> Could you provide a small test-case that we could add to the call_rcu
> regression (double fork) under tests/regression to make sure this behavior
> is covered in the future ? A small refactoring of test_urcu_fork.c to
> extend it should do the trick.
I'm currently doing the test_urcu_fork.c refactoring, and testing your
patch for call_rcu and double-fork. (letting you know so we don't
duplicate the effort)
Thanks,
Mathieu
>
> Thanks!
>
> Mathieu
>
> >
> > Regards,
> > Keir Fraser
> >
> > _______________________________________________
> > lttng-dev mailing list
> > lttng-dev at lists.lttng.org
> > http://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev
> >
>
> --
> Mathieu Desnoyers
> EfficiOS Inc.
> http://www.efficios.com
>
> _______________________________________________
> lttng-dev mailing list
> lttng-dev at lists.lttng.org
> http://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev
>
--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
More information about the lttng-dev
mailing list