[lttng-dev] URCU and pthread mutex/cond variable hang

Jeff Layton jlayton at poochiereds.net
Tue Mar 10 15:25:39 EDT 2015


On Tue, 10 Mar 2015 00:43:09 +0000 (UTC)
Mathieu Desnoyers <mathieu.desnoyers at efficios.com> wrote:

> ----- Original Message -----
> > From: "Jeff Layton" <jlayton at poochiereds.net>
> > To: lttng-dev at lists.lttng.org
> > Sent: Tuesday, March 3, 2015 3:40:14 PM
> > Subject: [lttng-dev] URCU and pthread mutex/cond variable hang
> > 
> > I've been using urcu to develop some userland code, and I've run into a
> > problem that I don't quite understand. If I register a thread and then
> > have that thread block on a pthread condition variable, then it seems
> > to cause synchronize_rcu in another thread to hang.
> > 
> > I've attached a testcase that demonstrates the problem. You have to
> > build it and link it against -lpthread and -lurcu-qsbr:
> > 
> >     $ gcc -Wall -o ./urcu_hang urcu_hang.c -lurcu-qsbr -lpthread
> > 
> > ...it will run fine. If you comment out the rcu_thread_offline/online
> > calls though, it will hang.
> > 
> > Why? Is this expected behavior or a bug in urcu?
> 
> This is because you are using the urcu QSBR flavor. For this
> flavor, the default state of a registered thread is to be
> within an RCU read-side critical section, thus to block
> synchronize_rcu() until the next rcu_quiescent_state() call
> or until the next extended quiescent state (thread offline).
> 

(facepalm)

Ahh ok. I totally missed that bit, but it makes sense now that you've
pointed it out. So does that mean that rcu_read_lock/unlock are no-ops
with QSBR?
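
The rule described above can be pictured with a toy bookkeeping model
in C. This is only a sketch of the idea, not liburcu's implementation:
every toy_* name is made up for illustration, and it tracks state in a
single thread with no real synchronization.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy bookkeeping model of QSBR; an illustration, NOT liburcu code. */
#define TOY_MAX_READERS 4

struct toy_reader {
    bool registered;
    bool online;            /* offline = extended quiescent state */
    unsigned long seen_gp;  /* last grace period acknowledged by this reader */
};

static struct toy_reader toy_readers[TOY_MAX_READERS];
static unsigned long toy_current_gp = 1;

/* A newly registered reader is online, so it is treated as if it were
 * inside a read-side critical section until it says otherwise. */
static void toy_register(int id)
{
    assert(id >= 0 && id < TOY_MAX_READERS);
    toy_readers[id] = (struct toy_reader){
        .registered = true, .online = true, .seen_gp = 0
    };
}

/* The reader announces that it holds no RCU-protected references. */
static void toy_quiescent_state(int id)
{
    toy_readers[id].seen_gp = toy_current_gp;
}

/* Offline readers are skipped entirely by the grace-period check. */
static void toy_thread_offline(int id)
{
    toy_readers[id].online = false;
}

/* Coming back online counts as a fresh quiescent state. */
static void toy_thread_online(int id)
{
    toy_readers[id].online = true;
    toy_readers[id].seen_gp = toy_current_gp;
}

/* The writer side opens a new grace period. */
static void toy_start_grace_period(void)
{
    toy_current_gp++;
}

/* True while some registered, online reader has not yet acknowledged
 * the current grace period; a real writer would keep waiting. */
static bool toy_grace_period_blocked(void)
{
    for (int i = 0; i < TOY_MAX_READERS; i++) {
        const struct toy_reader *r = &toy_readers[i];
        if (r->registered && r->online && r->seen_gp != toy_current_gp)
            return true;
    }
    return false;
}
```

With one reader registered, toy_grace_period_blocked() stays true after
toy_start_grace_period() until that reader calls toy_quiescent_state()
or toy_thread_offline(), which mirrors the hang in the testcase.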

> Therefore, you need to take care to put the threads in
> offline mode before you issue blocking operations that
> depend on completion of synchronize_rcu() to proceed further,
> which is the case of test_rcu() in your test program. Otherwise
> you create a deadlock, where main() is in an RCU read-side critical
> section while it blocks waiting on the pthread cond var.
> Unfortunately, the test_rcu() thread is unable to issue
> the pthread cond signal, because it is blocked in synchronize_rcu(),
> which in turn waits for main() to leave its RCU critical section.
> 

Yep, I've started doing that and everything is working well, but I'm a
little worried that I could eventually end up missing someplace and hang
everything. :)
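
One way to make a missed spot less likely might be to route blocking
waits through a single helper that goes offline first. A minimal
sketch, assuming hypothetical names throughout (wait_flag,
wait_flag_offline, the demo_* hooks); it is not taken from the attached
testcase. With urcu-qsbr the two hooks would be thin wrappers around
rcu_thread_offline() and rcu_thread_online().

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical hook type: with urcu-qsbr these would be thin wrappers
 * around rcu_thread_offline() and rcu_thread_online(). */
typedef void (*rcu_hook_fn)(void);

struct wait_flag {
    pthread_mutex_t lock;
    pthread_cond_t cond;
    int ready;                  /* predicate, protected by lock */
};

/* Block until f->ready is set, staying offline for the whole wait so
 * that a concurrent grace period is not held up by this thread. */
static void wait_flag_offline(struct wait_flag *f,
                              rcu_hook_fn go_offline,
                              rcu_hook_fn go_online)
{
    assert(go_offline && go_online);

    go_offline();               /* no RCU-protected pointers held past here */
    pthread_mutex_lock(&f->lock);
    while (!f->ready)           /* loop guards against spurious wakeups */
        pthread_cond_wait(&f->cond, &f->lock);
    pthread_mutex_unlock(&f->lock);
    go_online();                /* back online before touching RCU data */
}

/* Stand-in hooks that only record state, so the sketch builds
 * without liburcu installed. */
static int demo_online = 1;     /* a registered QSBR thread starts online */
static int demo_transitions = 0;

static void demo_go_offline(void)
{
    demo_online = 0;
    demo_transitions++;
}

static void demo_go_online(void)
{
    demo_online = 1;
    demo_transitions++;
}
```

The signalling side would set ready under the same mutex and then call
pthread_cond_signal(); it needs no change for this to work.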

> Another alternative would be to use the other URCU flavors such
> as urcu-mb, urcu-signal or urcu-membarrier. Their default state
> is to be in an RCU quiescent state, which is IMHO more intuitive
> for the users. QSBR is the fastest flavor, but it comes at
> the expense of a somewhat more complex API.
> 

Yes, I may consider doing that instead.

> Hoping this explanation helps,
> 

It does -- many thanks!

-- 
Jeff Layton <jlayton at poochiereds.net>


