[lttng-dev] my user space rcu code

Mathieu Desnoyers mathieu.desnoyers at efficios.com
Thu Feb 7 11:20:51 EST 2013


* 赵宇龙 (zylthinking at gmail.com) wrote:
> Hi,
> 
> I wrote a user-space RCU; the code is at
> https://github.com/zylthinking/tools/blob/master/rcu.h and
> https://github.com/zylthinking/tools/blob/master/rcu.c.
> 
> I notice the main difference from liburcu is that I use a daemon
> thread to do work such as waking up sleepers, while liburcu does not.
> 
> I wonder why we can't use such a thread. With it, rcu_read_(un)lock
> would no longer need to wake up writers, which should help
> performance. Is the cost of such a daemon thread too high?
> 
> zhao yulong

Hi Zhao,

The main reason for having the "wakeup writer" present in the
rcu_read_unlock() path in liburcu is energy efficiency: we don't want
any thread to consume power unless it really has to. For instance,
endless busy-waiting on a variable is avoided.

This is why we rely on sys_futex to wake up awaiting writers from
rcu_read_unlock(). AFAIU, your rcu_daemon() thread is always active, and
even though it calls "sched_yield()" to be nice to others, it will keep
one CPU always powered on.
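
To make that concrete, here is a minimal sketch of the idea (not the
actual liburcu code; the names gp_futex and wait_for_readers are made
up for illustration): the writer parks itself on a futex word in the
kernel instead of keeping any thread spinning.

#include <linux/futex.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <stdint.h>

static int32_t gp_futex;	/* 0: nobody waiting, -1: a writer is waiting */

static void wait_for_readers(void)
{
	/* Announce that a writer is waiting... */
	__atomic_store_n(&gp_futex, -1, __ATOMIC_SEQ_CST);
	/*
	 * ...then sleep in the kernel until a reader wakes us up. A real
	 * implementation re-checks the reader state here before sleeping,
	 * to avoid missing a wakeup.
	 */
	syscall(SYS_futex, &gp_futex, FUTEX_WAIT, -1, NULL, NULL, 0);
}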

One important thing to notice is that liburcu's rcu_read_unlock() only
calls sys_futex if a writer is waiting. Therefore, in a scenario with
frequent reads and infrequent updates, rcu_read_unlock() only pays the
cost of a load, a test and a branch, skipping the futex wake call
entirely.
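
Roughly (again an illustrative sketch, not the urcu source; gp_futex is
the same made-up futex word as above), the read-side exit path looks
like this:

#include <linux/futex.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <stdint.h>

static int32_t gp_futex;	/* set to -1 by a waiting writer, see above */

static void wake_up_waiting_writer(void)
{
	/* Fast path: a load, a test and a branch. No syscall if nobody waits. */
	if (__atomic_load_n(&gp_futex, __ATOMIC_SEQ_CST) == -1) {
		__atomic_store_n(&gp_futex, 0, __ATOMIC_SEQ_CST);
		syscall(SYS_futex, &gp_futex, FUTEX_WAKE, 1, NULL, NULL, 0);
	}
}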

Another reason for not going for a worker thread to handle wait/wakeup
between synchronize_rcu and rcu_read_unlock is to minimize impact on the
application process/threading model. This is especially true for
applications that rely on fork() _not_ followed by exec(): Linux
actually copies a single thread of the parent (the one executing
fork()), and discards all other threads. Therefore, we must be aware
that adding an in-library thread will require users to handle the
fork()-not-followed-by-exec() case carefully. Since we had no other
choice, we rely on worker threads for call_rcu, but we don't use worker
threads for the "simpler" use-case of synchronize_rcu().
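
For illustration only (these handler names are hypothetical, not the
liburcu API), an application or library that mixes an internal worker
thread with fork()-not-followed-by-exec() typically has to register
hooks along these lines:

#include <pthread.h>

/* Hypothetical hooks; a real library would flesh these out. */
static void pause_worker(void)		/* parent, just before fork() */
{
	/* e.g. ask the worker thread to quiesce and join it */
}

static void resume_worker_parent(void)	/* parent, just after fork() */
{
	/* e.g. restart or resume the worker thread */
}

static void recreate_worker_child(void)	/* child, just after fork() */
{
	/* the child inherited only the forking thread: start a fresh worker */
}

static void setup_fork_handling(void)
{
	pthread_atfork(pause_worker, resume_worker_parent,
		       recreate_worker_child);
}

This is exactly the kind of extra burden we only accept where there is
no other choice, as with call_rcu.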

A third reason for directly waking up the writer thread rather than
having a worker thread dispatching this information is speed. Given the
power efficiency constraints expressed above, we would have to issue one
system call from the rcu_read_unlock() site to wake up the rcu_daemon()
thread (so it does not have to busy-wait), and another system call to
wake up the writer, involving a third thread in what should really
involve only two threads. This indirection would therefore add overhead
to the signalling by requiring the scheduler to perform one extra
context switch, and could involve extra communication between
processors, since rcu_daemon() would likely execute on a different CPU
and would have to bring in cache lines from other processors.
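
Sketched with made-up futex words (this is the design being argued
against, not anything in liburcu), the daemon-mediated path would look
like:

#include <linux/futex.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <stdint.h>

static int32_t daemon_futex;	/* reader -> daemon signalling word */
static int32_t writer_futex;	/* daemon -> writer signalling word */

static void reader_exit_path(void)
{
	/* System call #1: wake the daemon so it need not busy-wait. */
	__atomic_store_n(&daemon_futex, 1, __ATOMIC_SEQ_CST);
	syscall(SYS_futex, &daemon_futex, FUTEX_WAKE, 1, NULL, NULL, 0);
}

static void rcu_daemon_iteration(void)
{
	/* Extra context switch: the daemon wakes up, likely on another
	 * CPU, pulling the readers' cache lines over... */
	syscall(SYS_futex, &daemon_futex, FUTEX_WAIT, 0, NULL, NULL, 0);
	__atomic_store_n(&daemon_futex, 0, __ATOMIC_SEQ_CST);
	/* ...only to issue system call #2 to finally wake the writer. */
	__atomic_store_n(&writer_futex, 1, __ATOMIC_SEQ_CST);
	syscall(SYS_futex, &writer_futex, FUTEX_WAKE, 1, NULL, NULL, 0);
}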

Finally, let's discuss the real-time aspect. For RT, we ideally want
wait-free rcu_read_lock/unlock. Indeed, having a sys_futex wakeup call
in rcu_read_unlock() could arguably be seen as making the unlock path
less strictly wait-free (in case you are concerned about the internal
implementation of sys_futex wake not being entirely wait-free).
Currently, on Linux, one way to change the behavior of rcu_read_unlock()
to make it even more RT-friendly (in case you are concerned about using
sys_futex() on a pure RT thread) is to undefine CONFIG_RCU_HAVE_FUTEX,
thus changing the behavior of urcu/futex.h and compat_futex.c. The
futex_async() call will then do busy-waiting on the FUTEX_WAIT side
(waiting 10us between attempts), and do exactly _nothing_ on the wake-up
side, which is certainly wait-free. This will be less energy-efficient,
of course, but will provide a strictly wait-free rcu_read_unlock().
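
In simplified form (a sketch of the behavior described above, not the
literal compat_futex.c code), the non-futex fallback amounts to:

#include <stdint.h>
#include <unistd.h>

static void compat_futex_wait(int32_t *uaddr, int32_t val)
{
	/* Poll the futex word, sleeping about 10us between attempts. */
	while (__atomic_load_n(uaddr, __ATOMIC_SEQ_CST) == val)
		usleep(10);
}

static void compat_futex_wake(int32_t *uaddr, int nr_wake)
{
	/* Nothing to do: waiters poll on their own, so the wake side is
	 * trivially wait-free. */
	(void) uaddr;
	(void) nr_wake;
}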

We might want to consider creating a liburcu-rt.so for real-time
use-cases that prefer the non-energy-efficient wait, along with the
strictly wait-free rcu_read_unlock(). Thoughts ?

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com


