[ltt-dev] [RELEASE] Userspace RCU 0.3.0

Mathieu Desnoyers mathieu.desnoyers at polymtl.ca
Tue Nov 3 11:53:14 EST 2009

* Paul E. McKenney (paulmck at linux.vnet.ibm.com) wrote:
> On Tue, Nov 03, 2009 at 10:02:34AM -0500, Mathieu Desnoyers wrote:
> > Hi everyone,
> > 
> > I released userspace RCU 0.3.0, which includes a small API change for
> > the "deferred work" interface. After discussion with Paul, I decided to
> > drop support for call_rcu() and provide only defer_rcu(), so that I
> > don't offer an API with the same name as the kernel RCU but with
> > different arguments and semantics. Calling call_rcu() now generates
> > the following linker error:
> > 
> > file.c:240: undefined reference to 
> >    `__error_call_rcu_not_implemented_please_use_defer_rcu'
> > 
> > Note that defer_rcu() should *not* be used in RCU read-side C.S.,
> > because it calls synchronize_rcu() if the queue is full. This is a major
> > distinction from call_rcu(). (Note to self: eventually we should add
> > some self-check code to detect defer_rcu() nested within RCU read-side
> > C.S.)
> > 
> > I plan to eventually implement a proper call_rcu() within the userspace
> > RCU library. It's not, however, a short-term need for me at the moment.
> I can tell that we need to get you going on some real-time work.  ;-)
> 
> (Sorry, but I really couldn't resist!)

It's true that this becomes important when real-time behavior is required
at the call_rcu() execution site. However, even typical use of
call_rcu() has limitations in this area: in a scenario where the
struct rcu_head passed to call_rcu() is allocated dynamically, kmalloc
and friends do not offer any kind of wait-free/lock-free guarantees. In
effect, call_rcu() pushes the RT burden onto the original struct
rcu_head allocation. But I agree that it makes out-of-memory/queue-full
error handling much easier, because all the allocation is done at a
single site.

The main disadvantage of the call_rcu() approach, though, is that I
cannot see any clean way to rate-limit call_rcu() on a per-CPU basis.
This would basically imply that we have to stop providing an RT
call_rcu() at some point to ensure we do not go over a certain
threshold.
A possible solution would be to make call_rcu() return an error when it
goes over some threshold. The caller would have to deal with the error,
possibly by rejecting the whole operation (so another CPU/cloud node
could take over the work). This seems cleaner than transparently
delaying execution at the call_rcu() site: the caller can decide
whether to reject the whole operation or to delay its execution.


> 							Thanx, Paul

Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68
