[ltt-dev] [rp] [RFC] URCU concurrent data structure API
Paul E. McKenney
paulmck at linux.vnet.ibm.com
Wed Aug 17 17:23:52 EDT 2011
On Wed, Aug 17, 2011 at 12:40:39PM -0400, Mathieu Desnoyers wrote:
> Hi,
>
> I'm currently trying to find a good way to present the cds_ data
> structure APIs within URCU for data structures depending on RCU for
> their synchronization. The main problem is that we have many flavors of
> rcu_read_lock/unlock and call_rcu to deal with.
>
> Various approaches are possible:
>
> 1) The current approach: require that the callers pass call_rcu as
> parameter to data structure init functions, and require that the
> callers hold rcu_read_lock across API invocation.
>
> downsides: holds rcu read lock across busy-waiting loops (for longer
> than actually needed). Passing call_rcu as parameter and putting
> requirements on the lock held when calling the API complexify the API,
> and makes it impossible to inline call_rcu invocations.
The function-call overhead for call_rcu() should not be a big deal.
I am not all that concerned about an RCU read-side critical section
covering the busy waiting -- my guess is that the busy waiting itself
would become a problem long before the overly long RCU read-side
critical section becomes a problem.
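For illustration, approach #1 might look roughly like the sketch below. It assumes a toy single-threaded stack; every demo_ name is a hypothetical stand-in rather than liburcu API, and the toy call_rcu runs its callback immediately instead of waiting for a grace period. The point is only the shape: the call_rcu variant is chosen once at init time and reached through a function pointer afterwards.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an RCU callback head; not the liburcu type. */
struct demo_rcu_head {
	void (*func)(struct demo_rcu_head *head);
};

/* Type of the deferred-free hook the caller hands to the data structure. */
typedef void (*demo_call_rcu_fn)(struct demo_rcu_head *head,
				 void (*func)(struct demo_rcu_head *head));

struct demo_node {
	struct demo_rcu_head head;
	struct demo_node *next;
	int value;
};

struct demo_stack {
	struct demo_node *top;
	demo_call_rcu_fn call_rcu;	/* chosen once, at init time */
};

static int demo_freed;	/* counts reclaimed nodes, for the demo only */

static void demo_free_node(struct demo_rcu_head *head)
{
	struct demo_node *node = (struct demo_node *)
		((char *)head - offsetof(struct demo_node, head));

	free(node);
	demo_freed++;
}

/* Toy "flavor": runs the callback at once. A real call_rcu() would
 * defer it until a grace period has elapsed. */
static void demo_call_rcu_immediate(struct demo_rcu_head *head,
				    void (*func)(struct demo_rcu_head *head))
{
	func(head);
}

static void demo_stack_init(struct demo_stack *s, demo_call_rcu_fn call_rcu)
{
	assert(call_rcu != NULL);
	s->top = NULL;
	s->call_rcu = call_rcu;
}

static int demo_stack_push(struct demo_stack *s, int value)
{
	struct demo_node *node = malloc(sizeof(*node));

	if (!node)
		return -1;
	node->value = value;
	node->next = s->top;
	s->top = node;
	return 0;
}

/* The caller is assumed to hold the matching read-side lock here. */
static int demo_stack_pop(struct demo_stack *s, int *value)
{
	struct demo_node *node = s->top;

	if (!node)
		return -1;
	s->top = node->next;
	*value = node->value;
	s->call_rcu(&node->head, demo_free_node);	/* indirect call */
	return 0;
}

/* Pushes 1 then 2, pops both, and encodes the pop order as two digits. */
static int demo_option1(void)
{
	struct demo_stack s;
	int first = 0, second = 0;

	demo_freed = 0;
	demo_stack_init(&s, demo_call_rcu_immediate);
	if (demo_stack_push(&s, 1) || demo_stack_push(&s, 2))
		return -1;
	if (demo_stack_pop(&s, &first) || demo_stack_pop(&s, &second))
		return -1;
	return first * 10 + second;
}
```

The only cost visible in this sketch is the indirect call in demo_stack_pop(), which is the function-call overhead discussed above.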
> 2) Require all callers to pass call_rcu *and* rcu_read_lock/unlock as
> parameter to data structure init function.
>
> downsides: both call_rcu and read lock/unlock become function calls
> (slower). Complexify the API.
>
> 3) Don't require caller to pass anything rcu-related to data structure
> init. Would require to compile one instance of each data structure
> per RCU flavor shared object (like we're doing with call_rcu now).
>
> Downside: we would need to ship per-rcu-flavor version of each data
> structure.
>
> Upside: simple API, no rcu read-side lock around busy-waiting loops,
> ability to inline both call_rcu and rcu_read_lock/unlock within the
> data structure handling code.
If we do #3, it is best to make sure that different library functions
making different RCU-flavor choices can be linked into a single program.
More preprocessor tricks, I guess...
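One way such a trick could look is token pasting, so that each flavor's instance of a data-structure function gets its own symbol name and several instances can coexist in one program. The sketch below is an assumption for illustration only: the flavor tags fa and fb and all demo_ names are hypothetical, and the "locks" just count calls so the flavor in use is observable.

```c
#include <assert.h>

/* Two-level paste so that macro arguments are expanded before joining. */
#define DEMO_CAT_(a, b) a##_##b
#define DEMO_CAT(a, b) DEMO_CAT_(a, b)

/* Hypothetical per-flavor read-side primitives (counters only). */
static int demo_lock_calls_fa;
static int demo_lock_calls_fb;

static inline void demo_read_lock_fa(void) { demo_lock_calls_fa++; }
static inline void demo_read_unlock_fa(void) { }
static inline void demo_read_lock_fb(void) { demo_lock_calls_fb++; }
static inline void demo_read_unlock_fb(void) { }

/*
 * One definition of the data-structure code, stamped out per flavor.
 * The lock and unlock calls are direct, so they can be inlined.
 */
#define DEMO_DEFINE_BOX(flavor) \
static int DEMO_CAT(demo_box_read, flavor)(const int *box) \
{ \
	int v; \
	DEMO_CAT(demo_read_lock, flavor)(); \
	v = *box; \
	DEMO_CAT(demo_read_unlock, flavor)(); \
	return v; \
}

DEMO_DEFINE_BOX(fa)	/* defines demo_box_read_fa() */
DEMO_DEFINE_BOX(fb)	/* defines demo_box_read_fb() */

/* Both flavor instances used side by side in the same program. */
static int demo_option3(void)
{
	int x = 40, y = 2;

	return demo_box_read_fa(&x) + demo_box_read_fb(&y);
}
```

In a real library the two instances would presumably be built from one source file into separate per-flavor shared objects, with the distinct symbol names keeping them from colliding at link time.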
> There are probably others, but I think it gives an idea of the main
> scenarios I consider. I start to like (3) more and more, and I'm tempted
> to move to it, but I would really like feedback on this API matter
> before I take any decision.
Actually, #1 seems simpler than #3, and it has few major downsides.
I can imagine the following additional possibilities:
4) Create a table mapping from the call_rcu() variant to the
   corresponding rcu_read_lock() and rcu_read_unlock() variants. If busy
   waiting is required, look up the rcu_read_lock() and rcu_read_unlock()
   variants and then call them. (If you are busy waiting anyway,
   who cares about a bit of extra overhead?)

   I don't see this as a reasonable initial alternative, but it would
   be a decent place to migrate to if starting from #1 and the long
   RCU read-side critical sections did somehow turn out to be a problem.

5) Like #4, but map to a function that does rcu_read_unlock() followed
   immediately by rcu_read_lock().

6) Like #1, but make the caller deal with deallocation. Then the caller
   gets to select which of the call_rcu() variants should be used.
   (Yes, there might be a reason why this is a bad idea; I do need to
   go review the implementations.)
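As a rough sketch of #6, the removal function could simply unlink the node and hand it back, leaving reclamation to the caller. Everything below is hypothetical (a toy single-threaded list, demo_ names that are not liburcu API), and demo_caller_reclaim() stands in for whichever call_rcu() variant the caller would pick.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct demo_item {
	struct demo_item *next;
	int key;
};

struct demo_list {
	struct demo_item *head;
};

static int demo_list_add(struct demo_list *l, int key)
{
	struct demo_item *it = malloc(sizeof(*it));

	if (!it)
		return -1;
	it->key = key;
	it->next = l->head;
	l->head = it;
	return 0;
}

/* Unlinks the first item with this key and returns it WITHOUT freeing. */
static struct demo_item *demo_list_remove(struct demo_list *l, int key)
{
	struct demo_item **pp = &l->head;

	while (*pp) {
		struct demo_item *it = *pp;

		if (it->key == key) {
			*pp = it->next;
			return it;
		}
		pp = &it->next;
	}
	return NULL;
}

static int demo_reclaimed;	/* counts caller-side reclamations */

/* Caller-side policy; a real caller would use its call_rcu() variant. */
static void demo_caller_reclaim(struct demo_item *it)
{
	free(it);
	demo_reclaimed++;
}

/* Adds three items, removes one, and returns how many remain. */
static int demo_option6(void)
{
	struct demo_list l = { NULL };
	struct demo_item *it;
	int remaining = 0;

	demo_reclaimed = 0;
	if (demo_list_add(&l, 1) || demo_list_add(&l, 2) || demo_list_add(&l, 3))
		return -1;

	it = demo_list_remove(&l, 2);
	assert(it != NULL && it->key == 2);
	demo_caller_reclaim(it);	/* the caller picks the flavor here */

	while (l.head) {		/* plain teardown of what is left */
		it = l.head;
		l.head = it->next;
		free(it);
		remaining++;
	}
	return remaining;
}
```

The data structure never sees call_rcu() at all in this shape, which is the attraction; whether the real implementations can hand nodes back this way is exactly the open question.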
If #6 is impractical, I still like #1 with either #4 or #5 as fallbacks.
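To make the #4 and #5 fallbacks concrete, here is a minimal sketch assuming two hypothetical flavors, fa and fb. The table is keyed by the call_rcu() variant passed at init time; all demo_ names are stand-ins, the "locks" only track nesting depth, and the busy wait is simulated by a fixed number of polls.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*demo_call_rcu_fn)(void *obj, void (*func)(void *obj));

/* One table row per flavor, keyed by that flavor's call_rcu variant. */
struct demo_flavor_ops {
	demo_call_rcu_fn call_rcu;
	void (*read_lock)(void);
	void (*read_unlock)(void);
};

static int demo_depth_fa, demo_unlocks_fa, demo_calls_fa;
static int demo_depth_fb, demo_unlocks_fb, demo_calls_fb;

static void demo_read_lock_fa(void) { demo_depth_fa++; }
static void demo_read_unlock_fa(void) { demo_depth_fa--; demo_unlocks_fa++; }
static void demo_call_rcu_fa(void *obj, void (*func)(void *obj))
{
	demo_calls_fa++;
	func(obj);	/* toy: no grace period */
}

static void demo_read_lock_fb(void) { demo_depth_fb++; }
static void demo_read_unlock_fb(void) { demo_depth_fb--; demo_unlocks_fb++; }
static void demo_call_rcu_fb(void *obj, void (*func)(void *obj))
{
	demo_calls_fb++;
	func(obj);	/* toy: no grace period */
}

static const struct demo_flavor_ops demo_flavor_table[] = {
	{ demo_call_rcu_fa, demo_read_lock_fa, demo_read_unlock_fa },
	{ demo_call_rcu_fb, demo_read_lock_fb, demo_read_unlock_fb },
};

/* Option 4: find the lock/unlock pair matching a call_rcu variant. */
static const struct demo_flavor_ops *demo_flavor_lookup(demo_call_rcu_fn key)
{
	size_t i;
	size_t n = sizeof(demo_flavor_table) / sizeof(demo_flavor_table[0]);

	for (i = 0; i < n; i++)
		if (demo_flavor_table[i].call_rcu == key)
			return &demo_flavor_table[i];
	return NULL;
}

/* Option 5: unlock followed immediately by lock. */
static void demo_read_relax(const struct demo_flavor_ops *ops)
{
	ops->read_unlock();
	ops->read_lock();
}

/* Simulated busy wait; returns the number of polls, or -1 if unknown. */
static int demo_busy_wait(demo_call_rcu_fn key, int polls_until_ready)
{
	const struct demo_flavor_ops *ops = demo_flavor_lookup(key);
	int polls;

	if (!ops)
		return -1;
	for (polls = 0; polls < polls_until_ready; polls++)
		demo_read_relax(ops);	/* brief exit between polls */
	return polls;
}

static int demo_option45(void)
{
	int polls;

	demo_read_lock_fb();		/* caller's critical section */
	polls = demo_busy_wait(demo_call_rcu_fb, 3);
	demo_read_unlock_fb();
	return polls;
}
```

Because RCU read-side critical sections nest, the unlock/lock pair only really leaves the critical section when the caller's nesting depth is one, as it is in this demo.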
So, what am I missing?
Thanx, Paul