[lttng-dev] [rp] [RFC] Readiness for URCU release with RCU lock-free hash table

Mathieu Desnoyers mathieu.desnoyers at efficios.com
Mon May 7 12:10:55 EDT 2012

* Paul E. McKenney (paulmck at linux.vnet.ibm.com) wrote:
> On Fri, May 04, 2012 at 12:53:12PM -0400, Mathieu Desnoyers wrote:
> Just to make sure I understand -- the reason that the "del" functions
> say "no memory barrier" instead of "acts like rcu_dereference()" is
> that the "del" functions don't return anything.
> > @@ -391,6 +413,7 @@ int cds_lfht_del(struct cds_lfht *ht, struct cds_lfht_node *node);
> >   * function.
> >   * Call with rcu_read_lock held.
> >   * Threads calling this API need to be registered RCU read-side threads.
> > + * This function does not issue any memory barrier.
> >   */

One more question about the "del" memory ordering semantic. Following

commit db00ccc36e7fb04ce8044fb1be7964acd1de6ae0
Author: Mathieu Desnoyers <mathieu.desnoyers at efficios.com>
Date:   Mon Dec 19 16:45:51 2011 -0500

    rculfhash: Relax atomicity guarantees required by removal operation

    The atomicity guarantees for the removal operation do not need to be as
    strict as a cmpxchg. Use a uatomic_set followed by a xchg on a newly
    introduced "REMOVAL_OWNER_FLAG" to perform removal.

    Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers at efficios.com>

The "del" operation is performed in two steps:

1 - uatomic_or(), which sets the "REMOVED" flag (the actual tombstone)
    unconditionally into the node's next pointer.
2 - uatomic_xchg(), which atomically exchanges the old pointer with
    its current value (read) or'd with the REMOVAL_OWNER flag. The trick
    is that if the xchg returns a pointer with the REMOVAL_OWNER flag
    set, it means we are not the first thread to set this flag, so we
    should not free the node. However, if xchg returns a node without
    the REMOVAL_OWNER flag set, we are indeed the first to set it, so
    we should call free.

Now regarding memory ordering semantics, should we consider the atomic
action of "del" to apply when the "or" is called, or when the "xchg" is
called ? Or should we simply document that the "del" effect on the node
happens in two separate steps ?

The way I see it, the actual effect of removal, as seen from RCU read
traversal and lookup point of view, is observed as soon as the "REMOVED"
tombstone is set, so I would think that the atomic publication of the
removal is performed by the "or".

However, we ensure full memory barriers around "xchg", but not usually 
around "or". Therefore, the current implementation does not issue a 
memory barrier before the "or", so we should either change our planned
memory barrier documentation, or the implementation, to match. This
would probably require creation of a cmm_smp_mb__before_uatomic_or(), so
x86 does not end up issuing a useless memory barrer.

Thoughts ?



Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.

More information about the lttng-dev mailing list