[lttng-dev] [rp] [RFC] Readiness for URCU release with RCU lock-free hash table
Mathieu Desnoyers
mathieu.desnoyers at efficios.com
Fri May 4 12:53:12 EDT 2012
* Paul E. McKenney (paulmck at linux.vnet.ibm.com) wrote:
> On Thu, May 03, 2012 at 01:13:30PM -0400, Mathieu Desnoyers wrote:
> > * Paul E. McKenney (paulmck at linux.vnet.ibm.com) wrote:
[...]
> > >
> > > A write barrier would be sufficient in the case where there were only
> > > two threads observing each other. A full memory barrier would be needed
> > > to prevent the assertion from firing in this sort of case (not sure that
> > > this is exactly right, but something like this):
> > >
> > > Initial contents: B, C
> > >
> > > T0: add A; del B
> > > T1: if (!lookup B) { add B; del C }
> > > T2: r1 = lookup C; smp_mb(); r2 = lookup A
> > >
> > > assert(lookup C || lookup A);
> >
> > What you are bringing here as counter-example is, I think, transitivity.
>
> Yep!
>
> > I'm trying to figure out how your example could fail, and I cannot see
> > how. The closest I get to a failing scenario is detailed below, but it
> > does not fail. After that, I'm proposing a different scenario, which I
> > think will be more appropriate for the current topic.
> >
> > Attempted detail of your scenario:
> >
> > T0                        T1                        T2
> >
> > add A
> > wmb
> > (add A globally
> >  observable)
>
> Almost. All that this really guarantees is that if someone sees
> the "add B", they will also see the "add A".
>
> > del B
> > (del B globally
> >  observable)
> >                           (add A NOT brought
> >                            into cache)
> >                           (del B brought into
> >                            cache)
> >                           read B
> >                           (test false)
> >                           add B
> >                           wmb
> >                           (add B globally
> >                            observable)
> >                           del C
> >                           (del C globally
> >                            observable)
>
> Here, if someone sees the "del C", they will see the "add B", and
> they also will have lost the opportunity to modify B before T1
> reads from it and modifies it.
>
> >                                                     (add A NOT brought
> >                                                      into cache)
> >                                                     (del C brought into
> >                                                      cache)
> >                                                     read C -> not there.
> >                                                     mb
> >                                                     (add A brought
> >                                                      into cache)
> >                                                     read A -> there -> success.
>
> So we see that C is not there. We know that B would be there if
> we looked at it. But we don't look at B, we look at A. But the
> ordering back to T0's "add A" requires transitivity, which wmb
> does not guarantee.
OK, got it!
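
Just to make sure I map your scenario correctly onto our API, here is
roughly how I picture it (a sketch only: ht, the hash values, the match
function, keys and node variables are made up, declarations and error
handling are elided, and each fragment runs in its own registered RCU
reader thread):

	/* T0 */
	rcu_read_lock();
	cds_lfht_add(ht, hash_a, &node_a);                 /* add A */
	(void) cds_lfht_del(ht, init_node_b);              /* del B */
	rcu_read_unlock();

	/* T1 */
	rcu_read_lock();
	cds_lfht_lookup(ht, hash_b, match, key_b, &iter);
	if (!cds_lfht_iter_get_node(&iter)) {              /* !lookup B */
		cds_lfht_add(ht, hash_b, &node_b);         /* add B */
		(void) cds_lfht_del(ht, init_node_c);      /* del C */
	}
	rcu_read_unlock();

	/* T2 */
	rcu_read_lock();
	cds_lfht_lookup(ht, hash_c, match, key_c, &iter);
	r1 = !!cds_lfht_iter_get_node(&iter);              /* lookup C */
	cmm_smp_mb();
	cds_lfht_lookup(ht, hash_a, match, key_a, &iter);
	r2 = !!cds_lfht_iter_get_node(&iter);              /* lookup A */
	rcu_read_unlock();

	assert(r1 || r2);

If the updates only imply write barriers, both lookups in T2 can fail,
precisely because of the missing transitivity; with full barriers around
the update commits, the assertion holds.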
>
> > If I look at the "transitivity" section in Linux memory-barriers.txt, I
> > notice that the example mainly uses read barriers around loads rather
> > than general barriers. Let's see if I can modify that example to come
> > up with an error case:
> >
> > Initial content: empty
> >
> > T0: add X
> > T1: r1 = lookup X; smp_mb; r2 = lookup Y
> > T2: add Y; r3 = lookup X
> >
> > assert( !(r1 && !r2 && !r3) )
> >
> > The key thing here is that if the barrier in T2 after "add Y" is a
> > smp_wmb rather than a smp_mb, this could allow the "r3 = lookup X" to be
> > reordered before add Y, thus allowing the assertion to fail.
>
> Your example is simpler, and demonstrates the need just as well, so
> let's go with your example.
>
> > I think it would be more intuitive for users if lookups vs updates
> > performed on the same thread are ordered with full memory barriers.
> > Given that we don't want to add extra barriers in read operations, it
> > would make sense to guarantee full memory barriers before and after
> > updates.
> >
> > So how about we use full memory barriers before and after each of: add,
> > del (success), add_unique (success), replace, and add_replace ? If we
> > ever want to relax those ordering guarantees, then we can always add new
> > update APIs with a "weaker" ordering.
> >
> > Thoughts ?
>
> That line of reasoning makes a lot of sense to me!
Sounds good.
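
To spell out what this buys us on the simpler example above, here is
roughly how it maps onto the API (again only a sketch, with made-up ht,
hashes, match function, keys and nodes, declarations elided, each
fragment running in a registered RCU reader thread):

	/* T0 */
	rcu_read_lock();
	cds_lfht_add(ht, hash_x, &node_x);                 /* add X */
	rcu_read_unlock();

	/* T1 */
	rcu_read_lock();
	cds_lfht_lookup(ht, hash_x, match, key_x, &iter);
	r1 = !!cds_lfht_iter_get_node(&iter);              /* lookup X */
	cmm_smp_mb();
	cds_lfht_lookup(ht, hash_y, match, key_y, &iter);
	r2 = !!cds_lfht_iter_get_node(&iter);              /* lookup Y */
	rcu_read_unlock();

	/* T2 */
	rcu_read_lock();
	cds_lfht_add(ht, hash_y, &node_y);                 /* add Y */
	cds_lfht_lookup(ht, hash_x, match, key_x, &iter);
	r3 = !!cds_lfht_iter_get_node(&iter);              /* lookup X */
	rcu_read_unlock();

	assert(!(r1 && !r2 && !r3));

With a full memory barrier after the atomic commit of "add Y", the
"lookup X" in T2 cannot be reordered before it, so the assertion holds.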
Here is what I propose, thoughts ?
diff --git a/urcu/rculfhash.h b/urcu/rculfhash.h
index 2d8a310..2938e5e 100644
--- a/urcu/rculfhash.h
+++ b/urcu/rculfhash.h
@@ -203,6 +203,7 @@ void cds_lfht_count_nodes(struct cds_lfht *ht,
*
* Call with rcu_read_lock held.
* Threads calling this API need to be registered RCU read-side threads.
+ * This function acts as a rcu_dereference() to read the node pointer.
*/
void cds_lfht_lookup(struct cds_lfht *ht, unsigned long hash,
cds_lfht_match_fct match, const void *key,
@@ -226,6 +227,7 @@ void cds_lfht_lookup(struct cds_lfht *ht, unsigned long hash,
* node returned by a previous cds_lfht_next.
* Call with rcu_read_lock held.
* Threads calling this API need to be registered RCU read-side threads.
+ * This function acts as a rcu_dereference() to read the node pointer.
*/
void cds_lfht_next_duplicate(struct cds_lfht *ht,
cds_lfht_match_fct match, const void *key,
@@ -239,6 +241,7 @@ void cds_lfht_next_duplicate(struct cds_lfht *ht,
* Output in "*iter". *iter->node set to NULL if table is empty.
* Call with rcu_read_lock held.
* Threads calling this API need to be registered RCU read-side threads.
+ * This function acts as a rcu_dereference() to read the node pointer.
*/
void cds_lfht_first(struct cds_lfht *ht, struct cds_lfht_iter *iter);
@@ -252,6 +255,7 @@ void cds_lfht_first(struct cds_lfht *ht, struct cds_lfht_iter *iter);
* pointing to the last table node.
* Call with rcu_read_lock held.
* Threads calling this API need to be registered RCU read-side threads.
+ * This function acts as a rcu_dereference() to read the node pointer.
*/
void cds_lfht_next(struct cds_lfht *ht, struct cds_lfht_iter *iter);
@@ -264,6 +268,8 @@ void cds_lfht_next(struct cds_lfht *ht, struct cds_lfht_iter *iter);
* This function supports adding redundant keys into the table.
* Call with rcu_read_lock held.
* Threads calling this API need to be registered RCU read-side threads.
+ * This function issues a full memory barrier before and after its
+ * atomic commit.
*/
void cds_lfht_add(struct cds_lfht *ht, unsigned long hash,
struct cds_lfht_node *node);
@@ -288,6 +294,12 @@ void cds_lfht_add(struct cds_lfht *ht, unsigned long hash,
* to add keys into the table, no duplicated keys should ever be
* observable in the table. The same guarantee apply for combination of
* add_unique and add_replace (see below).
+ *
+ * Upon success, this function issues a full memory barrier before and
+ * after its atomic commit. Upon failure, this function acts like a
+ * simple lookup operation: it acts as a rcu_dereference() to read the
+ * node pointer. The failure case does not guarantee any other memory
+ * barrier.
*/
struct cds_lfht_node *cds_lfht_add_unique(struct cds_lfht *ht,
unsigned long hash,
@@ -321,6 +333,9 @@ struct cds_lfht_node *cds_lfht_add_unique(struct cds_lfht *ht,
* schemes will never generate duplicated keys. It also allows us to
* guarantee that a combination of add_replace and add_unique updates
* will never generate duplicated keys.
+ *
+ * This function issues a full memory barrier before and after its
+ * atomic commit.
*/
struct cds_lfht_node *cds_lfht_add_replace(struct cds_lfht *ht,
unsigned long hash,
@@ -352,6 +367,10 @@ struct cds_lfht_node *cds_lfht_add_replace(struct cds_lfht *ht,
*
* The semantic of replacement vs lookups is the same as
* cds_lfht_add_replace().
+ *
+ * Upon success, this function issues a full memory barrier before and
+ * after its atomic commit. Upon failure, this function does not issue
+ * any memory barrier.
*/
int cds_lfht_replace(struct cds_lfht *ht,
struct cds_lfht_iter *old_iter,
@@ -377,6 +396,9 @@ int cds_lfht_replace(struct cds_lfht *ht,
* After successful removal, a grace period must be waited for before
* freeing the memory reserved for old node (which can be accessed with
* cds_lfht_iter_get_node).
+ * Upon success, this function issues a full memory barrier before and
+ * after its atomic commit. Upon failure, this function does not issue
+ * any memory barrier.
*/
int cds_lfht_del(struct cds_lfht *ht, struct cds_lfht_node *node);
@@ -391,6 +413,7 @@ int cds_lfht_del(struct cds_lfht *ht, struct cds_lfht_node *node);
* function.
* Call with rcu_read_lock held.
* Threads calling this API need to be registered RCU read-side threads.
+ * This function does not issue any memory barrier.
*/
int cds_lfht_is_node_deleted(struct cds_lfht_node *node);
@@ -400,6 +423,7 @@ int cds_lfht_is_node_deleted(struct cds_lfht_node *node);
* @new_size: update to this hash table size.
*
* Threads calling this API need to be registered RCU read-side threads.
+ * This function does not (necessarily) issue memory barriers.
*/
void cds_lfht_resize(struct cds_lfht *ht, unsigned long new_size);
@@ -407,6 +431,7 @@ void cds_lfht_resize(struct cds_lfht *ht, unsigned long new_size);
* Note: it is safe to perform element removal (del), replacement, or
* any hash table update operation during any of the following hash
* table traversals.
+ * These functions act as rcu_dereference() to read the node pointers.
*/
#define cds_lfht_for_each(ht, iter, node) \
for (cds_lfht_first(ht, iter), \
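
To illustrate the intended use of these guarantees (a sketch only: the
structure, the hash() and match functions and the variables below are
illustrative, not part of the patch; declarations and error handling are
elided):

	struct mynode {
		int key;
		int value;              /* payload published with the node */
		struct cds_lfht_node lfht_node;
	};

	/* Updater: the stores initializing *n cannot be reordered after
	 * the add, thanks to the full barrier before its atomic commit. */
	n = malloc(sizeof(*n));
	n->key = key;
	n->value = compute_value();
	cds_lfht_node_init(&n->lfht_node);
	rcu_read_lock();
	cds_lfht_add(ht, hash(key), &n->lfht_node);
	rcu_read_unlock();

	/* Reader: the lookup acts as rcu_dereference() on the node
	 * pointer, so the payload read below sees an initialized node. */
	rcu_read_lock();
	cds_lfht_lookup(ht, hash(key), match, &key, &iter);
	hit = cds_lfht_iter_get_node(&iter);
	if (hit)
		v = caa_container_of(hit, struct mynode, lfht_node)->value;
	rcu_read_unlock();

	/* Remover: after a successful del, wait for a grace period
	 * before freeing the memory reserved for the node. */
	rcu_read_lock();
	ret = cds_lfht_del(ht, node_to_remove);
	rcu_read_unlock();
	if (!ret) {
		synchronize_rcu();
		free(caa_container_of(node_to_remove, struct mynode, lfht_node));
	}

No explicit barrier is needed around the updates in such a scheme; the
implicit full barriers documented above take care of the ordering.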
Thanks,
Mathieu
--
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com