[lttng-dev] [rp] rculfhash ordering guarantees
Paul E. McKenney
paulmck at linux.vnet.ibm.com
Mon Apr 23 18:26:09 EDT 2012
On Mon, Apr 23, 2012 at 03:45:32PM -0400, Mathieu Desnoyers wrote:
> Hi Paul,
>
> Here is the updated text I plan for the next update. Comments are
> welcome, thanks !
Looks much improved! The inevitable questions and comments interspersed.
Thanx, Paul
> Mathieu
>
> * Ordering Guarantees:
> *
> * To discuss these guarantees, we first define "read" operation as any
> * of the the basic cds_lfht_lookup, cds_lfht_next_duplicate,
> * cds_lfht_first, cds_lfht_next operation.
> *
> * We define "read traversal" operation as any of the following
> * group of operations
> * - cds_lfht_lookup followed by iteration with cds_lfht_next_duplicate
> * - cds_lfht_first followed iteration with cds_lfht_next
Can't cds_lfht_next() and cds_lfht_next_duplicate() be intermixed?
Something like the following:
rcu_read_lock();
p = cds_lfht_lookup(...);
while (p != NULL) {
do_something(p);
if (p->dups_of_interest)
p = cds_lfht_next_duplicate(...);
else
p = cds_lfht_next(...);
}
rcu_read_unlock();
> *
> * We define "write" operations as any of cds_lfht_add,
> * cds_lfht_add_unique, cds_lfht_add_replace, cds_lfht_del.
> *
> * We define "prior" and "later" node as nodes observable by reads and
> * read traversals respectively before and after a write or sequence of
> * write operations.
> *
> * It is implicit from the requirement of the read, read traversal, and
> * write operations that RCU read-side locks need to be held across
> * traversals, and between a read or read traversal and invocation of a
> * write that receives a node as argument.
This paragraph is implying a lot... How about the following?
Hash-table operations are often chained, for example, the
pointer returned by a cds_lfht_lookup() might be passed to a
cds_lfht_next(), whose return value might in turn be passed to
another hash-table operation. This entire chain of operations
must be enclosed by a pair of matching rcu_read_lock() and
rcu_read_unlock() operations.
> *
> * The following ordering guarantees are offered by this hash table:
> *
> * A.1) "read" after "write": if there is ordering between a write and a
> * later read, then the read is guaranteed to see the write or some
> * later write.
> * A.2) "read traversal" after "write": given that there is dependency
> * ordering between reads in a "read traversal", if there is
> * ordering between a write and the first read of the traversal,
> * then the "read traversal" is guaranteed to see the write or
> * some later write.
> * B.1) "write" after "read": if there is ordering between a read and a
> * later write, then the read will never see the write.
> * B.2) "write" after "read traversal": given that there is dependency
> * ordering between reads in a "read traversal", if there is
> * ordering between the last read of the traversal and a later
> * write, then the "read traversal" will never see the write.
> * C) "write" while "read traversal": if a write occurs during a "read
> * traversal", the traversal may, or may not, see the write.
> * D) "write" vs "write": writes occur atomically between their
> * invocation and the moment they return. There is a full memory
> * barrier before and after the atomic "commit" point of each
> * write.
This is a description of the implementation, while the others are guarantees.
Of course, that begs the question of what the guarantee is... Maybe the
following?
D.1 "write" after "write": if there is ordering between a write and
a later write, then the later write is guaranteed to see the
effects of the first write.
D.2 Concurrent "write" pairs: The system will assign an arbitrary order
to any pair of concurrent conflicting writes. Non-conflicting
writes (for example, to different keys) are unordered.
> * E) If a grace period separates a "del" or "replace" operation
> * and a subsequent operation, then that subsequent operation is
> * guaranteed not to see the removed item.
> * F) Uniqueness guarantee: given a hash table that does not contain
> * duplicate items for a given key, there will only be one item in
> * the hash table after an arbitrary sequence of add_unique and/or
> * add_replace operations. Note, however, that a pair of
> * concurrent read operations might well access two different items
> * with that key.
> * G.1) Given a hash table that does not contain duplicate items for a
> * given key, if a pair of lookups for a given key are ordered
> * (e.g. by a memory barrier), then the second lookup will return
> * the same node as the previous lookup, or some later node.
> * G.2) A "read traversal" that starts after the end of a prior "read
> * traversal" (ordered by memory barriers) is guaranteed to see the
> * same nodes as the previous traversal, or some later nodes.
> *
> * Progress guarantees:
> *
> * * Reads are wait-free. These operations always move forward in the
> * hash table linked list, and this list has no loop.
> * * Writes are lock-free. Any retry loop performed by a write operation
> * is triggered by progress made within another update operation.
>
> --
> Mathieu Desnoyers
> Operating System Efficiency R&D Consultant
> EfficiOS Inc.
> http://www.efficios.com
>
> _______________________________________________
> rp mailing list
> rp at svcs.cs.pdx.edu
> http://svcs.cs.pdx.edu/mailman/listinfo/rp
>
More information about the lttng-dev
mailing list