[ltt-dev] lock-free data structures (was Re: [PATCH 12/12] centralize definition of BITS_PER_LONG)
Paul E. McKenney
paulmck at linux.vnet.ibm.com
Fri Feb 19 18:51:39 EST 2010
On Fri, Feb 19, 2010 at 08:25:04PM +0100, Paolo Bonzini wrote:
>
>>> Some time we should also add double-long compare-and-swap, that's very
>>> useful for lock-free lists.
>>
>> Do you have pointers to papers describing this double-wide CAS
>> linked-list structure ?
>
> I found
> http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.86.8494&rep=rep1&type=pdf
> (page 4) which is a lock-free queue.
>
> The two words are used like this:
>
> struct gen_ptr {
> long gen;
> void *ptr;
> }
>
> where gen is always incremented whenever ptr is changed.
Doesn't the use of RCU prevent the ABA scenario, and doesn't that make
DCAS unnecessary?
Thanx, Paul
>> Yes, good idea! Although this won't be available on all architectures.
>> We might have to think of a mutex-based compatibility layer for these.
>
> You don't need a mutex if you use it for a lock-free queue. You're just
> better off providing two version of the queue, one for double-word CAS and
> a not-really-lock-free one for simple CAS.
>
> The elementary operation in that algorithm are reading a struct gen_ptr in
> a way that can be compared-and-swapped later, and doing a compare-and-swap
> that checks consistency of generation and pointer and increments the
> generation. If you have double-word CAS, everything is very simple because
> consistency is taken care of by the double-word CAS. So with double-word
> CAS the above are
>
> void *atomicRead (struct gen_ptr *val)
> {
> return val->ptr;
> }
>
> and
>
> long atomicCmpXchgIncrGeneration (struct gen_ptr *val, void *old_ptr,
> void *new_ptr)
> {
> int old_gen = val->gen;
> struct gen_ptr read =
> atomicCmpXchgDouble (val, old_gen, old_ptr,
> old_gen + 1, new_ptr)
> return read.gen;
> }
>
> but even without double-word CAS you can do it because if the generations
> match, so will the pointers. So the CAS will be on the generation in this
> case, and you use the low-order bit as a tag bit for "operation in
> progress". It would go something like this (untested):
>
> struct gen_ptr atomicRead (struct gen_ptr *val)
> {
> struct gen_ptr read;
> for (;;) {
> read.gen = val->gen;
> if (read.gen & 1) {
> /* Unlucky. */
> continue;
> } else {
> read.ptr = val->ptr;
> rmb ();
> if (read.gen == val->gen)
> return read;
> }
> }
> }
>
> long
> atomicCmpXchgWithGeneration (struct gen_ptr *val, long old_gen,
> void *new_ptr)
> {
> assert (!(old_gen & 1));
> int read_gen = CAS (&val->gen, old_gen, old_gen + 3);
> if (read_gen == old_gen) {
> /* We are the only ones that have access to val->ptr here.
> Since the low bit is 1, other threads will lock on
> atomicRead and fail the CAS in their
> atomicCmpXchgWithGeneration. Update val->ptr and untag
> the generation number. */
> val->ptr = new_ptr;
> wmb ();
> val->gen--;
> }
> return read_gen;
> }
>
> Now, while the above is certainly fun, I don't think it belongs in liburcu.
> However...
>
>> I wonder if we could use a clever RCU structure
>> to mimick the double-wide CAS.
>
> I don't know, I think lock-free queues and RCU are used for very different
> access patterns (and if you need lock-free, chances are that lock-free will
> kill your cache and not give you _that much_ performance, so maybe you'd
> better rethink everything...). However I'm not expert at all.
>
> Still, related to this, the paper above gives only half of the story. The
> lock-free queue needs a free list traditionally, and the paper mentions
> using a lock-free stack as a free list (BTW the lock-free stack doesn't
> need double-word CAS). But then, the nodes on the freelist can never be
> returned to the OS because you can always race with the operations on the
> lock-free queue.
>
> So, maybe RCU could be used to detect quiescent periods when there is no
> operation being performed on the lock-free queue. *Both enqueue and
> dequeue* would be wrapped in rcu_read_{,un}lock(), unlike normal RCU
> algorithms because this one is lock free. Then, in dequeue you'd replace
> the paper's line D19 with
>
> defer_rcu(free, head);
>
> Hmm...
>
> Paolo
More information about the lttng-dev
mailing list