[ltt-dev] [PATCH 0/3] Fixes for threads that are readers and writers, and lock-free queue example

Mathieu Desnoyers compudj at krystal.dyndns.org
Mon Mar 1 15:14:27 EST 2010


* Paolo Bonzini (pbonzini at redhat.com) wrote:
> On 03/01/2010 08:16 PM, Mathieu Desnoyers wrote:
>> It is illegal to call rcu_unregister_thread from within a RCU read-side
>> critical section. So I don't see the point in adding support for
>> explicit thread offlining. Maybe it's a documentation issue that lead
>> you to assume that unregistering a thread from a read-side C.S. is
>> valid ?
>
> It was not within, but still it deadlocked.  You can try with patch 3/3  
> alone, and I debugged it to two issues.


There is a bug in the dequeue implementation:

void *dequeue(struct queue *q, bool *not_empty)
{
        bool dummy;
        if (!not_empty)
                not_empty = &dummy;

        rcu_read_lock();
        *not_empty = false;
        for (;;) {
                void *data;
                struct node *head = rcu_dereference(q->head);
                struct node *tail = rcu_dereference(q->tail);
                struct node *next = rcu_dereference(head->next);

                if (head != q->head) {
                        /* A change occurred while reading the values.  */
                        continue;
                }

                if (head == tail) {
                        /* If all three are consistent, the queue is empty.  */
                        if (!next)
                                return NULL;

---> returns without unlocking the RCU read lock.

These unpaired lock/unlock are causing this problem.

If you look at the gdb backtrace of the "deadlock", you'll notice that
one thread is stucked calling synchronize_rcu() with its own rcu nest
count != 0.

Thanks,

Mathieu


                        /* Help moving tail further.  */
                        uatomic_cmpxchg(&q->tail, tail, next);
                        continue;
                }

                data = next->data;
                if (uatomic_cmpxchg(&q->head, head, next) == head) {
                        /* Next remains as a dummy node, head is freed.  */
                        rcu_read_unlock();
                        *not_empty = true;
                        free_node (head);
                        return data;
                }
        }
}


>
> The first one is synchronize_rcu not offlining the current thread.  Was  
> the "the same thread reads and then (outside a read-side CS) writes"  
> scenario ever tested outside urcu-qsbr?
>
> The second is that synchronize_rcu will wait for the current critical  
> sections to finish (using the futex), but this has two problems in turn:
>
> - if no reader thread has a critical section anymore, no one will call  
> wake_up_gp and the writer will keep waiting on the futex forever.  This  
> is fixed by _rcu_thread_offline calling wake_up_gp
>
> - if a single thread has already exited, that one thread will never  
> manage to toggle the parity twice!  This is fixed by offlining the  
> thread before rcu_unregister_thread.  It is still broken even after my  
> patches for urcu-bp, I suspect we need to move rcu_gc_registry somewhere  
> within update_counter_and_wait but I wasn't brave enough.
>
>> By the way, it's also illegal to call synchronize_rcu() from a RCU
>> read-side C.S.
>
> I know that. ;-)  My code unlocks before calling synchronize_rcu.
>
>> But I don't see exactly where your lock-free list needs
>> to do any of these operations that would result in deadlocks ?
>
> No, it doesn't do anything invalid. :-)
>
> Paolo
>

-- 
Mathieu Desnoyers
Operating System Efficiency Consultant
EfficiOS Inc.
http://www.efficios.com




More information about the lttng-dev mailing list