[ltt-dev] [RFC PATCH 0/7] priority-boost urcu

Wed Aug 17 06:41:01 EDT 2011

* Paolo Bonzini (pbonzini at redhat.com) wrote:
> On 08/16/2011 12:58 AM, Lai Jiangshan wrote:
>> This patch series implements a priority-boost urcu
>> based on pi-lock.
>>
>> Some other locks (especially rcu_gp_lock) should also be
>> priority-aware, these patches did touch them and make
>> the patchset simpler.
>
> While really cool, I found this patchset overly complex.

Yep, I fully agree with you. I am not comfortable with the complexity
level this patchset adds, which means we need to split it into nice,
simple, easy to understand abstractions.

> What we should introduce is abstractions over futexes.

Yes, I fully agree with this too. Hence my earlier proposal of creating
a clean wait/wakeup abstraction, which could possibly include our own
implementation of PI support, and maybe that would involve implementing
our own PI mutexes rather than just re-using pthread mutexes. Thoughts?

>  This is what I  
> did to experimentally port URCU to QEMU---my secret goal since commit  
> 806f811

Ah! So this is your secret plan! ;-) ;-)

> (use kernel style makefile output, 2010-03-01). :)  Our use of  
> futexes is exceptionally similar to a Windows manual-reset event (yes,  
> Windows:  
> http://msdn.microsoft.com/en-us/library/system.threading.manualresetevent%28v=vs.80%29.aspx). 

Could be. I haven't looked at their API.

> In QEMU I added the manual-reset event and use it in the implementation 
> of RCU.
>
> By introducing an abstraction for this, we can make the code a lot  
> clearer and secondarily gain in portability.  For QEMU portability was  
> actually my primary goal, but URCU might have different priorities. :)

Portability is good. I actually added support for FreeBSD 8.2 recently.
Although it does not mean that I plan to add APIs just to match all the
OSes out there, of course. :-)
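
For readers unfamiliar with the manual-reset event discussed above, here
is a minimal sketch of its semantics: once set, every current and future
waiter passes until someone resets it. This is only an illustration built
on a pthread mutex and condition variable; all mr_event_* names are
hypothetical, and it is neither the QEMU code nor a URCU API. An
implementation layered over futexes would differ in the details.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical type and names, for illustration only. */
struct mr_event {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	bool set;
};

static void mr_event_init(struct mr_event *ev)
{
	pthread_mutex_init(&ev->lock, NULL);
	pthread_cond_init(&ev->cond, NULL);
	ev->set = false;
}

/* Set the event: wake all waiters; later waiters return immediately. */
static void mr_event_set(struct mr_event *ev)
{
	pthread_mutex_lock(&ev->lock);
	ev->set = true;
	pthread_cond_broadcast(&ev->cond);
	pthread_mutex_unlock(&ev->lock);
}

/* Reset the event: subsequent waiters block until the next set. */
static void mr_event_reset(struct mr_event *ev)
{
	pthread_mutex_lock(&ev->lock);
	ev->set = false;
	pthread_mutex_unlock(&ev->lock);
}

/* Block until the event is set; the loop handles spurious wakeups. */
static void mr_event_wait(struct mr_event *ev)
{
	pthread_mutex_lock(&ev->lock);
	while (!ev->set)
		pthread_cond_wait(&ev->cond, &ev->lock);
	pthread_mutex_unlock(&ev->lock);
}
```

Unlike a semaphore, waiting does not consume the event, which is why it
has to be reset by hand before it can block waiters again.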

> PI futex support can also be implemented in the same framework.
>
> By the way, it is my impression that MB (perhaps MEMBARRIER too?)

MB and MEMBARRIER schemes are very similar, with the main difference
being that MB has explicit memory barriers on both synchronize_rcu and
read lock/unlock, while MEMBARRIER has OS-assisted IPI-based memory
barriers on the synchronize_rcu side (I'm still waiting to get enough
traction with URCU to get sys_membarrier into Linux by the way), with
a simple compiler barrier on the rcu read lock/unlock.
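
The reader-side difference described above can be sketched roughly as
follows. This is a simplified illustration using C11 fences, not the
liburcu source: the sketch_* names and the SKETCH_MB switch are
hypothetical, and the grace-period phase tracking of the real 2-phase
scheme is left out.

```c
#include <stdatomic.h>

/* Define SKETCH_MB to model the MB flavor; otherwise model MEMBARRIER. */
#ifdef SKETCH_MB
/* MB: the reader pays for a full memory barrier itself. */
#define sketch_reader_barrier() atomic_thread_fence(memory_order_seq_cst)
#else
/*
 * MEMBARRIER: the reader only prevents compiler reordering. The writer
 * is assumed to force real barriers on reader CPUs from synchronize_rcu.
 */
#define sketch_reader_barrier() atomic_signal_fence(memory_order_seq_cst)
#endif

struct sketch_reader {
	atomic_ulong nesting;	/* written only by the owning thread */
};

static void sketch_read_lock(struct sketch_reader *r)
{
	unsigned long n = atomic_load_explicit(&r->nesting,
					       memory_order_relaxed);

	atomic_store_explicit(&r->nesting, n + 1, memory_order_relaxed);
	/* Order the count update before accesses inside the section. */
	sketch_reader_barrier();
}

static void sketch_read_unlock(struct sketch_reader *r)
{
	unsigned long n = atomic_load_explicit(&r->nesting,
					       memory_order_relaxed);

	/* Order accesses inside the section before the count update. */
	sketch_reader_barrier();
	atomic_store_explicit(&r->nesting, n - 1, memory_order_relaxed);
}
```

The lock/unlock bodies are identical in both flavors; only the cost of
sketch_reader_barrier() changes, and that cost moves to the writer in
the MEMBARRIER case.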

> is way way more similar to QSBR than to SIGNAL:

I disagree.

>    MB rcu_read_unlock = QSBR rcu_thread_offline + nesting count
>    MB rcu_read_lock   = QSBR rcu_thread_online + nesting count

The statement above applies to all flavors of URCU. There is a clear
link between offline/online and nested read lock/unlock. We can see them
as the two sides of the same counter: one counts the reasons why a
thread is within an rcu critical section, while the other keeps track of
the reasons why a thread is not in an rcu critical section.
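
As a toy model of that relationship (hypothetical names, no memory
ordering, not liburcu code), the outermost lock and unlock are the only
transitions that change whether the thread counts as a reader at all:

```c
#include <stdbool.h>

struct sketch_thread {
	unsigned long nesting;	/* reasons to be inside a read-side section */
	bool online;		/* offline/online view of the same state */
};

static void sketch_nested_lock(struct sketch_thread *t)
{
	if (t->nesting++ == 0)
		t->online = true;	/* outermost lock acts like "online" */
}

static void sketch_nested_unlock(struct sketch_thread *t)
{
	if (--t->nesting == 0)
		t->online = false;	/* outermost unlock acts like "offline" */
}
```

Inner lock/unlock pairs only move the count, so the online flag carries
no information beyond "nesting is non-zero".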

But MB/MEMBARRIER and SIGNAL schemes have all been derived from the same
2-phase grace-period scheme, based on a lock/unlock nesting count, while
the QSBR implementation is a different beast that requires periodic
invocation of rcu_quiescent_state() by each application thread, which
makes it unsuitable for use of RCU within libraries.
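
To illustrate why QSBR depends on every thread cooperating, here is a
deliberately simplified model of the idea (hypothetical sketch_* names
and a fixed-size thread table, not the liburcu implementation): a grace
period can only complete once each online thread has announced a
quiescent state after it started.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define SKETCH_MAX_THREADS 4

/* Global grace-period counter; starts at 1 so that 0 can mean "offline". */
static atomic_ulong sketch_gp_ctr = 1;
/* Last grace period observed by each thread; 0 = offline or unused. */
static atomic_ulong sketch_thread_ctr[SKETCH_MAX_THREADS];

/* Called periodically by each application thread, outside any section. */
static void sketch_quiescent_state(int tid)
{
	atomic_store(&sketch_thread_ctr[tid], atomic_load(&sketch_gp_ctr));
}

/* Writer side: begin a new grace period. */
static void sketch_start_gp(void)
{
	atomic_fetch_add(&sketch_gp_ctr, 1);
}

/* Writer side: true once every online thread has seen the current period. */
static bool sketch_gp_done(void)
{
	unsigned long gp = atomic_load(&sketch_gp_ctr);

	for (int i = 0; i < SKETCH_MAX_THREADS; i++) {
		unsigned long seen = atomic_load(&sketch_thread_ctr[i]);

		if (seen != 0 && seen != gp)
			return false;	/* this thread has not reported yet */
	}
	return true;
}
```

A thread that never calls sketch_quiescent_state() stalls the writer
forever, and a library cannot impose that call on threads it does not
own.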

> Perhaps moving around code could make the code simpler?  Following the  
> master/slave memory barrier functions is quite hard, and this is  
> complicated by the KICK_READER_LOOPS that (if I understand correctly)  
> makes little sense for non-SIGNAL models.

The current segmentation of code (qsbr separated from 2-phase) seems
more natural to me. Maybe we could further clean up the urcu.c
implementation and move the SIGNAL-specific code into its own ifdef'd
functions, so it does not pollute the rest of the code? Thoughts?

Thanks,

Mathieu

>
> Paolo

-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com