[ltt-dev] [RFC PATCH 0/7] priority-boost urcu

Thu Aug 18 00:37:49 EDT 2011

On 08/17/2011 04:46 PM, Paolo Bonzini wrote:
> On 08/16/2011 12:58 AM, Lai Jiangshan wrote:
>> This patch series implements a priority-boost urcu
>> based on pi-lock.
>>
>> Some other locks (especially rcu_gp_lock) should also be
>> priority-aware; these patches did touch them and make
>> the patchset simpler.
> 
> While really cool, I found this patchset overly complex.
> 
> What we should introduce is abstractions over futexes.  This is what I did to experimentally port URCU to QEMU---my secret goal since commit 806f811 (use kernel style makefile output, 2010-03-01). :)  Our use of futexes is exceptionally similar to a Windows manual-reset event (yes, Windows: http://msdn.microsoft.com/en-us/library/system.threading.manualresetevent%28v=vs.80%29.aspx).  In QEMU I added the manual-reset event and use it in the implementation of RCU.
> 
> By introducing an abstraction for this, we can make the code a lot clearer and secondarily gain in portability.  For QEMU portability was actually my primary goal, but URCU might have different priorities. :)
> 
> PI futex support can also be implemented in the same framework.
How?

Challenges of userspace priority-boost urcu:

No matter how a urcu is designed, the update side has to wait for the
read-side critical sections that have already started.
The normal waiting pattern is:

-----------------------------------
thread1			thread2 (one of the read-side threads)
...			...
xx_wait(&something);	xx_wake(&something);
...			...
------------------------------------

Even if thread1 is a higher-priority thread, thread2 will not be boosted,
because the OS does not know which thread will call "xx_wake(&something);".
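
To make the pattern above concrete, here is a minimal sketch of it on top of
a plain (non-PI) Linux futex. xx_wait()/xx_wake() are the placeholder names
from the diagram, and wait_wake_demo() is a hypothetical helper added only
for illustration; none of this is code from the patchset. The point is that
FUTEX_WAIT is given only an address and an expected value, so the kernel has
no way to tell which thread will eventually issue the wake.

```c
#define _GNU_SOURCE
#include <linux/futex.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* 0 = the reader is still running, 1 = the reader has finished. */
static _Atomic int32_t something;

/* Block until *addr becomes non-zero. The futex call carries no owner
 * information, so the kernel cannot boost whoever will do the wake. */
static void xx_wait(_Atomic int32_t *addr)
{
	while (atomic_load(addr) == 0)
		syscall(SYS_futex, addr, FUTEX_WAIT, 0, NULL, NULL, 0);
}

/* Publish the new value, then wake one waiter sleeping on addr. */
static void xx_wake(_Atomic int32_t *addr)
{
	atomic_store(addr, 1);
	syscall(SYS_futex, addr, FUTEX_WAKE, 1, NULL, NULL, 0);
}

/* thread2: one read-side thread that finishes and wakes the updater. */
static void *reader_thread(void *arg)
{
	(void)arg;
	xx_wake(&something);
	return NULL;
}

/* thread1: waits for the reader. Returns the final value of `something`,
 * or -1 if the reader thread could not be created. */
int wait_wake_demo(void)
{
	pthread_t t;

	atomic_store(&something, 0);
	if (pthread_create(&t, NULL, reader_thread, NULL) != 0)
		return -1;
	xx_wait(&something);
	pthread_join(t, NULL);
	return atomic_load(&something);
}
```

If thread1 ran at a high real-time priority here, thread2 would still run at
its own priority for the whole wait.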

I can think of three approaches to achieve this:
	1) tell the OS which thread needs to be boosted when waiting.
	2) compete for/wait on a pi-lock which is already held by thread2
	3) (omitted: hard to explain, requires kernel changes)

1) is not possible directly: the OS has no such API/syscall. But 1) can be
implemented on top of 2).
2) is simpler.

-----------------------------------
thread1			thread2 (pi_lock is held by thread2)
...			...
lock(&pi_lock);	
unlock(&pi_lock);	unlock(&pi_lock); /* wake thread1 */
...			...
------------------------------------
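
As a rough illustration of approach 2), here is a sketch using a POSIX
mutex created with PTHREAD_PRIO_INHERIT. The names pi_lock_held,
reader_done and pi_wait_demo() are hypothetical and only serve the example;
the patchset itself uses its own pi-lock code, not this. Real boosting also
needs thread1 to run under a real-time scheduling policy, which this sketch
does not set up, so it only shows the locking structure.

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <semaphore.h>

static pthread_mutex_t pi_lock;  /* priority-inheritance mutex */
static sem_t pi_lock_held;       /* posted once thread2 owns pi_lock */
static int reader_done;          /* protected by pi_lock */

/* thread2: holds pi_lock for the duration of its read-side work. */
static void *thread2_fn(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&pi_lock);
	sem_post(&pi_lock_held);
	reader_done = 1;                 /* stands in for the read-side work */
	pthread_mutex_unlock(&pi_lock);  /* wakes thread1 */
	return NULL;
}

/* thread1: waits for thread2 by contending on the lock thread2 owns.
 * Returns the value of reader_done seen under the lock, or -1 on error. */
int pi_wait_demo(void)
{
	pthread_mutexattr_t attr;
	pthread_t t;
	int seen;

	if (pthread_mutexattr_init(&attr) != 0)
		return -1;
	if (pthread_mutexattr_setprotocol(&attr, PTHREAD_PRIO_INHERIT) != 0)
		return -1;
	if (pthread_mutex_init(&pi_lock, &attr) != 0)
		return -1;
	pthread_mutexattr_destroy(&attr);
	if (sem_init(&pi_lock_held, 0, 0) != 0)
		return -1;

	reader_done = 0;
	if (pthread_create(&t, NULL, thread2_fn, NULL) != 0)
		return -1;

	/* Make sure thread2 really owns pi_lock before we contend on it. */
	while (sem_wait(&pi_lock_held) != 0)
		;

	/* The kernel knows the owner of a PI mutex, so a blocked
	 * higher-priority thread1 can lend its priority to thread2. */
	pthread_mutex_lock(&pi_lock);
	seen = reader_done;
	pthread_mutex_unlock(&pi_lock);

	pthread_join(t, NULL);
	sem_destroy(&pi_lock_held);
	pthread_mutex_destroy(&pi_lock);
	return seen;
}
```

Note that this only works because thread2 took the lock itself before
thread1 started waiting.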

But when does thread2 *correctly* take the pi_lock back for its next use?
Proxy APIs are required for that, which is why I added the complexity of
the proxy APIs.

-----------------------------------
thread1				thread2 (pi_lock is held by thread2)
...				...
proxy_lock(&pi_lock, thread2)	
lock(&pi_lock);	
unlock(&pi_lock);		unlock(&pi_lock); /* wake thread1 */
...				...
------------------------------------

Paul will find that it is the same as rcu_boost() in the Linux kernel!
(I forgot to tell the truth: I stole code from Paul, kernel/futex, pthread, etc.)

Thanks,
Lai

> 
> By the way, it is my impression that MB (perhaps MEMBARRIER too?) is way way more similar to QSBR than to SIGNAL:
> 
>    MB rcu_read_unlock = QSBR rcu_thread_offline + nesting count
>    MB rcu_read_lock   = QSBR rcu_thread_online + nesting count
> 
> Perhaps moving around code could make the code simpler?  Following the master/slave memory barrier functions is quite hard, and this is complicated by the KICK_READER_LOOPS that (if I understand correctly) makes little sense for non-SIGNAL models.
> 
> Paolo
>