[ltt-dev] [PATCH] urcu-qsbr: avoid useless futex wakeups and burning CPU for long grace periods

Paolo Bonzini pbonzini at redhat.com
Sun Aug 14 13:37:08 EDT 2011


On 08/12/2011 08:08 PM, Mathieu Desnoyers wrote:
>> It is not accelerating synchronize_rcu().  It does two things:
>>
>> 1) by using futexes, it avoids burning CPU when a grace period is long.
>> It is actually effective even if the grace period is _not_ so long: 100
>> walks through the thread list take less than a millisecond, and you do
>> not want to call rcu_quiescent_state() that often;
>>
>> 2) once you're always using futexes, if you have frequent quiescent
>> states in one thread and more rare quiescent states in another, the
>> former thread will uselessly call FUTEX_WAKE on each quiescent state,
>> even though it is already in the next grace period.
>
> OK, so this might benefit URCU implementations other than qsbr too,
> right?

Yes.  I started from urcu-qsbr because that's what I am using, and 
because it's the simplest implementation.

> I think I did not convey my idea fully:
>
> this would take care of re-decrementing the gp_futex value for the first
> wait attempt and the following ones:
>
>                  if (wait_loops >= RCU_QS_ACTIVE_ATTEMPTS) {
>                          uatomic_dec(&gp_futex);
>                          /* Write futex before read reader_gp */
>                          cmm_smp_mb();
>                  }
> [...]
>
> and this would be waiting for a wakeup:
>
>                  if (wait_loops >= RCU_QS_ACTIVE_ATTEMPTS) {
>                          wait_gp();
>                  } else {
>
> But I agree that this does not handle readers with very different
> period lengths well, because short periods will trigger a lot
> of useless futex wakeups.

Yes, that only covers (1) above.

>> @@ -136,7 +137,11 @@ extern int32_t gp_futex;
>>    */
>>   static inline void wake_up_gp(void)
>>   {
>> -	if (unlikely(uatomic_read(&gp_futex) == -1)) {
>> +	if (unlikely(_CMM_LOAD_SHARED(rcu_reader.waiting))) {
>> +		_CMM_STORE_SHARED(rcu_reader.waiting, 0);
>
> Commenting this memory barrier would be helpful too.

Ok.

Paolo



