[ltt-dev] [rp] [URCU RFC patch 3/3] call_rcu: remove delay for wakeup scheme

Phil Howard pwh at cecs.pdx.edu
Mon Jun 6 17:26:23 EDT 2011


On Mon, Jun 6, 2011 at 12:21 PM, Mathieu Desnoyers
<mathieu.desnoyers at efficios.com> wrote:
> * Mathieu Desnoyers (mathieu.desnoyers at efficios.com) wrote:
>> I notice that the "poll(NULL, 0, 10);" delay is executed both for the RT
>> and non-RT code.  So given that my goal is to get the call_rcu thread to
>> GC memory as quickly as possible to diminish the overhead of cache
>> misses, I decided to try removing this delay for !RT: the call_rcu
>> thread then wakes up ASAP when the thread invoking call_rcu wakes it. My
>> updates jump to 76349/s (getting there!) ;).
>>
>> This improvement can be explained by a lower delay between call_rcu and
>> execution of its callback, which decreases the amount of cache used, and
>> therefore provides better cache locality.
>
> I just wonder if it's worth it: removing this delay from the !RT
> call_rcu thread can cause a high rate of synchronize_rcu() calls. So
> although there might be an advantage in terms of update rate, it will
> likely cause extra cache-line bounces between the call_rcu threads and
> the reader threads.
>
> test_urcu_rbtree 7 1 20 -g 1000000
>
> With the delay in the call_rcu thread:
> search:  1842857 items/reader thread/s (7 reader threads)
> updates:   21066 items/s (1 update thread)
> ratio: 87 search/update
>
> Without the delay in the call_rcu thread:
> search:  3064285 items/reader thread/s (7 reader threads)
> updates:   45096 items/s (1 update thread)
> ratio: 68 search/update
>
> So basically, adding the delay doubles the update performance, at the
> cost of being 33% slower for reads. My first thought is that if an
> application has very frequent updates, then maybe it wants to have fast
> updates because the update throughput is then important. If the
> application has infrequent updates, then the reads will be fast anyway,
> because rare call_rcu invocations will trigger fewer cache-line bounces
> between readers and writers. Any other thoughts on this trade-off and
> how to deal with it?
>

Did I miss something here? It looks like you more than doubled the
update rate and almost doubled the lookup rate. The search/update
ratio is less, but if both the raw rates improved so much, how is
this a bad thing?

-phil

> Thanks,
>
> Mathieu
>
>
>>
>> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers at efficios.com>
>> ---
>>  urcu-call-rcu-impl.h |    3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> Index: userspace-rcu/urcu-call-rcu-impl.h
>> ===================================================================
>> --- userspace-rcu.orig/urcu-call-rcu-impl.h
>> +++ userspace-rcu/urcu-call-rcu-impl.h
>> @@ -242,7 +242,8 @@ static void *call_rcu_thread(void *arg)
>>               else {
>>                       if (&crdp->cbs.head == _CMM_LOAD_SHARED(crdp->cbs.tail))
>>                               call_rcu_wait(crdp);
>> -                     poll(NULL, 0, 10);
>> +                     else
>> +                             poll(NULL, 0, 10);
>>               }
>>       }
>>       call_rcu_lock(&crdp->mtx);
>>
>
> --
> Mathieu Desnoyers
> Operating System Efficiency R&D Consultant
> EfficiOS Inc.
> http://www.efficios.com
>
> _______________________________________________
> rp mailing list
> rp at svcs.cs.pdx.edu
> http://svcs.cs.pdx.edu/mailman/listinfo/rp
>



