[lttng-dev] [RFC URCU PATCH] wfqueue: ABI v1, simplify implementation, 2.3x to 2.6x performance boost

Mathieu Desnoyers mathieu.desnoyers at efficios.com
Mon Aug 20 21:03:51 EDT 2012


* Lai Jiangshan (laijs at cn.fujitsu.com) wrote:
> On 08/20/2012 09:16 PM, Mathieu Desnoyers wrote:
> > * Lai Jiangshan (laijs at cn.fujitsu.com) wrote:
> >> On 08/16/2012 05:31 AM, Mathieu Desnoyers wrote:
> >>> This work is derived from the patch from Lai Jiangshan submitted as
> >>> "urcu: new wfqueue implementation"
> >>> (http://lists.lttng.org/pipermail/lttng-dev/2012-August/018379.html)
> >>>
> >>
> > 
> > Hi Lai,
> > 
> >> Hi, Mathieu
> >>
> >> please add this part to your patch.
> >>
> >> Changes(item5 has alternative):
> >>
> >> 1) Reorder the head and the tail in struct cds_wfq_queue
> >>    Reason: wfq is enqueue-preference
> > 
> > Is this a technical improvement over the original order ? Both cache
> > lines are aligned, so it should not matter which one comes first. So I'm
> > not sure why we should favor one order vs the other.
> 
> struct astruct {
> 	int foo; /* field offset = 0 */
> 	int bar; /* field offset = 4 */
> };
> 
> Access to p->foo and access to p->bar are different in CPU instruction.
> Access to p->bar will need to add a offset to p. (the addition may cost a very small time)
> instructions will be probably longer than the instructions access to p->foo.
> 
> If the code of access to p->bar is significant more than the code of access to p->foo,
> we should swap p->bar and p->foo field's order to shorten the total instruction size
> to reduce footprint.
> 
> Our enqueue is inline function, when this inline function is expanded in every call site,
> the code of access to the tail will much significant, so we need to put the tail at first.
> 

Very good point! I'm pulling this change too.

> 
> 
> > 
> >> 2) Reorder the code of ___cds_wfq_next_blocking()
> >>    Reason: short code, better readability
> > 
> > Yes, I agree with this change. Can you extract it into a separate patch ?
> 
> You merge it into your patch which ___cds_wfq_next_blocking() is introduced.
> 

OK. done,

> > 
> >>
> >> 3) Add cds_wfq_dequeue_[un]lock
> >>    Reason: the fields of struct cds_wfq_queue are not exposed to users of lib,
> >> 	but some APIs needs to have this lock held. Add this locking function
> >> 	to allow the users use such APIs
> > 
> > I agree with this change too. You can put it into the same patch as (2).
> 
> You merge it into your patch.

OK, done too.

The current version (volatile branch) is at 

git://git.dorsal.polymtl.ca/~compudj/userspace-rcu
branch: urcu/wfqueue-volatile

Thanks!

Mathieu

-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com



More information about the lttng-dev mailing list