[lttng-dev] [RFC URCU PATCH] wfqueue: ABI v1, simplify implementation, 2.3x to 2.6x performance boost
Mathieu Desnoyers
mathieu.desnoyers at efficios.com
Mon Aug 20 21:03:51 EDT 2012
* Lai Jiangshan (laijs at cn.fujitsu.com) wrote:
> On 08/20/2012 09:16 PM, Mathieu Desnoyers wrote:
> > * Lai Jiangshan (laijs at cn.fujitsu.com) wrote:
> >> On 08/16/2012 05:31 AM, Mathieu Desnoyers wrote:
> >>> This work is derived from the patch from Lai Jiangshan submitted as
> >>> "urcu: new wfqueue implementation"
> >>> (http://lists.lttng.org/pipermail/lttng-dev/2012-August/018379.html)
> >>>
> >>
> >
> > Hi Lai,
> >
> >> Hi, Mathieu
> >>
> >> please add this part to your patch.
> >>
> >> Changes(item5 has alternative):
> >>
> >> 1) Reorder the head and the tail in struct cds_wfq_queue
> >> Reason: wfq is enqueue-preference
> >
> > Is this a technical improvement over the original order ? Both cache
> > lines are aligned, so it should not matter which one comes first. So I'm
> > not sure why we should favor one order vs the other.
>
> struct astruct {
> int foo; /* field offset = 0 */
> int bar; /* field offset = 4 */
> };
>
> Access to p->foo and access to p->bar are different in CPU instruction.
> Access to p->bar will need to add a offset to p. (the addition may cost a very small time)
> instructions will be probably longer than the instructions access to p->foo.
>
> If the code of access to p->bar is significant more than the code of access to p->foo,
> we should swap p->bar and p->foo field's order to shorten the total instruction size
> to reduce footprint.
>
> Our enqueue is inline function, when this inline function is expanded in every call site,
> the code of access to the tail will much significant, so we need to put the tail at first.
>
Very good point! I'm pulling this change too.
>
>
> >
> >> 2) Reorder the code of ___cds_wfq_next_blocking()
> >> Reason: short code, better readability
> >
> > Yes, I agree with this change. Can you extract it into a separate patch ?
>
> You merge it into your patch which ___cds_wfq_next_blocking() is introduced.
>
OK. done,
> >
> >>
> >> 3) Add cds_wfq_dequeue_[un]lock
> >> Reason: the fields of struct cds_wfq_queue are not exposed to users of lib,
> >> but some APIs needs to have this lock held. Add this locking function
> >> to allow the users use such APIs
> >
> > I agree with this change too. You can put it into the same patch as (2).
>
> You merge it into your patch.
OK, done too.
The current version (volatile branch) is at
git://git.dorsal.polymtl.ca/~compudj/userspace-rcu
branch: urcu/wfqueue-volatile
Thanks!
Mathieu
--
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com
More information about the lttng-dev
mailing list