[lttng-dev] [RFC URCU PATCH] wfqueue: ABI v1, simplify implementation, 2.3x to 2.6x performance boost

Mathieu Desnoyers mathieu.desnoyers at efficios.com
Tue Aug 21 08:12:26 EDT 2012


* Lai Jiangshan (laijs at cn.fujitsu.com) wrote:
> On 08/21/2012 09:03 AM, Mathieu Desnoyers wrote:
> > * Lai Jiangshan (laijs at cn.fujitsu.com) wrote:
> >> On 08/20/2012 09:16 PM, Mathieu Desnoyers wrote:
> >>> * Lai Jiangshan (laijs at cn.fujitsu.com) wrote:
> >>>> On 08/16/2012 05:31 AM, Mathieu Desnoyers wrote:
> >>>>> This work is derived from the patch from Lai Jiangshan submitted as
> >>>>> "urcu: new wfqueue implementation"
> >>>>> (http://lists.lttng.org/pipermail/lttng-dev/2012-August/018379.html)
> >>>>>
> >>>>
> >>>
> >>> Hi Lai,
> >>>
> >>>> Hi, Mathieu
> >>>>
> >>>> please add this part to your patch.
> >>>>
> >>>> Changes(item5 has alternative):
> >>>>
> >>>> 1) Reorder the head and the tail in struct cds_wfq_queue
> >>>>    Reason: wfq is enqueue-preference
> >>>
> >>> Is this a technical improvement over the original order ? Both cache
> >>> lines are aligned, so it should not matter which one comes first. So I'm
> >>> not sure why we should favor one order vs the other.
> >>
> >> struct astruct {
> >> 	int foo; /* field offset = 0 */
> >> 	int bar; /* field offset = 4 */
> >> };
> >>
> >> Access to p->foo and access to p->bar are different in CPU instruction.
> >> Access to p->bar will need to add a offset to p. (the addition may cost a very small time)
> >> instructions will be probably longer than the instructions access to p->foo.
> >>
> >> If the code of access to p->bar is significant more than the code of access to p->foo,
> >> we should swap p->bar and p->foo field's order to shorten the total instruction size
> >> to reduce footprint.
> >>
> >> Our enqueue is inline function, when this inline function is expanded in every call site,
> >> the code of access to the tail will much significant, so we need to put the tail at first.
> >>
> > 
> > Very good point! I'm pulling this change too.
> > 
> >>
> >>
> >>>
> >>>> 2) Reorder the code of ___cds_wfq_next_blocking()
> >>>>    Reason: short code, better readability
> >>>
> >>> Yes, I agree with this change. Can you extract it into a separate patch ?
> >>
> >> You merge it into your patch which ___cds_wfq_next_blocking() is introduced.
> >>
> > 
> > OK. done,
> > 
> >>>
> >>>>
> >>>> 3) Add cds_wfq_dequeue_[un]lock
> >>>>    Reason: the fields of struct cds_wfq_queue are not exposed to users of lib,
> >>>> 	but some APIs needs to have this lock held. Add this locking function
> >>>> 	to allow the users use such APIs
> >>>
> >>> I agree with this change too. You can put it into the same patch as (2).
> >>
> >> You merge it into your patch.
> > 
> > OK, done too.
> > 
> > The current version (volatile branch) is at 
> > 
> > git://git.dorsal.polymtl.ca/~compudj/userspace-rcu
> > branch: urcu/wfqueue-volatile
> > 
> > Thanks!
> > 
> 
> 
> We can merge all things into a single patch until the discussion are finished.
> I don't want to have several commits in the git-tree that every
> commit fix some thing of the previous commit.

Yes, I agree.

> 
> When the big single patch is done, we can separate the patch properly.
> It will help others to understand the code from git-log.

Yes, that can be done. Hopefully we'll manage to find the time to do
this.

> 
> I will be very appreciate if one or two small patch(s) is committed
> with my name.

Of course! The only reason why I set the From on the current patch to my
own name is because I did not want to give you the blame for my own
errors (although I kept the credits to you in the changelog). Splitting
the work in sub-patches will make it easier to credit each of us for
specific portions of the code.

Thanks!

Mathieu

> 
> Thanks,
> Lai.

-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com



More information about the lttng-dev mailing list