[lttng-dev] Userspace RCU: workqueue with batching, cheap wakeup, and work stealing
Mathieu Desnoyers
mathieu.desnoyers at efficios.com
Thu Oct 23 19:24:17 EDT 2014
I just implemented a urcu_workqueue_steal_all(), and the
benchmark test program now uses this when there is congestion
in the queue.
Thanks,
Mathieu
----- Original Message -----
> From: "Mathieu Desnoyers" <mathieu.desnoyers at efficios.com>
> To: "Ben Maurer" <bmaurer at fb.com>
> Cc: "Yannick Brosseau" <yannick.brosseau at fb.com>, "lttng-dev" <lttng-dev at lists.lttng.org>, "Paul E. McKenney"
> <paulmck at linux.vnet.ibm.com>
> Sent: Thursday, October 23, 2014 6:48:52 PM
> Subject: Re: [lttng-dev] Userspace RCU: workqueue with batching, cheap wakeup, and work stealing
>
> Interesting point about bufferbloat!
>
> I've just pushed the "approximate queue upper bound" feature
> in the last commit.
>
> Going further, when the queue is full, there are indeed a few
> options:
>
> 1) sleep for a few ms and retry enqueue,
> 2) grab the entire content of the global workqueue, and discard
> its work elements one by one,
> 3) in addition to (2), also steal work from all worker
> threads, and discard their work elements.
>
> Making the dispatcher act as a dummy "worker thread" would
> allow it to easily accomplish (2). We'd need some tweaks
> to "steal all workers' work elements" (3) (new API). This
> could be presented as a "urcu_queue_steal_all" or something
> like that, and then the dispatcher could iterate on the
> work items and either discard them, or perform the appropriate
> socket action.
>
> Thoughts?
>
> Thanks,
>
> Mathieu
>
>
> ----- Original Message -----
> > From: "Ben Maurer" <bmaurer at fb.com>
> > To: "Mathieu Desnoyers" <mathieu.desnoyers at efficios.com>, "Lai Jiangshan"
> > <laijs at cn.fujitsu.com>
> > Cc: "lttng-dev" <lttng-dev at lists.lttng.org>, "Paul E. McKenney"
> > <paulmck at linux.vnet.ibm.com>, "Yannick Brosseau"
> > <yannick.brosseau at fb.com>
> > Sent: Thursday, October 23, 2014 6:09:11 PM
> > Subject: RE: [lttng-dev] Userspace RCU: workqueue with batching, cheap
> > wakeup, and work stealing
> >
> > Bounds are pretty critical :-). Often during operational incidents we
> > see large buildups in our queues, and these cause problems.
> >
> > For us, one of the most critical things isn't the memory usage but the
> > delay caused to the client. For example, if a server has a queue that
> > incoming requests are put into, and that queue grows large, clients
> > experience large delays. Since most calls to the server have a short
> > timeout (seconds), we'd rather prevent items from entering the queue so
> > that we fail fast.
> >
> > Some of our applications switch to LIFO processing of work items when
> > the queue is large. This focuses the processing effort on recent
> > requests -- ones which will hopefully get back to the user in time for
> > them to see a response.
> >
> > Long story short: when a queue is overloaded, we'd rather drop some
> > requests quickly and serve the remaining requests with minimal queuing
> > delay. Think of queues as bufferbloat applied to work items. In fact,
> > we have experimented with some of the bufferbloat techniques on our
> > work queues (specifically, CoDel).
> >
> > -b
> > ________________________________________
> > From: Mathieu Desnoyers [mathieu.desnoyers at efficios.com]
> > Sent: Thursday, October 23, 2014 2:57 PM
> > To: Lai Jiangshan
> > Cc: lttng-dev; Paul E. McKenney; Ben Maurer; Yannick Brosseau
> > Subject: Re: [lttng-dev] Userspace RCU: workqueue with batching, cheap
> > wakeup, and work stealing
> >
> > The next thing I'm wondering now: should we include an
> > optional bound to the global workqueue size in the API?
> >
> > I've just had cases here where I stress test the queue
> > with very frequent dispatch, and it can fill up memory
> > relatively quickly if the workers have a large amount of
> > work to do per work-item.
> >
> > I think the usual way to do this would be to make the
> > behavior nonblocking when the queue is full, so the
> > dispatcher can take action and move the work away to
> > another machine, or report congestion.
> >
> > Thoughts?
> >
> > Thanks,
> >
> > Mathieu
> >
> > --
> > Mathieu Desnoyers
> > EfficiOS Inc.
> > http://www.efficios.com
> >
>
> --
> Mathieu Desnoyers
> EfficiOS Inc.
> http://www.efficios.com
>
> _______________________________________________
> lttng-dev mailing list
> lttng-dev at lists.lttng.org
> http://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev
>
--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com