[lttng-dev] Userspace RCU: workqueue with batching, cheap wakeup, and work stealing

Mathieu Desnoyers mathieu.desnoyers at efficios.com
Thu Oct 23 19:24:17 EDT 2014


I just implemented urcu_workqueue_steal_all(), and the
benchmark test program now uses it when the queue is
congested.
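
To make the idea concrete, here is a minimal sketch of a queue with an
approximate upper bound, a nonblocking enqueue, and a steal-all
operation. It is not the urcu code: struct work, struct bounded_queue,
queue_init(), queue_try_enqueue() and queue_steal_all() are
hypothetical names made up for this illustration, and the sketch uses a
plain C11 atomic LIFO list rather than the wait-free queue the real
workqueue is built on.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical types and names for illustration; not the urcu API. */
struct work {
	struct work *next;
	int id;
};

struct bounded_queue {
	_Atomic(struct work *) head;	/* most recently pushed item */
	atomic_size_t len;		/* approximate number of queued items */
	size_t max_len;			/* approximate upper bound */
};

static void queue_init(struct bounded_queue *q, size_t max_len)
{
	atomic_init(&q->head, NULL);
	atomic_init(&q->len, 0);
	q->max_len = max_len;
}

/*
 * Nonblocking enqueue. Returns false when the queue looks full, so the
 * caller can react to congestion instead of sleeping. The bound is only
 * approximate: concurrent enqueuers may all pass the check and
 * overshoot max_len slightly.
 */
static bool queue_try_enqueue(struct bounded_queue *q, struct work *w)
{
	if (atomic_load(&q->len) >= q->max_len)
		return false;
	atomic_fetch_add(&q->len, 1);
	w->next = atomic_load(&q->head);
	/* On failure, w->next is refreshed with the current head. */
	while (!atomic_compare_exchange_weak(&q->head, &w->next, w))
		;
	return true;
}

/*
 * Detach every queued item with a single atomic exchange and return
 * them as a private list, most recent first. The caller can then walk
 * the list and discard or handle each item without further
 * synchronization.
 */
static struct work *queue_steal_all(struct bounded_queue *q, size_t *count)
{
	struct work *list = atomic_exchange(&q->head, NULL);
	size_t n = 0;

	for (struct work *w = list; w != NULL; w = w->next)
		n++;
	atomic_fetch_sub(&q->len, n);
	*count = n;
	return list;
}
```

With something along these lines, a dispatcher that sees
queue_try_enqueue() fail could call queue_steal_all() and then discard
each stolen item or perform the appropriate socket action on it.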

Thanks,

Mathieu

----- Original Message -----
> From: "Mathieu Desnoyers" <mathieu.desnoyers at efficios.com>
> To: "Ben Maurer" <bmaurer at fb.com>
> Cc: "Yannick Brosseau" <yannick.brosseau at fb.com>, "lttng-dev" <lttng-dev at lists.lttng.org>, "Paul E. McKenney"
> <paulmck at linux.vnet.ibm.com>
> Sent: Thursday, October 23, 2014 6:48:52 PM
> Subject: Re: [lttng-dev] Userspace RCU: workqueue with batching, cheap wakeup, and work stealing
> 
> Interesting point about bufferbloat!
> 
> I've just pushed the "approximate queue upper bound" feature
> in the last commit.
> 
> Going further, when the queue is full, there are indeed a few
> options:
> 
> 1) sleep for a few ms and retry enqueue,
> 2) grab the entire content of the global workqueue, and discard
>    its work elements one by one,
> 3) in addition to (2), also steal work from all worker
>    threads, and discard their work elements.
> 
> Making the dispatcher act as a dummy "worker thread" would
> allow it to easily accomplish (2). We'd need some tweaks
> (a new API) to steal all workers' work elements for (3). This
> could be presented as a "urcu_queue_steal_all" or something
> like that, and then the dispatcher could iterate on the
> work items and either discard them, or perform the appropriate
> socket action.
> 
> Thoughts?
> 
> Thanks,
> 
> Mathieu
> 
> 
> ----- Original Message -----
> > From: "Ben Maurer" <bmaurer at fb.com>
> > To: "Mathieu Desnoyers" <mathieu.desnoyers at efficios.com>, "Lai Jiangshan"
> > <laijs at cn.fujitsu.com>
> > Cc: "lttng-dev" <lttng-dev at lists.lttng.org>, "Paul E. McKenney"
> > <paulmck at linux.vnet.ibm.com>, "Yannick Brosseau"
> > <yannick.brosseau at fb.com>
> > Sent: Thursday, October 23, 2014 6:09:11 PM
> > Subject: RE: [lttng-dev] Userspace RCU: workqueue with batching, cheap
> > wakeup, and work stealing
> > 
> > Bounds are pretty critical :-). During operational incidents we often
> > get large buildups in our queues, and these cause problems.
> > 
> > For us, one of the most critical things isn't the memory usage but the
> > delay caused to the client. For example, if a server puts incoming
> > requests into a queue and that queue grows large, clients experience
> > large delays. Since most calls to the server have a short timeout
> > (seconds), we'd rather prevent items from entering the queue so that we
> > fail fast.
> > 
> > Some of our applications switch to LIFO processing of work items when the
> > queue is large. What this does is to focus the processing effort on recent
> > requests -- ones which will hopefully get back to the user in time for them
> > to see a response.
> > 
> > Long story short: when a queue is overloaded, we'd rather drop some
> > requests quickly and serve the other requests with minimal queuing
> > delay. Think of queues as bufferbloat applied to work items. In fact, we
> > have experimented with some of the bufferbloat techniques on our work
> > queues (specifically, CoDel).
> > 
> > -b
> > ________________________________________
> > From: Mathieu Desnoyers [mathieu.desnoyers at efficios.com]
> > Sent: Thursday, October 23, 2014 2:57 PM
> > To: Lai Jiangshan
> > Cc: lttng-dev; Paul E. McKenney; Ben Maurer; Yannick Brosseau
> > Subject: Re: [lttng-dev] Userspace RCU: workqueue with batching, cheap
> > wakeup, and work stealing
> > 
> > The next thing I'm wondering now: should we include an
> > optional bound on the global workqueue size in the API?
> > 
> > I've just had cases here where I stress test the queue
> > with very frequent dispatch, and it can fill up memory
> > relatively quickly if the workers have a large amount of
> > work to do per work-item.
> > 
> > I think the usual way to do this would be to make the
> > behavior nonblocking when the queue is full, so the
> > dispatcher can take action and move the work away to
> > another machine, or report congestion.
> > 
> > Thoughts?
> > 
> > Thanks,
> > 
> > Mathieu
> > 
> > --
> > Mathieu Desnoyers
> > EfficiOS Inc.
> > http://www.efficios.com
> > 
> 
> --
> Mathieu Desnoyers
> EfficiOS Inc.
> http://www.efficios.com
> 

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com


