[lttng-dev] [URCU PATCH 1/3] wfcqueue: implement concurrency-efficient queue

Mathieu Desnoyers mathieu.desnoyers at efficios.com
Mon Oct 8 14:10:39 EDT 2012

* Paolo Bonzini (pbonzini at redhat.com) wrote:
> On 08/10/2012 18:15, Mathieu Desnoyers wrote:
> > Hi Paolo,
> > 
> > We actually already have those, they are just not described in this
> > comment. I will fix this right away. By the way, you will notice the
> > wording:
> > 
> > + * Queue read operations "first" and "next", which are used by
> > + * "for_each" iterations, need to be protected against concurrent
> > + * "dequeue" and "splice" (for source queue) by the caller.
> > 
> > Being the only one iterating on a queue with local head/tail after a
> > splice operation is one way to provide mutual exclusion. Holding a lock
> > is not the only way to achieve mutual exclusion.
> Uh, I was confused by the _blocking suffix.  But when used together with
> splice you know it is not blocking---only the splice will block.

Well, in this case, "blocking" can be understood as busy-waiting
(actual blocking that invokes the OS happens only after busy-waiting for
a long period).

for_each iteration can indeed busy-wait: ___cds_wfcq_first_blocking()
and ___cds_wfcq_next_blocking() can both call
___cds_wfcq_node_sync_next(), which busy-waits if it encounters a NULL
next pointer that is not located in the tail node.
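
To make the busy-wait concrete, here is a minimal sketch of that logic.
It is not the urcu source: the sketch_* names, the spin threshold, the
sleep duration and the use of poll() as a plain sleep are all
assumptions chosen for illustration.

```c
#include <assert.h>
#include <poll.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical node type for this sketch (not the urcu definition). */
struct sketch_node {
	_Atomic(struct sketch_node *) next;
};

#define SKETCH_SPIN_ATTEMPTS	10	/* assumed: spins before sleeping */
#define SKETCH_SLEEP_MS		10	/* assumed: sleep duration in ms */

/*
 * Wait until node->next has been published by the enqueuer.
 * Only valid when the caller knows that node is not the tail.
 */
static inline struct sketch_node *sketch_sync_next(struct sketch_node *node)
{
	struct sketch_node *next;
	int attempts = 0;

	while ((next = atomic_load_explicit(&node->next,
			memory_order_acquire)) == NULL) {
		if (++attempts >= SKETCH_SPIN_ATTEMPTS) {
			/* Busy-waited for a while: really block in the OS. */
			(void) poll(NULL, 0, SKETCH_SLEEP_MS);
			attempts = 0;
		}
	}
	return next;
}

/*
 * Return the node following "node", or NULL at the end of the queue.
 * A NULL next pointer in the tail node means "end of queue"; a NULL
 * next pointer anywhere else means an enqueue is still in progress,
 * so we have to wait for it.
 */
static inline struct sketch_node *sketch_next(struct sketch_node *node,
		_Atomic(struct sketch_node *) *tail)
{
	struct sketch_node *next;

	next = atomic_load_explicit(&node->next, memory_order_acquire);
	if (next == NULL) {
		if (atomic_load(tail) == node)
			return NULL;	/* node is the tail */
		next = sketch_sync_next(node);
	}
	return next;
}
```

With a fully linked two-node chain, sketch_next() returns immediately
for both nodes; the wait loop only runs when a NULL next pointer is
observed in a node that is no longer the tail.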

Seen through the use-case of splice to local queue + for_each, here is
what happens:

- splice moves the content of the queue into a "local" queue. However,
  it does _not_ issue a ___cds_wfcq_node_sync_next() on each node: no
  traversal is performed.

- then, within the for_each iteration, we perform the
  ___cds_wfcq_node_sync_next() synchronization as we iterate on the
  local queue.

This ensures that the synchronization for each node is performed lazily,
only when that node is actually traversed. This approach has the
advantage of increasing locality of reference, and it also reduces the
odds that the synchronization is executed uselessly, by postponing it to
the point where it is actually needed.
Does that make more sense?



Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.