[lttng-dev] [RFC PATCH] wfqueue: expand API, simplify implementation, small performance boost
Mathieu Desnoyers
mathieu.desnoyers at efficios.com
Tue Aug 14 19:34:02 EDT 2012
* Mathieu Desnoyers (mathieu.desnoyers at efficios.com) wrote:
[...]
> One more thing: I think we need to add cmm_smp_read_barrier_depends()
> before returning the node pointers in dequeue, first, and next
> operations. This memory barrier would match the implicit barrier in the
> enqueue ordering write to the prior tail's next pointer with writes to
> the node content that would have been performed beforehand by the
> caller.
Please note that the only reason I propose using
"cmm_smp_read_barrier_depends()" rather than a full cmm_smp_mb() before
returning nodes in the dequeue, first, and next functions is that I
fail to see a case where an architecture would reorder a dependent store
before the load of the pointer it depends on.
I can very well imagine a use-case where node N is embedded within
structure X, as in the scenario below, where we write to X immediately
after dequeue:
CPU 0                                 CPU 1

store X content
(implicit full memory barrier
 implied before uatomic_xchg)
enqueue X->N
                                      dequeue X->N
                                      (which barrier here ?)
                                      write to X immediately.
If we assume that a store that depends on a loaded pointer can never be
reordered before the load of that pointer, then I think we should be good
with only a read_barrier_depends() on CPU 1. Otherwise, we might want to
put a full barrier there to ensure that we don't race with CPU 0 storing
content to X.
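
To make the scenario concrete, here is a minimal sketch using C11 atomics
rather than the actual wfqueue code. All names (struct item, struct node,
slot, publish, take) are hypothetical, the "queue" is a single slot with one
consumer, and memory_order_consume only stands in for
cmm_smp_read_barrier_depends() (current compilers typically implement it as
an acquire load, which is stronger):

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical helper, not taken from urcu: recover X from a pointer to N. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct node {			/* "N" in the scenario */
	struct node *next;
};

struct item {			/* "X" in the scenario */
	int content;
	struct node n;		/* N is embedded within X */
};

/* Hypothetical one-slot stand-in for the queue tail. */
static _Atomic(struct node *) slot;

/* CPU 0 side: store X content, then publish N. */
static void publish(struct item *x, int value)
{
	x->content = value;
	x->n.next = NULL;
	/* A seq_cst exchange stands in for uatomic_xchg and its full barrier. */
	atomic_exchange(&slot, &x->n);
}

/* CPU 1 side: fetch N with dependency ordering, return the enclosing X. */
static struct item *take(void)
{
	/* Stand-in for a plain load followed by cmm_smp_read_barrier_depends(). */
	struct node *n = atomic_load_explicit(&slot, memory_order_consume);

	if (n == NULL)
		return NULL;
	/* Single consumer assumed, so a relaxed store is enough to empty the slot. */
	atomic_store_explicit(&slot, NULL, memory_order_relaxed);
	return container_of(n, struct item, n);
}
```

A caller on CPU 1 would do "x = take(); x->content = ...;". That store is
address-dependent on the pointer returned by take(), which is exactly the
dependent store discussed above.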
Thoughts ?
Thanks,
Mathieu
--
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com