[lttng-dev] [RFC PATCH urcu] Document uatomic operations
Mathieu Desnoyers
mathieu.desnoyers at efficios.com
Wed May 16 14:17:42 EDT 2012
* Paul E. McKenney (paulmck at linux.vnet.ibm.com) wrote:
> On Tue, May 15, 2012 at 08:10:03AM -0400, Mathieu Desnoyers wrote:
> > * Paul E. McKenney (paulmck at linux.vnet.ibm.com) wrote:
> > > On Mon, May 14, 2012 at 10:39:01PM -0400, Mathieu Desnoyers wrote:
> > > > Document each atomic operation provided by urcu/uatomic.h, along with
> > > > their memory barrier guarantees.
> > >
> > > Great to see the documentation!!! Some comments below.
> > >
> > > Thanx, Paul
> > >
> > > > Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers at efficios.com>
> > > > ---
> > > > diff --git a/doc/Makefile.am b/doc/Makefile.am
> > > > index bec1d7c..db9811c 100644
> > > > --- a/doc/Makefile.am
> > > > +++ b/doc/Makefile.am
> > > > @@ -1 +1 @@
> > > > -dist_doc_DATA = rcu-api.txt
> > > > +dist_doc_DATA = rcu-api.txt uatomic-api.txt
> > > > diff --git a/doc/uatomic-api.txt b/doc/uatomic-api.txt
> > > > new file mode 100644
> > > > index 0000000..3605acf
> > > > --- /dev/null
> > > > +++ b/doc/uatomic-api.txt
> > > > @@ -0,0 +1,80 @@
> > > > +Userspace RCU Atomic Operations API
> > > > +by Mathieu Desnoyers and Paul E. McKenney
> > > > +
> > > > +
> > > > +This document describes the <urcu/uatomic.h> API. These are the atomic
> > > > +operations provided by the Userspace RCU library. The general rule
> > > > +regarding memory barriers is that only uatomic_xchg(),
> > > > +uatomic_cmpxchg(), uatomic_add_return(), and uatomic_sub_return() imply
> > > > +full memory barriers before and after the atomic operation. Other
> > > > +primitives don't guarantee any memory barrier.
> > > > +
> > > > +Only atomic operations performed on integers ("int" and "long", signed
> > > > +and unsigned) are supported on all architectures. Some architectures
> > > > +also support 1-byte and 2-byte atomic operations; these respectively
> > > > +define UATOMIC_HAS_ATOMIC_BYTE and UATOMIC_HAS_ATOMIC_SHORT when
> > > > +uatomic.h is included. An attempt to perform an atomic write to a
> > > > +type size not supported by the architecture will trigger an illegal
> > > > +instruction.
> > > > +
> > > > +In the description below, "type" is a type that can be atomically
> > > > +written to by the architecture. It needs to be at most word-sized, and
> > > > +its alignment needs to be greater than or equal to its size.
> > > > +
> > > > +type uatomic_set(type *addr, type v)
> > > > +
> > > > + Atomically write @v into @addr.
> > >
> > > Wouldn't this instead be "void uatomic_set(type *addr, type v)"?
> >
> > Well, in that case, we'd need to change the macro. Currently,
> > _uatomic_set maps directly to:
> >
> > #define _uatomic_set(addr, v) CMM_STORE_SHARED(*(addr), (v))
> >
> > and CMM_STORE_SHARED returns v. The question becomes: should we change
> > _uatomic_set or CMM_STORE_SHARED so they don't return anything, or
> > document that they return something ?
> >
> > One thing I noticed is that linters often complain that the return value
> > of CMM_STORE_SHARED is never used. One thing we could look into is to
> > try some gcc attributes and/or linter annotations to flag this return
> > value as possibly unused. Thoughts ?
>
> Hmmm...
>
> Does the following work?
>
> #define _uatomic_set(addr, v) ((void)CMM_STORE_SHARED(*(addr), (v)))
Well, it would work, yes, but then we would not be consistent between
return values or no return values of:
uatomic_set()
rcu_assign_pointer()
rcu_set_pointer()
If you notice, in the Linux kernel, rcu_assign_pointer() returns the
new pointer value. But you are right that atomic_set() does not return
anything. So which consistency would be best to keep ?
Thanks,
Mathieu
>
> > > By "Atomically write @v into @addr", what is meant is that no concurrent
> > > operation that reads from addr will see partial effects of uatomic_set(),
> > > correct? In other words, the concurrent read will either see v or
> > > the old value, not a mush of the two.
> >
> > yep. I added that clarification.
> >
> > >
> > > > +
> > > > +type uatomic_read(type *addr)
> > > > +
> > > > +	Atomically read the content of @addr.
> > >
> > > Similar comments on the meaning of "atomically". This may sound picky,
> > > but people coming from an x86 environment might otherwise assume that
> > > there is lock prefix involved...
> >
> > same.
> >
> > >
> > > > +
> > > > +type uatomic_cmpxchg(type *addr, type old, type new)
> > > > +
> > > > + Atomically check if @addr contains @old. If true, then replace
> > > > + the content of @addr by @new. Return the value previously
> > > > +	contained by @addr. This function implies a full memory barrier
> > > > + before and after the atomic operation.
> > >
> > > Suggest "then atomically replace" or some such. It might not hurt
> > > to add that this is an atomic read-modify-write operation.
> >
> > Updated to:
> >
> > type uatomic_cmpxchg(type *addr, type old, type new)
> >
> > An atomic read-modify-write operation that performs this
> > sequence of operations atomically: check if @addr contains @old.
> > If true, then replace the content of @addr by @new. Return the
> > 	value previously contained by @addr. This function implies a full
> > memory barrier before and after the atomic operation.
> >
> > >
> > > Similar comments on the other value-returning atomics.
> >
> > Will do something similar.
> >
> > >
> > > > +
> > > > +type uatomic_xchg(type *addr, type new)
> > > > +
> > > > + Atomically replace the content of @addr by @new, and return the
> > > > +	value previously contained by @addr. This function implies a full
> > > > + memory barrier before and after the atomic operation.
> > > > +
> > > > +type uatomic_add_return(type *addr, type v)
> > > > +type uatomic_sub_return(type *addr, type v)
> > > > +
> > > > + Atomically increment/decrement the content of @addr by @v, and
> > > > +	return the resulting value. These functions imply a full memory
> > > > + barrier before and after the atomic operation.
> > > > +
> > > > +void uatomic_and(type *addr, type mask)
> > > > +void uatomic_or(type *addr, type mask)
> > > > +
> > > > + Atomically write the result of bitwise "and"/"or" between the
> > > > + content of @addr and @mask into @addr. Memory barriers are
> > > > + provided by explicitly using cmm_smp_mb__before_uatomic_and(),
> > > > + cmm_smp_mb__after_uatomic_and(),
> > > > + cmm_smp_mb__before_uatomic_or(), and
> > > > + cmm_smp_mb__after_uatomic_or().
> > >
> > > I suggest replacing "Memory barriers are provided ..." with something like
> > > "These operations do not necessarily imply memory barriers. If memory
> > > barriers are needed, they may be provided ...". Then perhaps add a
> > > sentence stating that the advantage of using the four __before_/__after_
> > > primitives is that they are no-ops on architectures in which the underlying
> > > atomic instructions implicitly supply the needed memory barriers.
> > >
> > > Similar comments on the other non-value-returning atomic operations below.
> >
> > OK, done.
> >
> > Here is the update:
> >
> >
> > diff --git a/doc/Makefile.am b/doc/Makefile.am
> > index 27d3793..3422653 100644
> > --- a/doc/Makefile.am
> > +++ b/doc/Makefile.am
> > @@ -1 +1 @@
> > -dist_doc_DATA = rcu-api.txt cds-api.txt
> > +dist_doc_DATA = rcu-api.txt cds-api.txt uatomic-api.txt
> > diff --git a/doc/uatomic-api.txt b/doc/uatomic-api.txt
> > new file mode 100644
> > index 0000000..3ad8fbb
> > --- /dev/null
> > +++ b/doc/uatomic-api.txt
> > @@ -0,0 +1,102 @@
> > +Userspace RCU Atomic Operations API
> > +by Mathieu Desnoyers and Paul E. McKenney
> > +
> > +
> > +This document describes the <urcu/uatomic.h> API. These are the atomic
> > +operations provided by the Userspace RCU library. The general rule
> > +regarding memory barriers is that only uatomic_xchg(),
> > +uatomic_cmpxchg(), uatomic_add_return(), and uatomic_sub_return() imply
> > +full memory barriers before and after the atomic operation. Other
> > +primitives don't guarantee any memory barrier.
> > +
> > +Only atomic operations performed on integers ("int" and "long", signed
> > +and unsigned) are supported on all architectures. Some architectures
> > +also support 1-byte and 2-byte atomic operations; these respectively
> > +define UATOMIC_HAS_ATOMIC_BYTE and UATOMIC_HAS_ATOMIC_SHORT when
> > +uatomic.h is included. An attempt to perform an atomic write to a
> > +type size not supported by the architecture will trigger an illegal
> > +instruction.
> > +
> > +In the description below, "type" is a type that can be atomically
> > +written to by the architecture. It needs to be at most word-sized, and
> > +its alignment needs to be greater than or equal to its size.
> > +
> > +type uatomic_set(type *addr, type v)
> > +
> > + Atomically write @v into @addr. By "atomically", we mean that no
> > + concurrent operation that reads from addr will see partial
> > + effects of uatomic_set().
>
> Good clarification!
>
> If this is to return the value written, it should say so. (I would prefer
> no return, as this is the most common usage and the value written is easily
> available anyway.)
>
> > +type uatomic_read(type *addr)
> > +
> > +	Atomically read the content of @addr. By "atomically", we mean that
> > + uatomic_read() cannot see a partial effect of any concurrent
> > + uatomic update.
> > +
> > +type uatomic_cmpxchg(type *addr, type old, type new)
> > +
> > + An atomic read-modify-write operation that performs this
> > + sequence of operations atomically: check if @addr contains @old.
> > + If true, then replace the content of @addr by @new. Return the
> > +	value previously contained by @addr. This function implies a full
> > + memory barrier before and after the atomic operation.
>
> Good clarification! (And below as well.)
>
> > +type uatomic_xchg(type *addr, type new)
> > +
> > + An atomic read-modify-write operation that performs this sequence
> > + of operations atomically: replace the content of @addr by @new,
> > + and return the value previously contained by @addr. This
> > +	function implies a full memory barrier before and after the atomic
> > + operation.
> > +
> > +type uatomic_add_return(type *addr, type v)
> > +type uatomic_sub_return(type *addr, type v)
> > +
> > + An atomic read-modify-write operation that performs this
> > + sequence of operations atomically: increment/decrement the
> > +	content of @addr by @v, and return the resulting value. These
> > +	functions imply a full memory barrier before and after the atomic
> > + operation.
> > +
> > +void uatomic_and(type *addr, type mask)
> > +void uatomic_or(type *addr, type mask)
> > +
> > + Atomically write the result of bitwise "and"/"or" between the
> > + content of @addr and @mask into @addr.
> > + These operations do not necessarily imply memory barriers.
> > + If memory barriers are needed, they may be provided by
> > + explicitly using
> > + cmm_smp_mb__before_uatomic_and(),
> > + cmm_smp_mb__after_uatomic_and(),
> > + cmm_smp_mb__before_uatomic_or(), and
> > + cmm_smp_mb__after_uatomic_or(). These explicit barriers are
> > +	no-ops on architectures in which the underlying atomic
> > + instructions implicitly supply the needed memory barriers.
>
> Good as well for these!
>
> Thanx, Paul
>
> > +void uatomic_add(type *addr, type v)
> > +void uatomic_sub(type *addr, type v)
> > +
> > + Atomically increment/decrement the content of @addr by @v.
> > + These operations do not necessarily imply memory barriers.
> > + If memory barriers are needed, they may be provided by
> > + explicitly using
> > + cmm_smp_mb__before_uatomic_add(),
> > + cmm_smp_mb__after_uatomic_add(),
> > + cmm_smp_mb__before_uatomic_sub(), and
> > + cmm_smp_mb__after_uatomic_sub(). These explicit barriers are
> > +	no-ops on architectures in which the underlying atomic
> > + instructions implicitly supply the needed memory barriers.
> > +
> > +void uatomic_inc(type *addr)
> > +void uatomic_dec(type *addr)
> > +
> > + Atomically increment/decrement the content of @addr by 1.
> > + These operations do not necessarily imply memory barriers.
> > + If memory barriers are needed, they may be provided by
> > + explicitly using
> > + cmm_smp_mb__before_uatomic_inc(),
> > + cmm_smp_mb__after_uatomic_inc(),
> > + cmm_smp_mb__before_uatomic_dec(), and
> > + cmm_smp_mb__after_uatomic_dec(). These explicit barriers are
> > +	no-ops on architectures in which the underlying atomic
> > + instructions implicitly supply the needed memory barriers.
> >
> >
> > --
> > Mathieu Desnoyers
> > Operating System Efficiency R&D Consultant
> > EfficiOS Inc.
> > http://www.efficios.com
> >
>
--
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com