[ltt-dev] [URCU PATCH] atomic: provide seq_cst semantics on powerpc

Mathieu Desnoyers compudj at krystal.dyndns.org
Wed Sep 21 20:27:13 EDT 2011


* Paul E. McKenney (paulmck at linux.vnet.ibm.com) wrote:
> On Tue, Sep 20, 2011 at 12:35:18PM -0400, Mathieu Desnoyers wrote:
> > * Paolo Bonzini (pbonzini at redhat.com) wrote:
> > > Using lwsync before a lwarx/stwcx. sequence provides acq_rel semantics.
> > > Other architectures provide sequential consistency, so replace the lwsync
> > > with sync (see also http://www.cl.cam.ac.uk/~pes20/cpp/cpp0xmappings.html).
> > 
> > Paul, I think Paolo is right here. And it seems to fit with the powerpc
> > barrier models we discussed at plumbers. Can you provide feedback on
> > this ?
> 
> The Cambridge guys have run this through their proof-of-correctness
> stuff, so I am confident in it.
> 
> However...
> 
> This ordering is weaker than that of the gcc __sync series.  To achieve
> that (if we want to, no opinion at the moment), we need to leave the
> leading lwsync and replace the trailing isync with sync.

OK, I pushed commit dabbe4f87217fc22279a02d98db4984b3187b77c that
explains this change.
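
For reference, here is an untested sketch (the function name is made up,
and this is not the committed code) of the 32-bit exchange with the
stronger __sync-style ordering Paul describes above: keep the leading
lwsync and replace the trailing isync with a full sync. The committed
change takes the seq_cst mapping from the Cambridge page instead: sync
before the lwarx/stwcx. loop, isync after it.

	/* Hypothetical sketch, not the code in the tree. */
	static inline unsigned int
	xchg_32_gcc_sync_style(unsigned int *addr, unsigned int val)
	{
		unsigned int result;

		__asm__ __volatile__(
			"lwsync\n"		/* order prior accesses */
		"1:\t"	"lwarx %0,0,%1\n"	/* load and reserve */
			"stwcx. %2,0,%1\n"	/* store conditional */
			"bne- 1b\n"		/* retry if lost reservation */
			"sync\n"		/* full barrier after the update */
				: "=&r" (result)
				: "r" (addr), "r" (val)
				: "memory", "cc");
		return result;
	}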

Paolo: if you are interested in optimising the powerpc uatomic ops
further, then add, sub, inc, dec, "or" and "and" (basically anything
that is not xchg/add_return/cmpxchg) could use a version of cmpxchg
that does not have the lwsync/sync barriers.
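
For illustration, an untested sketch of such a barrier-less variant
(the _relaxed name is made up), which is just the 32-bit cmpxchg loop
from the patch below minus the sync and isync:

	/* Hypothetical sketch: cmpxchg with no ordering guarantee. */
	static inline unsigned int
	_uatomic_cmpxchg_relaxed(unsigned int *addr, unsigned int old,
				 unsigned int _new)
	{
		unsigned int old_val;

		__asm__ __volatile__(
		"1:\t"	"lwarx %0,0,%1\n"	/* load and reserve */
			"cmpw %0,%3\n"		/* if load is not equal to */
			"bne 2f\n"		/* old, fail */
			"stwcx. %2,0,%1\n"	/* else store conditional */
			"bne- 1b\n"		/* retry if lost reservation */
		"2:\n"
				: "=&r" (old_val)
				: "r" (addr), "r" (_new), "r" (old)
				: "memory", "cc");
		return old_val;
	}

A barrier-less add would then simply loop on it:

	unsigned int old;

	do {
		old = *(volatile unsigned int *) addr;
	} while (_uatomic_cmpxchg_relaxed(addr, old, old + v) != old);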

We would also really need some documentation that states this
consistency model, covering:

all cmm_*mb(), cmm_smp_*mb(), xchg, add_return, cmpxchg.
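
As a rough starting point (this is my reading of the thread plus an
assumption about the cmm_mb() implementation, to be verified against
the code), such a table could look like:

	cmm_mb(), cmm_smp_mb()            full memory barrier (sync)
	uatomic_xchg()                    seq_cst (sync before, isync after)
	uatomic_add_return()              seq_cst (sync before, isync after)
	uatomic_cmpxchg()                 seq_cst (sync before, isync after)
	uatomic_add/sub/inc/dec/or/and    no ordering guarantee (with the
	                                  relaxed optimisation above)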

Thanks,

Mathieu

> 
> 							Thanx, Paul
> 
> > Thanks,
> > 
> > Mathieu
> > 
> > > ---
> > >  urcu/uatomic/ppc.h |   18 ++++++------------
> > >  1 files changed, 6 insertions(+), 12 deletions(-)
> > > 
> > > diff --git a/urcu/uatomic/ppc.h b/urcu/uatomic/ppc.h
> > > index 3eb3d63..8485f67 100644
> > > --- a/urcu/uatomic/ppc.h
> > > +++ b/urcu/uatomic/ppc.h
> > > @@ -27,12 +27,6 @@
> > >  extern "C" {
> > >  #endif 
> > >  
> > > -#ifdef __NO_LWSYNC__
> > > -#define LWSYNC_OPCODE	"sync\n"
> > > -#else
> > > -#define LWSYNC_OPCODE	"lwsync\n"
> > > -#endif
> > > -
> > >  #define ILLEGAL_INSTR	".long	0xd00d00"
> > >  
> > >  /*
> > > @@ -53,7 +47,7 @@ unsigned long _uatomic_exchange(void *addr, unsigned long val, int len)
> > >  		unsigned int result;
> > >  
> > >  		__asm__ __volatile__(
> > > -			LWSYNC_OPCODE
> > > +			"sync\n"		/* for sequential consistency */
> > >  		"1:\t"	"lwarx %0,0,%1\n"	/* load and reserve */
> > >  			"stwcx. %2,0,%1\n"	/* else store conditional */
> > >  			"bne- 1b\n"	 	/* retry if lost reservation */
> > > @@ -70,7 +64,7 @@ unsigned long _uatomic_exchange(void *addr, unsigned long val, int len)
> > >  		unsigned long result;
> > >  
> > >  		__asm__ __volatile__(
> > > -			LWSYNC_OPCODE
> > > +			"sync\n"		/* for sequential consistency */
> > >  		"1:\t"	"ldarx %0,0,%1\n"	/* load and reserve */
> > >  			"stdcx. %2,0,%1\n"	/* else store conditional */
> > >  			"bne- 1b\n"	 	/* retry if lost reservation */
> > > @@ -104,7 +98,7 @@ unsigned long _uatomic_cmpxchg(void *addr, unsigned long old,
> > >  		unsigned int old_val;
> > >  
> > >  		__asm__ __volatile__(
> > > -			LWSYNC_OPCODE
> > > +			"sync\n"		/* for sequential consistency */
> > >  		"1:\t"	"lwarx %0,0,%1\n"	/* load and reserve */
> > >  			"cmpw %0,%3\n"		/* if load is not equal to */
> > >  			"bne 2f\n"		/* old, fail */
> > > @@ -125,7 +119,7 @@ unsigned long _uatomic_cmpxchg(void *addr, unsigned long old,
> > >  		unsigned long old_val;
> > >  
> > >  		__asm__ __volatile__(
> > > -			LWSYNC_OPCODE
> > > +			"sync\n"		/* for sequential consistency */
> > >  		"1:\t"	"ldarx %0,0,%1\n"	/* load and reserve */
> > >  			"cmpd %0,%3\n"		/* if load is not equal to */
> > >  			"bne 2f\n"		/* old, fail */
> > > @@ -166,7 +160,7 @@ unsigned long _uatomic_add_return(void *addr, unsigned long val,
> > >  		unsigned int result;
> > >  
> > >  		__asm__ __volatile__(
> > > -			LWSYNC_OPCODE
> > > +			"sync\n"		/* for sequential consistency */
> > >  		"1:\t"	"lwarx %0,0,%1\n"	/* load and reserve */
> > >  			"add %0,%2,%0\n"	/* add val to value loaded */
> > >  			"stwcx. %0,0,%1\n"	/* store conditional */
> > > @@ -184,7 +178,7 @@ unsigned long _uatomic_add_return(void *addr, unsigned long val,
> > >  		unsigned long result;
> > >  
> > >  		__asm__ __volatile__(
> > > -			LWSYNC_OPCODE
> > > +			"sync\n"		/* for sequential consistency */
> > >  		"1:\t"	"ldarx %0,0,%1\n"	/* load and reserve */
> > >  			"add %0,%2,%0\n"	/* add val to value loaded */
> > >  			"stdcx. %0,0,%1\n"	/* store conditional */
> > > -- 
> > > 1.7.6
> > > 
> > > 
> > > _______________________________________________
> > > ltt-dev mailing list
> > > ltt-dev at lists.casi.polymtl.ca
> > > http://lists.casi.polymtl.ca/cgi-bin/mailman/listinfo/ltt-dev
> > > 
> > 
> > -- 
> > Mathieu Desnoyers
> > Operating System Efficiency R&D Consultant
> > EfficiOS Inc.
> > http://www.efficios.com
> 

-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com



