[ltt-dev] Userspace RCU library relicensed to LGPLv2.1
Mathieu Desnoyers
compudj at krystal.dyndns.org
Thu May 14 16:11:03 EDT 2009
* Mathieu Desnoyers (mathieu.desnoyers at polymtl.ca) wrote:
> * Mathieu Desnoyers (mathieu.desnoyers at polymtl.ca) wrote:
> > * Steve Munroe (sjmunroe at us.ibm.com) wrote:
> > > Steven J. Munroe
> > > Linux on Power Toolchain Architect
> > > IBM Corporation, Linux Technology Center
> > >
> > >
> > > libc-alpha-owner at sourceware.org wrote on 05/14/2009 08:06:39 AM:
> > >
> > > > * Jan Blunck (jblunck at suse.de) wrote:
> > > > > On Wed, May 13, Mathieu Desnoyers wrote:
> > > > >
> > > > > > It currently supports x86 and powerpc. LGPL-compatible low-level
> > > > > > primitive headers will be required for other architectures. Note that
> > > > > > the build system is at best rudimentary at the moment.
> > > > >
> > > > > > Is there a specific reason why the atomic_ops implementation was
> > > > > > used instead of the atomic builtins that come with GCC? IIRC, they
> > > > > > are implemented on all architectures already.
> > > > >
> > > >
> > > > Hi Jan,
> > > >
> > > > As Evgeniy said, there is the compiler version issue, but in this case
> > > > there is more:
> > > >
> > > > If we look at
> > > > http://gcc.gnu.org/onlinedocs/gcc-4.1.2/gcc/Atomic-Builtins.html
> > > >
> > > > The instruction closest to an xchg() instruction (to exchange a pointer
> > > > in memory) is :
> > > >
> > > >
> > > > "type __sync_lock_test_and_set (type *ptr, type value, ...)
> > > >
> > > > This builtin, as described by Intel, is not a traditional
> > > > test-and-set operation, but rather an atomic exchange operation. It
> > > > writes value into *ptr, and returns the previous contents of *ptr.
> > > >
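For illustration, a minimal sketch of a pointer exchange written with this
builtin. The helper names example_xchg_ptr() and example_xchg_ptr_demo() are
made up for this example and are not part of the userspace RCU library. Note
that the gcc documentation linked above also says that this builtin is an
acquire barrier rather than a full barrier, and that some targets only support
storing the constant 1.

```c
#include <assert.h>

/*
 * Hypothetical helper (not a userspace RCU API): store newval into *slot
 * and return the pointer that was there before, using the gcc builtin.
 */
static void *example_xchg_ptr(void **slot, void *newval)
{
	/* Atomic exchange; gcc documents it as an acquire barrier only. */
	return __sync_lock_test_and_set(slot, newval);
}

/* Usage: replace a head pointer and check which pointer came back. */
static void example_xchg_ptr_demo(void)
{
	int a = 1, b = 2;
	void *head = &a;
	void *prev = example_xchg_ptr(&head, &b);

	assert(prev == (void *)&a);	/* old value is returned */
	assert(head == (void *)&b);	/* new value is stored */
}
```
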
> > > It seems like either:
> > >
> > > bool __sync_bool_compare_and_swap (type *ptr, type oldval, type newval, ...)
> > > type __sync_val_compare_and_swap (type *ptr, type oldval, type newval, ...)
> > > These builtins perform an atomic compare and swap. That is, if the
> > > current value of *ptr is oldval, then write newval into *ptr.
> > >
> > >
> > > The “bool” version returns true if the comparison is successful and
> > > newval was written. The “val” version returns the contents of *ptr
> > > before the operation.
> > >
> > > would do the trick with __sync_val_compare_and_swap and a simple while
> > > loop. Most of the time a single iteration is all that is required, and on
> > > PowerPC it is the same loop you would need for xchg().
> > >
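The loop suggested above could look roughly like the sketch below. The name
example_xchg_via_cas() and the choice of unsigned long as the operand type are
assumptions made for this example; only __sync_val_compare_and_swap comes from
the gcc documentation.

```c
#include <assert.h>

/*
 * Hypothetical helper: emulate an atomic exchange with a compare-and-swap
 * retry loop. Returns the value that was replaced.
 */
static unsigned long example_xchg_via_cas(unsigned long *ptr,
					  unsigned long newval)
{
	unsigned long old, seen;

	/* First guess at the current value; the CAS below validates it. */
	seen = *(volatile unsigned long *)ptr;
	do {
		old = seen;
		/* Returns what was in *ptr; the store happened iff it equals old. */
		seen = __sync_val_compare_and_swap(ptr, old, newval);
	} while (seen != old);	/* another writer got in between: retry */

	return old;
}

/* Usage: the uncontended case completes in a single iteration. */
static void example_xchg_via_cas_demo(void)
{
	unsigned long v = 5;

	assert(example_xchg_via_cas(&v, 9) == 5);
	assert(v == 9);
}
```

The initial read, the comparison and the retry branch are the extra work this
approach carries compared with a single exchange operation.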
> >
> > Even on powerpc it involves extra unneeded branches and memory barriers.
> >
> > Restating part of my answer to Jan Blunck, the downsides of using a
> > CAS-based solution on many architectures are:
> >
> > - more cache line exchanges (shared + exclusive access)
> > - larger code size (read, extra branches)
> > - slower execution (extra branches)
> > - unneeded memory barriers: release semantics are part of the
> >   __sync_val_compare_and_swap primitive and, unless there is a scenario
> >   I am missing, seem unneeded for xchg().
> >
> > While I can see this as a temporary fall-back for architectures where a
> > proper atomic primitive is not implemented, it does not strike me as a
> > neat solution.
> >
>
> To outline more clearly the direction I plan to follow for the userspace
> RCU library:
>
> Given that we need our own primitives anyway, because:
>
> - atomic ops are not supported by old compilers
> - gcc lacks a proper xchg primitive
>
> and given that our own primitives would be better suited to our needs
> than the existing gcc primitives, I think we should not bother trying to
> use the gcc primitives and should use our own in every case. This would
> reduce the risk of the library behaving differently when compiled with
> different gcc versions. Using gcc primitives in some cases would make
> testing and maintenance more difficult and bring no gain in the end.
>
> Patches to extend the userspace RCU code-base to include (ideally)
> MIT-style or LGPLv2.1-compatible xchg() primitives and also
> arch-specific memory barriers for other architectures are very welcome.
> See arch_x86.h and arch_atomic_x86.h for the detailed list of required
> per-architecture primitives. (note that get_cycles is only needed for
> the test programs, not the library per se)
>
Note, however, that if someone sends a patch that allows selecting the gcc
atomic operations as a compile-time option, I would be inclined to merge
it, especially if it proves useful for Linux distributions.
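
As a rough sketch of what such a compile-time option might look like (the
macro EXAMPLE_USE_GCC_ATOMICS and the function example_xchg() are hypothetical
names chosen for this example, not existing library names): the x86 branch
uses the xchg instruction, which is implicitly locked when it has a memory
operand, and every other case falls back to the gcc builtin.

```c
#include <assert.h>

/*
 * EXAMPLE_USE_GCC_ATOMICS is a hypothetical build option (for instance
 * passed as -DEXAMPLE_USE_GCC_ATOMICS); it is not an existing library flag.
 * Non-x86 targets also take the builtin path in this sketch.
 */
#if defined(EXAMPLE_USE_GCC_ATOMICS) || \
	!(defined(__i386__) || defined(__x86_64__))

/* gcc builtin path: atomic exchange, documented as an acquire barrier. */
static inline unsigned long example_xchg(unsigned long *addr,
					 unsigned long val)
{
	return __sync_lock_test_and_set(addr, val);
}

#else

/* x86 path: xchg with a memory operand is atomic without a lock prefix. */
static inline unsigned long example_xchg(unsigned long *addr,
					 unsigned long val)
{
	__asm__ __volatile__("xchg %0, %1"
			     : "+r" (val), "+m" (*addr)
			     :
			     : "memory");
	return val;
}

#endif

/* Usage: either path returns the old value and stores the new one. */
static inline void example_xchg_demo(void)
{
	unsigned long v = 1;

	assert(example_xchg(&v, 2) == 1);
	assert(v == 2);
}
```

Keeping both paths behind one function name means callers do not change when
the option is toggled, which is what would make the two builds comparable in
testing.
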
Mathieu
> Mathieu
--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68