[ltt-dev] Userspace RCU library relicensed to LGPLv2.1

Thu May 14 09:06:39 EDT 2009

* Jan Blunck (jblunck at suse.de) wrote:
> On Wed, May 13, Mathieu Desnoyers wrote:
> 
> > It currently supports x86 and powerpc. LGPL-compatible low-level
> > primitive headers will be required for other architectures. Note that
> > the build system is at best rudimentary at the moment.
> 
> Is there a specific reason why the atomic_ops implementation was used instead
> of the atomic builtins that come with GCC? IIRC, they are implemented on all
> architectures already.
> 

Hi Jan,

As Evgeniy said, there is the compiler version issue, but in this case
there is more to it:

If we look at
http://gcc.gnu.org/onlinedocs/gcc-4.1.2/gcc/Atomic-Builtins.html

the builtin closest to an xchg() instruction (to exchange a pointer
in memory) is:

"type __sync_lock_test_and_set (type *ptr, type value, ...)

    This builtin, as described by Intel, is not a traditional
test-and-set operation, but rather an atomic exchange operation. It
writes value into *ptr, and returns the previous contents of *ptr.

    Many targets have only minimal support for such locks, and do not
support a full exchange operation. In this case, a target may support
reduced functionality here by which the only valid value to store is the
immediate constant 1. The exact value actually stored in *ptr is
implementation defined.

    This builtin is not a full barrier, but rather an acquire barrier.
This means that references after the builtin cannot move to (or be
speculated to) before the builtin, but previous memory stores may not be
globally visible yet, and previous memory loads may not yet be
satisfied."

The second paragraph is a concern to me. I would rather provide my own
xchg() primitive than use a primitive which "might" work, but might
also only accept writing the immediate value 1, depending on the
architecture.
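
To illustrate the concern above, here is a minimal sketch of a pointer
exchange that does not rely on __sync_lock_test_and_set(). It is not the
library's xchg() (which uses per-architecture primitives); the name
xchg_pointer_sketch() is hypothetical. It assumes a GCC that provides
__sync_val_compare_and_swap(), which the same documentation page
describes as a full barrier, and retries until the swap succeeds:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not the library's xchg(): atomically store new_ptr
 * into *addr and return the value that was there before. Built on a
 * compare-and-swap retry loop instead of __sync_lock_test_and_set().
 */
static void *xchg_pointer_sketch(void **addr, void *new_ptr)
{
	void *old, *seen;

	assert(addr != NULL);
	do {
		/* Snapshot the current value; volatile forces a fresh load. */
		old = *(void * volatile *)addr;
		/* Install new_ptr only if *addr still holds the snapshot. */
		seen = __sync_val_compare_and_swap(addr, old, new_ptr);
	} while (seen != old);	/* another writer got in first: retry */

	return old;
}
```

A caller would use it as old = xchg_pointer_sketch(&slot, new). A retry
loop is generally slower than a native exchange instruction, which is
one reason to prefer a per-architecture primitive where one exists.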

I use the xchg() operation for my rcu_xchg_pointer() primitive to permit
exchanging a value without extra locking when a copy of the old content
is not needed. Note that rcu_publish_content() is an API which simply
calls rcu_xchg_pointer() and then synchronize_rcu(). Therefore, the
pointer it returns can be safely freed. Here is an example of where
xchg() is useful.

If the old value needs to be copied, we need to add mutexes to protect
the data copy and make sure we are not racing with other writers :

struct datatype *rcu_pointer;

void writer(void)
{
  struct datatype *new, *old;

  new = malloc(sizeof(*new));  // size of the struct, not of the pointer

  pthread_mutex_lock(&somemutex);

  old = rcu_pointer;
  memcpy(new, old, sizeof(*new));

  // modify new

  rcu_publish_content(&rcu_pointer, new);

  pthread_mutex_unlock(&somemutex);

  free(old);
}

But if we don't care about copying the old content (we just need to
replace it with new content), then locking is not needed :

void writer(void)
{
  struct datatype *new, *old;

  new = malloc(sizeof(*new));  // size of the struct, not of the pointer

  // populate some data in new

  old = rcu_publish_content(&rcu_pointer, new);

  free(old);
}
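
For reference, the way rcu_publish_content() combines an exchange with a
grace period wait can be sketched roughly as follows. This is only an
illustration, not the library source: all three function names are
hypothetical stand-ins, the grace period function is a stub that merely
counts calls, and the exchange uses __atomic_exchange_n(), a GCC builtin
that only appeared in later compilers (4.7 and up):

```c
#include <assert.h>
#include <stddef.h>

/* Counts calls so the sketch can be checked. */
static int grace_periods_waited;

/* Placeholder for synchronize_rcu(): it does NOT wait for any reader. */
static void synchronize_rcu_stub(void)
{
	grace_periods_waited++;
}

/* Stand-in for rcu_xchg_pointer(): swap in new_ptr, return the old pointer. */
static void *xchg_pointer_stub(void **addr, void *new_ptr)
{
	return __atomic_exchange_n(addr, new_ptr, __ATOMIC_SEQ_CST);
}

/* Sketch of the publish step: exchange, wait, hand back the old pointer. */
static void *publish_content_sketch(void **addr, void *new_ptr)
{
	void *old;

	assert(addr != NULL);
	old = xchg_pointer_stub(addr, new_ptr);
	/* With a real wait here, no reader would still be using old. */
	synchronize_rcu_stub();
	/* Exactly one caller receives this old pointer and may free it. */
	return old;
}
```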

The fact that xchg exchanges the old pointer with the new pointer
atomically guarantees that each old pointer is returned to exactly one
writer, so that writer's free(old) is the only call freeing it. One
could keep the pointers returned by rcu_xchg_pointer() in a pool of
"ongoing grace period" pointers and do a single synchronize_rcu() call
periodically (in a garbage collection thread for instance) before
freeing the whole pool. This would allow fast structure updates at the
expense of some bounded amount of extra memory.
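
The pool idea could look something like the sketch below. Everything in
it is hypothetical (struct defer_pool, the function names, the capacity
of 64), the grace period function is a stub that only counts calls, and
the pool is assumed to be used by a single thread or under a lock:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define POOL_CAPACITY 64	/* arbitrary bound on the extra memory held */

/* Old pointers waiting for a grace period before they can be freed. */
struct defer_pool {
	void *ptrs[POOL_CAPACITY];
	size_t count;
};

static int pool_grace_periods;

/* Placeholder for synchronize_rcu(): it does NOT wait for any reader. */
static void pool_synchronize_rcu_stub(void)
{
	pool_grace_periods++;
}

/* One grace period wait covers every queued pointer. Returns how many were freed. */
static size_t defer_pool_flush(struct defer_pool *pool)
{
	size_t i, freed = pool->count;

	if (freed == 0)
		return 0;	/* nothing queued, so skip the wait */
	pool_synchronize_rcu_stub();
	for (i = 0; i < freed; i++)
		free(pool->ptrs[i]);
	pool->count = 0;
	return freed;
}

/* Queue an old pointer; flush first if the pool is already full. */
static void defer_pool_add(struct defer_pool *pool, void *old)
{
	assert(pool != NULL);
	if (pool->count == POOL_CAPACITY)
		defer_pool_flush(pool);
	pool->ptrs[pool->count++] = old;
}
```

A writer would pass the pointer returned by rcu_xchg_pointer() to
defer_pool_add(), and a periodic thread would call defer_pool_flush().
The capacity is what bounds the extra memory.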

Mathieu

> Regards,
> 	Jan
> 
> -- 
> Jan Blunck <jblunck at suse.de>

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68