[lttng-dev] [PATCH 0/3] rculfhash: error checking fixes

Eric Wong normalperson at yhbt.net
Thu Jul 31 21:44:17 EDT 2014


Mathieu Desnoyers <mathieu.desnoyers at efficios.com> wrote:
> I'm trying something with transparent union. See patch in separate
> email.

Thanks!  I didn't know about the transparent union feature in GCC,
looks much nicer than what I would've done :)

> > > > * cmpxchg_double (cmpxchg16b on x86-64) so lfstack can use
> > > >   a lock-free stack for single pop operations.  I'm currently using
> > > >   ck_stack from ConcurrencyKit, but generally prefer using the
> > > >   URCU APIs and it would be great if lfstack could support this
> > > >   on some arches.
> > > 
> > > Is there a way to implement a fallback for architectures that don't
> > > have the double cmpxchg ?
> > 
> > Unfortunately not.  I have completely separate code paths, and I also
> > cannot support early AMD64 machines, which lack cmpxchg16b.
> 
> This means we would have to dynamically detect whether the CPU supports
> the instruction, and fall back to a different way of doing things.
> So we would have to plan space (e.g. a union) for both the cmpxchg16b
> and the fallback, with possibly a compiler flag that would allow
> compiling out the fallback if the user really cares about compactness,
> and not about portability.

Yeah, it's a bit of a pain :/
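
For illustration, the detect-and-fall-back idea might look roughly like
the sketch below.  It assumes GCC or Clang on x86-64 (for <cpuid.h> and
__get_cpuid()); the names cpu_has_cmpxchg16b, union stack_head, and
stack_pick_impl are hypothetical and not part of URCU or ck_stack, and
the fallback layout is only a placeholder.

```c
#include <stdbool.h>
#include <stdint.h>
#if defined(__x86_64__)
#include <cpuid.h>	/* GCC/Clang helper header providing __get_cpuid() */
#endif

/* CPUID leaf 1 reports CMPXCHG16B support in ECX bit 13. */
#define CPUID1_ECX_CX16	(1u << 13)

/* Hypothetical helper: true if the running CPU has cmpxchg16b. */
static bool cpu_has_cmpxchg16b(void)
{
#if defined(__x86_64__)
	unsigned int eax, ebx, ecx, edx;

	if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
		return false;
	return (ecx & CPUID1_ECX_CX16) != 0;
#else
	return false;	/* other architectures always take the fallback */
#endif
}

/*
 * Hypothetical head layout that reserves room for both strategies.
 * The double-word variant is 16-byte aligned, as cmpxchg16b requires.
 */
union stack_head {
	struct {
		_Alignas(16) void *head;
		uintptr_t generation;
	} dw;
	struct {
		void *head;
		int lock;	/* placeholder for whatever the fallback needs */
	} fallback;
};

enum stack_impl { STACK_IMPL_FALLBACK, STACK_IMPL_CMPXCHG16B };

/* Decide once (e.g. at init time) which code path to use. */
static enum stack_impl stack_pick_impl(void)
{
	return cpu_has_cmpxchg16b() ? STACK_IMPL_CMPXCHG16B
				    : STACK_IMPL_FALLBACK;
}
```

A compile-time flag could then drop the fallback member and the runtime
check for users who only target CPUs known to have the instruction.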

> > > My intent is that if we start doing
> > > optimizations for some architectures, the APIs can still be used
> > > as is by applications ported to other architectures (modulo a
> > > performance penalty if unavoidable).
> > 
> > Understandable.  I wonder if regular cmpxchg with pointer-packing
> > for the generation counter works.  I'll have to try that.
> 
> Not sure I understand your idea here.

The lock-free pop in ck_stack relies on an extra generation counter
field to avoid the ABA problem.  I could pack that counter into the
normal head pointer of the stack, since current x86-64 only uses a
48-bit address space; that would leave 20 bits for the counter (the 16
unused upper bits plus the low bits freed by alignment).  I doubt it
could be trusted on systems with many cores/threads, though, since a
20-bit counter wraps after about a million updates.
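
A minimal sketch of that packing, using C11 atomics rather than the
URCU or ck_stack APIs.  Everything here is an assumption for
illustration: the names (struct node, struct tagged_stack, tagged_push,
tagged_pop), 16-byte node alignment, and user-space addresses that fit
in 48 bits with the upper bits clear.  Like any stack of this kind, it
also assumes nodes are not freed while a concurrent pop may still read
them.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define ADDR_BITS	48	/* assumed usable address bits */
#define ALIGN_BITS	4	/* assumed 16-byte node alignment */
#define PTR_FIELD_BITS	(ADDR_BITS - ALIGN_BITS)	/* 44 */
#define TAG_BITS	(64 - PTR_FIELD_BITS)		/* 20 */
#define PTR_FIELD_MASK	((UINT64_C(1) << PTR_FIELD_BITS) - 1)
#define TAG_MASK	((UINT64_C(1) << TAG_BITS) - 1)

struct node {
	_Alignas(16) struct node *next;
};

struct tagged_stack {
	_Atomic uint64_t head;	/* low 44 bits: pointer >> 4, high 20: tag */
};

static uint64_t pack(struct node *p, uint64_t tag)
{
	return ((uint64_t)(uintptr_t)p >> ALIGN_BITS) |
	       ((tag & TAG_MASK) << PTR_FIELD_BITS);
}

static struct node *unpack_ptr(uint64_t w)
{
	return (struct node *)(uintptr_t)((w & PTR_FIELD_MASK) << ALIGN_BITS);
}

static uint64_t unpack_tag(uint64_t w)
{
	return w >> PTR_FIELD_BITS;
}

static void tagged_push(struct tagged_stack *s, struct node *n)
{
	uint64_t old = atomic_load(&s->head);
	uint64_t new;

	do {
		n->next = unpack_ptr(old);
		new = pack(n, unpack_tag(old));	/* push keeps the tag */
	} while (!atomic_compare_exchange_weak(&s->head, &old, new));
}

static struct node *tagged_pop(struct tagged_stack *s)
{
	uint64_t old = atomic_load(&s->head);

	for (;;) {
		struct node *n = unpack_ptr(old);

		if (!n)
			return NULL;	/* empty stack */
		/* Bump the tag on every pop; it wraps after 2^20 pops. */
		uint64_t new = pack(n->next, unpack_tag(old) + 1);

		/* On failure, old is reloaded with the current head. */
		if (atomic_compare_exchange_weak(&s->head, &old, new))
			return n;
	}
}
```

The tag only changes on pop, which is enough to make a stale
compare-and-swap fail after the head node has been removed and pushed
back, as long as the counter has not wrapped in between.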

On the other hand, my choice of a lock-free stack already wastes cycles
spinning when the stack is empty; so I might choose something else
entirely.

Thanks for your responses!


