[lttng-dev] [PATCH 0/3] rculfhash: error checking fixes
Eric Wong
normalperson at yhbt.net
Thu Jul 31 21:44:17 EDT 2014
Mathieu Desnoyers <mathieu.desnoyers at efficios.com> wrote:
> I'm trying something with transparent union. See patch in separate
> email.
Thanks! I didn't know about the transparent union feature in GCC,
looks much nicer than what I would've done :)
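
For anyone else who has not met the feature: below is a minimal sketch of a GCC transparent union, not the code from the patch itself. The names head_plain, head_locked, head_arg_t and head_len are made up for illustration. The point is only that one function can accept either pointer type without the caller casting.

```c
#include <assert.h>

/* Hypothetical types: the "locked" variant embeds the plain one first. */
struct head_plain {
	int len;
};

struct head_locked {
	struct head_plain head;	/* must stay the first member */
	int lock;
};

/*
 * With __transparent_union__, a function taking head_arg_t can be called
 * with a pointer of any member type. The argument is passed using the
 * calling convention of the first member.
 */
typedef union {
	struct head_plain *plain;
	struct head_locked *locked;
} __attribute__((__transparent_union__)) head_arg_t;

static int head_len(head_arg_t h)
{
	/* Valid for both callers because head_plain is the first member. */
	return h.plain->len;
}
```

A caller can then write head_len(&plain_head) or head_len(&locked_head) directly, and passing an unrelated pointer type is still rejected at compile time.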
> > > > * cmpxchg_double (cmpxchg16b on x86-64) so lfstack can use
> > > > a lock-free stack for single pop operations. I'm currently using
> > > > ck_stack from ConcurrencyKit, but generally prefer using the
> > > > URCU APIs and it would be great if lfstack could support this
> > > > on some arches.
> > >
> > > Is there a way to implement a fallback for architectures that don't
> > > have the double cmpxchg ?
> >
> > Unfortunately not. I have completely separate code paths, also cannot
> > support early AMD64 machines which lack cmpxchg16b.
>
> This means we would have to dynamically detect if the CPU supports
> the instruction, and fall back to a different way of doing things.
> So we would have to plan space (e.g. a union) for both the cmpxchg16b
> and the fallback, with possibly a compiler flag that would allow
> compiling out the fallback if the user really cares about compactness,
> and not about portability.
Yeah, it's a bit of a pain :/
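
To make the discussion concrete, here is a rough sketch of what the detection plus union layout could look like. Everything named lfs_* and the LFS_NO_FALLBACK flag are hypothetical and not part of URCU. The CPUID check uses leaf 1, ECX bit 13, which is the documented cmpxchg16b feature bit.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>
#if defined(__x86_64__)
#include <cpuid.h>
#endif

#define LFS_CX16_BIT	(1u << 13)	/* CPUID.01H:ECX bit 13 */

/* Hypothetical head: space reserved for both code paths. */
struct lfs_head {
	union {
		struct {
			void *top;
			uintptr_t gen;	/* generation counter */
		} dw;			/* pair updated by double-word CAS */
		struct {
			void *top;
			pthread_mutex_t lock;
		} fallback;		/* mutex-protected pop */
	} u;
} __attribute__((aligned(16)));		/* cmpxchg16b needs 16-byte alignment */

enum lfs_impl { LFS_IMPL_CMPXCHG16B, LFS_IMPL_FALLBACK };

static bool lfs_cpu_has_cmpxchg16b(void)
{
#if defined(__x86_64__)
	unsigned int eax, ebx, ecx, edx;

	if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
		return false;
	return (ecx & LFS_CX16_BIT) != 0;
#else
	return false;	/* other arches always take the fallback here */
#endif
}

/* Meant to be called once at init; the result selects the code path. */
static enum lfs_impl lfs_select_impl(void)
{
#ifdef LFS_NO_FALLBACK	/* hypothetical flag: compactness over portability */
	return LFS_IMPL_CMPXCHG16B;
#else
	return lfs_cpu_has_cmpxchg16b() ? LFS_IMPL_CMPXCHG16B
					: LFS_IMPL_FALLBACK;
#endif
}
```

The union costs the size of the larger member on every head, which is the compactness trade-off the compile-time flag would address.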
> > > My intent is that if we start doing
> > > optimizations for some architectures, the APIs can still be used
> > > as is by applications ported to other architectures (modulo a
> > > performance penalty if unavoidable).
> >
> > Understandable. I wonder if regular cmpxchg with pointer-packing
> > for the generation counter works. I'll have to try that.
>
> Not sure I understand your idea here.
The lock-free pop in ck_stack relies on an extra generation counter
field to avoid the ABA problem. I could pack that counter into the
normal head pointer of the stack, since current x86-64 only uses a
48-bit address space: that leaves 16 unused high bits, plus the low
bits freed by alignment, for 20 counter bits in total. I doubt a
counter that small could be trusted on systems with many
cores/threads, though, since it wraps quickly.
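
Roughly what I have in mind, as an untested sketch rather than anything from ck_stack or URCU. It assumes 48-bit addresses and 16-byte-aligned nodes (4 free low bits), and all the pk_* names are made up. It also assumes node memory is never unmapped while a pop is in flight, since pop reads n->next before the cmpxchg.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define PK_ALIGN_BITS	4	/* assumed: nodes are 16-byte aligned */
#define PK_ADDR_BITS	48	/* assumed: 48-bit virtual addresses */
#define PK_GEN_BITS	(64 - PK_ADDR_BITS + PK_ALIGN_BITS)	/* 20 */
#define PK_GEN_MASK	((UINT64_C(1) << PK_GEN_BITS) - 1)

struct pk_node {
	struct pk_node *next;
} __attribute__((aligned(16)));

/* Layout: 44 significant address bits on top, 20-bit counter below. */
static inline uint64_t pk_pack(struct pk_node *n, uint64_t gen)
{
	uint64_t addr = (uint64_t)(uintptr_t)n;

	assert((addr & ((UINT64_C(1) << PK_ALIGN_BITS) - 1)) == 0);
	assert((addr >> PK_ADDR_BITS) == 0);
	return ((addr >> PK_ALIGN_BITS) << PK_GEN_BITS) | (gen & PK_GEN_MASK);
}

static inline struct pk_node *pk_ptr(uint64_t packed)
{
	return (struct pk_node *)(uintptr_t)
		((packed >> PK_GEN_BITS) << PK_ALIGN_BITS);
}

static inline uint64_t pk_gen(uint64_t packed)
{
	return packed & PK_GEN_MASK;
}

static void pk_push(_Atomic uint64_t *head, struct pk_node *n)
{
	uint64_t old = atomic_load(head);

	do {
		n->next = pk_ptr(old);
	} while (!atomic_compare_exchange_weak(head, &old,
					       pk_pack(n, pk_gen(old))));
}

static struct pk_node *pk_pop(_Atomic uint64_t *head)
{
	uint64_t old = atomic_load(head);

	for (;;) {
		struct pk_node *n = pk_ptr(old);

		if (!n)
			return NULL;
		/* Bump the counter on every pop so a recycled node is noticed. */
		if (atomic_compare_exchange_weak(head, &old,
				pk_pack(n->next, pk_gen(old) + 1)))
			return n;
	}
}
```

With 20 bits the counter wraps after 1,048,576 pops, so a thread stalled across exactly that many pops could still see a stale head as current. That is unlikely but not impossible, and gets more plausible as the pop rate grows.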
On the other hand, my choice of a lock-free stack already wastes cycles
spinning when the stack is empty; so I might choose something else
entirely.
Thanks for your responses!