[ltt-dev] LTTng-UST vs SystemTap userspace tracing benchmarks

Wed Feb 16 15:55:45 EST 2011

On Wed, Feb 16, 2011 at 6:50 PM, Roland McGrath <roland at redhat.com> wrote:
> Stefan was referring to #4 in your taxonomy.
>
> It's indeed the case that what UST uses today is an always-there normal
> C code sequence that loads global variables to decide whether to make
> indirect function calls.  I don't recall off hand how many layers of
> function calls to the libust DSO and such there are in either the
> disabled or enabled cases.  At best, there is always the overhead of
> several instructions and at least one load in the hot code path, and the
> i-cache pollution that goes with that.
>
> It's indeed the case that what Systemtap uses today is a
> sometimes-inserted normal breakpoint instruction, which is indeed a
> software interrupt that requires kernel mediation.  When disabled, there
> is as close to zero overhead as you can have, being a tiny placeholder
> instruction sequence (currently just one nop), so the runtime overhead
> is under a cycle and the i-cache pollution is the smallest possible unit
> (one instruction, being just one byte on x86).
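
To make the first approach concrete, here is a minimal C sketch of a
flag-guarded probe site of the kind Roland describes for UST.  All names
(probe_enabled, probe_handler, do_work, and so on) are hypothetical and
are not the libust API; this only illustrates why the disabled case
still costs a load and a branch in the hot path.

```c
#include <stddef.h>

/* Hypothetical names throughout; this is NOT the libust API. */
typedef void (*probe_fn)(int value);

/* Global state that the instrumented site loads on every pass. */
static volatile int probe_enabled = 0;
static probe_fn probe_handler = NULL;

/* Counter updated by the example handler below. */
static int events_seen = 0;

/* Example handler: just counts how often the probe fired. */
static void count_event(int value)
{
    (void)value;
    events_seen++;
}

/* Instrumented function.  The load, compare and branch execute even
 * when tracing is off; only the indirect call is skipped. */
static int do_work(int x)
{
    if (probe_enabled && probe_handler != NULL)
        probe_handler(x);   /* indirect call, taken only when enabled */
    return x * 2;
}

/* Attach a handler and turn the probe on. */
static void enable_probe(probe_fn fn)
{
    probe_handler = fn;
    probe_enabled = 1;
}

/* Turn the probe off; the check in do_work() remains in place. */
static void disable_probe(void)
{
    probe_enabled = 0;
}
```

Calling enable_probe(count_event) makes each do_work() call bump
events_seen; after disable_probe() the counter stops changing, but the
check itself still runs.  The breakpoint approach avoids that check by
leaving only a nop at the site until a probe is inserted.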

Thanks for the explanations everyone.

I remember that DTrace also uses the software breakpoint method for
userspace probes.  I think the key reason they chose this method is
that it is the least invasive and does not require target process
cooperation.

Stefan