[ltt-dev] LTTng-UST vs SystemTap userspace tracing benchmarks
Mathieu Desnoyers
mathieu.desnoyers at efficios.com
Wed Feb 16 16:05:13 EST 2011
* Stefan Hajnoczi (stefanha at gmail.com) wrote:
> On Wed, Feb 16, 2011 at 6:50 PM, Roland McGrath <roland at redhat.com> wrote:
> > Stefan was referring to #4 in your taxonomy.
> >
> > It's indeed the case that what UST uses today is an always-there normal
> > C code sequence that loads global variables to decide whether to make
> > indirect function calls. I don't recall off hand how many layers of
> > function calls to the libust DSO and such there are in either the
> > disabled or enabled cases. At best, there is always the overhead of
> > several instructions and at least one load in the hot code path, and the
> > i-cache pollution that goes with that.
> >
> > It's indeed the case that what Systemtap uses today is a
> > sometimes-inserted normal breakpoint instruction, which is indeed a
> > software interrupt that requires kernel mediation. When disabled, there
> > is as close to zero overhead as you can have, being a tiny placeholder
> > instruction sequence (currently just one nop), so the runtime overhead
> > is under a cycle and the i-cache pollution is the smallest possible unit
> > (one instruction, being just one byte on x86).
>
> Thanks for the explanations everyone.
>
> I remember that DTrace also uses the software breakpoint method for
> userspace probes. I think the key reason they chose this method is
> that it is the least invasive and does not require target process
> cooperation.
Yeah, but it's slow. :)
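For contrast, the always-present check Roland describes above for UST comes
down to roughly the following. This is only a sketch to illustrate the cost
model: the names (struct marker, request_marker, handle_request, count_probe)
are made up for the example and are not the libust API.

```c
/* Hypothetical marker layout for illustration; not the libust API. */
typedef void (*probe_fn)(int value);

struct marker {
    int enabled;     /* global state, loaded on every pass through the site */
    probe_fn probe;  /* called indirectly, only when the marker is enabled */
};

static int probe_hits;

/* Example probe: just counts how often it was called. */
static void count_probe(int value)
{
    (void)value;
    probe_hits++;
}

static struct marker request_marker = { 0, count_probe };

/*
 * Instrumented hot path. When the marker is disabled, the cost is one
 * load and one branch. When enabled, add an indirect function call.
 */
static int handle_request(int value)
{
    if (request_marker.enabled)
        request_marker.probe(value);
    return value * 2;
}
```

Enabling the site is just a store to request_marker.enabled, with no kernel
involvement, which is where the speed advantage over a breakpoint comes from.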
By the way, if the target process stops cooperating at some point (e.g. it
crashes), UST still keeps a handle on the shared memory map, so we can still
extract the buffers.
Another future direction we're looking into for UST is to add the ability to do
dynamic probing in userspace with the equivalent of the "fast tracepoints"
currently available in gdb and in the kernel: whenever possible, replace
instructions with a jump rather than a breakpoint to dynamically instrument
applications.
So with LD_PRELOAD and dynamic instrumentation, we won't require much
collaboration from the application.
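To illustrate why the jump replacement is only "whenever possible": on x86, a
breakpoint (int3) is a single byte, while a near jump with a 32-bit
displacement takes 5 bytes and can only reach targets within +/-2 GiB, so the
probe site needs enough room and the trampoline needs to be close enough.
Here is a rough sketch of just the encoding step. The function name and
constants are made up for the example; actually patching live code safely
involves much more than this.

```c
#include <stdint.h>

#define JMP_REL32_LEN 5   /* opcode 0xE9 followed by a 4-byte displacement */

/*
 * Encode "jmp rel32" from address `site` to address `target` into `out`.
 * Hypothetical helper for illustration only.
 * Returns 0 on success, -1 if the target is out of rel32 range.
 */
static int encode_jmp_rel32(uint64_t site, uint64_t target,
                            uint8_t out[JMP_REL32_LEN])
{
    /* The displacement is relative to the end of the jump instruction. */
    int64_t disp = (int64_t)target - (int64_t)site - JMP_REL32_LEN;
    uint32_t d;

    if (disp < INT32_MIN || disp > INT32_MAX)
        return -1;

    d = (uint32_t)(int32_t)disp;
    out[0] = 0xE9;                      /* jmp rel32 opcode */
    out[1] = (uint8_t)(d & 0xff);       /* displacement, little-endian */
    out[2] = (uint8_t)((d >> 8) & 0xff);
    out[3] = (uint8_t)((d >> 16) & 0xff);
    out[4] = (uint8_t)((d >> 24) & 0xff);
    return 0;
}
```

When the encoding fails, or the site has fewer than 5 bytes that can be
replaced safely, falling back to a breakpoint is the obvious option.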
Thanks,
Mathieu
--
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com