[lttng-dev] Userspace Tracing and Backtraces

Tue Mar 10 21:47:44 EDT 2015

----- Original Message -----

> From: "Brian Robbins" <brianrob at microsoft.com>
> To: "Francis Giraldeau" <francis.giraldeau at gmail.com>
> Cc: lttng-dev at lists.lttng.org
> Sent: Monday, March 9, 2015 7:17:51 PM
> Subject: Re: [lttng-dev] Userspace Tracing and Backtraces

> Thanks Francis.

> This is what I expected to have to do. I do agree though that adding this to
> lttng-ust would be a good way to go.

> Should we end up on this path, it certainly seems like it might be worth our
> time to investigate what it would take to add it to lttng-ust. Do you know
> who is the right person to talk to about this? I’d want to make sure that
> this would not be a non-starter.

Hi! 

I'm the maintainer of LTTng-UST. I agree that adding a backtrace context 
to lttng-ust would be a very useful feature. 

Some comments inline below, 

> Thanks.

> -Brian

> From: Francis Giraldeau [mailto:francis.giraldeau at gmail.com]
> Sent: Monday, March 9, 2015 12:39 PM
> To: Brian Robbins
> Cc: lttng-dev at lists.lttng.org
> Subject: Re: [lttng-dev] Userspace Tracing and Backtraces

> 2015-03-06 13:51 GMT-05:00 Brian Robbins < brianrob at microsoft.com >:
> > Thanks Francis.
> 

> > Is it accurate to say then that the array of addresses would need to be
> > captured by app code by writing a stack walker by hand
> 

> Yes, the callstack can be recorded in userspace. You would need a tracepoint
> with a varying length field:

> TRACEPOINT_EVENT(myprovider, callstack,
>     TP_ARGS(unsigned long *, buf, size_t, depth),
>     TP_FIELDS(
>         ctf_sequence(unsigned long, addr, buf, size_t, depth)
>     )
> )

> In the app code, use libunwind [1] to get the addresses, then call the
> tracepoint:

> do_unwind() // use libunwind here
> tracepoint(myprovider, callstack, buf, depth);

> However, the unwind will be done whether or not the tracepoint is active
> (~10us-100us in steady state, so it may become expensive if called often).
> I know there was discussion about tp_code() for such a use case (some code to
> call before the tracepoint only if it is enabled). Or you can cheat:

Francis: Did you define UNW_LOCAL_ONLY before including
the libunwind header in your benchmarks? (see
http://www.nongnu.org/libunwind/man/libunwind%283%29.html )

This seems to change performance dramatically according to the documentation.

> if (__builtin_expect(!!(__tracepoint_myprovider___callstack.state), 0)) {
>     do_unwind(...);
>     tracepoint(myprovider, callstack, buf, depth);
> }

> That said, instead of having a callstack tracepoint, IMHO the best solution
> would be to extend lttng-ust with a callstack event context (itself
> linked to libunwind). Recording the callstack would then be as simple as:

> $ lttng add-context -u -t callstack

Agreed on having the backtrace as a context. The main question left is 
to figure out if we want to call libunwind from within the traced application 
execution context. 

Unfortunately, libunwind is not reentrant with respect to signals. This is
already a good argument for not calling it from within a tracepoint. I wonder
if the authors of libunwind would be open to making it signal-reentrant
in the future (not by disabling signals, but rather, for performance
reasons, by keeping a TLS nesting counter and returning an error if nested).
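
To illustrate the nesting-counter idea, here is a minimal sketch. It is not
libunwind code: guarded_unwind(), walker_fn and the two demo walkers are
hypothetical names made up for this example, and the real unwinder is
replaced by a placeholder. It only shows how a per-thread counter lets a
nested call (for instance from a signal handler that interrupts an unwind
on the same thread) fail fast with an error instead of re-entering.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical walker signature: returns the number of frames stored. */
typedef int (*walker_fn)(unsigned long *buf, size_t max_depth);

/*
 * Per-thread nesting flag. It is volatile so the compiler does not move
 * the updates across the walker call. A signal handler running on the
 * same thread sees the same thread-local variable.
 */
static _Thread_local volatile int unwind_nesting;

/*
 * Run the walker unless this thread is already inside guarded_unwind().
 * Returns the walker's result, or -EBUSY when nested.
 */
static int guarded_unwind(walker_fn walker, unsigned long *buf,
                          size_t max_depth)
{
    int ret;

    if (unwind_nesting != 0)
        return -EBUSY;      /* nested: refuse rather than block signals */
    unwind_nesting = 1;
    ret = walker(buf, max_depth);
    unwind_nesting = 0;
    return ret;
}

/* Demo walker: stores one placeholder value, not a real return address. */
static int leaf_walker(unsigned long *buf, size_t max_depth)
{
    if (max_depth == 0)
        return 0;
    buf[0] = 0x1000UL;
    return 1;
}

/* Demo walker simulating a signal handler hitting a tracepoint mid-unwind. */
static int reentrant_walker(unsigned long *buf, size_t max_depth)
{
    return guarded_unwind(leaf_walker, buf, max_depth);
}
```

Calling guarded_unwind(reentrant_walker, ...) returns -EBUSY from the inner
call, while a plain guarded_unwind(leaf_walker, ...) goes through. If a
signal lands between the check and the store, the handler runs its own
unwind to completion and leaves the flag at zero, so the outer call is not
disturbed.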

> > or using the perf capture mechanism that you describe below?
> 

> Perf is peeking at the userspace from kernel space, it's another story. I
> guess that libunwind was not ported to the kernel because it is a large
> chunk of complicated code that performs a lot of I/O and computation, while
> copying a portion of the stack is really about KISS and low runtime
> overhead.

If using libunwind does not work out, another alternative I would consider
would be to copy the stack like perf is doing from the kernel. However,
in the spirit of compacting trace data, I would be tempted to do the following
if we go down that route: check the content of each pointer-aligned stack slot.
If it looks like a pointer to an executable memory area (library, executable, or
JIT'd code), we keep it. Otherwise, we zero the slot (its content is not needed).
We can then apply RLE-like compression to the zeroes, so the layout of the
stack can be restored after decompression.
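
A rough sketch of that filter-then-compress step follows. Everything in it
is an assumption made for illustration: struct exec_range, looks_like_code()
and compress_stack() are hypothetical names, the executable mappings are
passed in as a plain array rather than looked up, and the output encoding
(a zero word followed by a run length) is just one possible RLE scheme. It
also assumes no executable mapping contains address 0.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one executable mapping [start, end). */
struct exec_range {
    uintptr_t start;
    uintptr_t end;
};

/* Return 1 if word falls inside one of the executable ranges. */
static int looks_like_code(uintptr_t word, const struct exec_range *ranges,
                           size_t nr_ranges)
{
    for (size_t i = 0; i < nr_ranges; i++)
        if (word >= ranges[i].start && word < ranges[i].end)
            return 1;
    return 0;
}

/*
 * Filter and compress a copied stack.
 * Output: a nonzero word is a kept code pointer; a zero word is followed
 * by the number of consecutive discarded slots it stands for.
 * Returns the number of words written, or SIZE_MAX if out is too small.
 */
static size_t compress_stack(const uintptr_t *stack, size_t nr_words,
                             const struct exec_range *ranges,
                             size_t nr_ranges,
                             uintptr_t *out, size_t out_cap)
{
    size_t n = 0, i = 0;

    while (i < nr_words) {
        if (looks_like_code(stack[i], ranges, nr_ranges)) {
            if (n + 1 > out_cap)
                return SIZE_MAX;
            out[n++] = stack[i++];      /* keep candidate return address */
            continue;
        }
        /* Count the run of slots that do not point into code. */
        size_t run = 0;
        while (i < nr_words &&
               !looks_like_code(stack[i], ranges, nr_ranges)) {
            run++;
            i++;
        }
        if (n + 2 > out_cap)
            return SIZE_MAX;
        out[n++] = 0;                   /* marker for a zeroed run */
        out[n++] = (uintptr_t)run;      /* length of the run */
    }
    return n;
}
```

Because every discarded run is stored with its length, a reader can expand
the output back to the original slot positions and recover each kept
pointer's offset in the stack.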

Thoughts ? 

Thanks, 

Mathieu 

> Cheers,

> Francis


-- 
Mathieu Desnoyers 
EfficiOS Inc. 
http://www.efficios.com 