[lttng-dev] [RFC] perf sampling library for LTTng-UST

Milian Wolff milian.wolff at kdab.com
Fri Nov 18 16:36:34 UTC 2016

On Friday, September 23, 2016 2:43:32 PM CET Francis Giraldeau wrote:
> Hello!
> I did a small shared library for profiling code using perf sampling and
> LTTng-UST:
>   https://github.com/giraldeau/lttng-ust/tree/sampling/liblttng-ust-sampling
> It works by preloading the library when executing a program. Inside the
> library constructor, a perf counter is created and samples are saved inside
> the SIGIO handler using an LTTng-UST tracepoint. The call stack is obtained
> using libunwind, and thus it works even without frame pointers and with
> unmodified executables.
> Preliminary overhead measure with the default sampling period of 1E4 for cpu
> cycles is about 8.5%, or 2.2us per event. On my machine, about 38k samples
> per second are generated. This figure is obtained when compiling libunwind
> without signal re-entrance support.
> Genevieve Bastien did a nice view in TraceCompass to load this trace and
> display the corresponding call graph view.
> http://secretaire.dorsal.polymtl.ca/~gbastien/screenshots/lttng_sampling_callstack.png
> The counter is hard coded now, but it's just a prototype to demonstrate the
> concept. I would find it very cool to see such feature in LTTng. What do
> you think?

The image times out for me, i.e. I cannot load it.

I very much like the idea, as it would make it easy to combine LTTng and perf.
The performance impact is pretty bad though. Can't this be done differently,
such that you reuse whatever `perf record` uses internally? That has a far
smaller overhead, even when using libunwind/libdw for DWARF-based unwinding
(--call-graph dwarf).
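
As a rough sanity check on the figures quoted above, here is a back-of-the-envelope sketch. It only multiplies the two numbers from Francis' mail (about 38k samples per second, 2.2us per event); nothing in it is measured, and the variable names are made up for illustration:

```python
# Figures taken from the quoted mail; nothing here is measured.
samples_per_second = 38_000   # "about 38k samples per second"
cost_per_sample_s = 2.2e-6    # "2.2us per event"

# Fraction of each second spent inside the sampling handler.
overhead_fraction = samples_per_second * cost_per_sample_s

print(f"estimated overhead: {overhead_fraction:.1%}")
```

The product comes out at roughly 8.4%, which is consistent with the "about 8.5%" reported. So at this sampling rate the overhead is dominated by the per-sample cost, and lowering either the rate or the cost per sample should reduce it proportionally.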

In general, I see perf being really good for profiling, whereas LTTng seems to
be really good for tracing. Each supports the other side to some extent, but
not as nicely. I would welcome it if the two projects started to collaborate
more.

From my POV, I'd like to:

- trace most of the kernel stuff, most notably scheduler, page faults, 
syscalls, ...
- trace all UST points
- sample CPU

The latter two are usually needed only for a single process, but sometimes for
multiple ones. LTTng gives me the first two points, and perf gives me the last.


Milian Wolff | milian.wolff at kdab.com | Software Engineer
KDAB (Deutschland) GmbH&Co KG, a KDAB Group company
Tel: +49-30-521325470
KDAB - The Qt Experts