[lttng-dev] Adding a simple "look here" event to the trace
Erica Bugden
ebugden at efficios.com
Wed Sep 27 07:54:02 EDT 2023
On 2023-09-08 06:56, Danter, Richard via lttng-dev wrote:
> Hi all,
>
> I am investigating an issue that takes some time to reproduce. Finding
> the right point in the logs is therefore very difficult.
>
> Since I can detect when the issue happens in the kernel I would like to
> be able to emit an event into the trace that I can then search for in
> Trace Compass or through Babeltrace. So basically a kind of flag that
> says "look here". That way I can jump right to the problem and then
> look backwards from there to see what happened just before.
>
> I have looked at the docs for how to add a trace point, but it seems
> pretty complicated. I may have missed something though, so I wonder if
> there is a trivial way to add such a flag to the log? Up to now I just
> put a printk() in which helps, but would still be nicer to have
> something directly in the log.
Hello Rich,
This is a good question! The easiest way to point directly to the
relevant part of a trace is to stop capturing trace data immediately
after the identified issue is encountered. This means you know what
you're looking for is right at the end of the trace. Stopping the trace
seems like a good fit in this scenario because you're only interested in
what happens immediately before the issue and you're able to identify
when the problem has happened.
Assuming you would like to avoid modifying the kernel code, LTTng
triggers [1] may be a good fit. Triggers allow you to associate a
condition (e.g. event X happened) with an action you would like to take
(e.g. stop tracing). When the condition is encountered, the associated
action is automatically triggered.
In this scenario we would recommend:
1. Trace in overwrite mode (flight recorder mode): Since the issue
takes a while to reproduce and only the events immediately preceding the
issue are relevant, keeping just a limited amount of the most recent
data avoids accumulating a large volume of irrelevant data.
2. Detect when the issue is encountered with a trigger: This will
focus the trace on the problem area.
3. When the issue is encountered, take a snapshot: This will give you
a trace that contains what is relevant. What happened immediately before
the trigger will be at the end of the trace.
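As a rough sketch of the three steps above (not tested against your
setup): the session name, the trigger name, and the sched_switch event
are placeholders I made up, and the run helper only prints each command
unless DRY_RUN=0 is set. The lttng options are the ones I understand
from the 2.13 man pages, so please check them against [2] before relying
on them.

```shell
#!/bin/sh
# Sketch only. SESSION, EVENT and the trigger name are hypothetical.
SESSION=look-here
EVENT=sched_switch   # placeholder: pick the kernel event closest to your issue

# Print each command by default; set DRY_RUN=0 to really run it
# (that needs root, the LTTng kernel modules and a running session daemon).
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

setup_flight_recorder() {
    # Step 1: a snapshot-mode session, whose channels default to overwrite mode.
    run lttng create "$SESSION" --snapshot
    run lttng enable-event --kernel --all --session="$SESSION"
    run lttng start "$SESSION"
}

add_snapshot_trigger() {
    # Steps 2 and 3: when EVENT is recorded, take a snapshot of the session.
    # The rate policy (assumed from the man page) limits this to one snapshot.
    run lttng add-trigger --name=look-here-snapshot \
        --condition=event-rule-matches \
        --type=kernel:tracepoint --name="$EVENT" \
        --action=snapshot-session "$SESSION" \
        --rate-policy=once-after:1
}

setup_flight_recorder
add_snapshot_trigger
```

Run as is, this only prints the four commands so you can review them.
With DRY_RUN=0 the snapshot should land in the session's default
snapshot output directory.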
In terms of defining the trigger condition, you can add a trigger [2]
that matches a kernel event type that occurs as soon as possible after
the issue is encountered, and then specify additional details for the
condition using the capture descriptor [3]. Ideally, you want a
condition that is only true when the issue is encountered, so that you
do not have to sort through the snapshots manually afterwards. The
lttng-add-trigger man page provides several examples [4] that illustrate
the condition and action syntax.
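To give an idea of what a narrower condition might look like, here is a
hedged sketch. The program name "myapp", the session name, and the
trigger name are invented for illustration, and I am assuming the
sched_switch event with its next_comm and next_tid fields is a useful
stand-in for your real event. The filter restricts when the condition is
true, and the capture descriptor attaches a field value to the
notification. As before, the helper prints the command instead of
running it unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only. "myapp", the session name and the trigger name are hypothetical.
SESSION=look-here

# Print the command by default; set DRY_RUN=0 to really run it.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

add_narrow_trigger() {
    # Condition: a sched_switch event whose next task is "myapp".
    # --capture records next_tid so it is delivered with the notification.
    run lttng add-trigger --name=look-here-narrow \
        --condition=event-rule-matches \
        --type=kernel:tracepoint --name=sched_switch \
        --filter='next_comm == "myapp"' \
        --capture=next_tid \
        --action=notify \
        --action=snapshot-session "$SESSION"
}

add_narrow_trigger
```

The tighter the filter, the fewer snapshots you should end up with to
look through.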
Hope this helps!
Best,
Erica
[1] LTTng triggers - https://lttng.org/docs/v2.13/#doc-trigger
[2] Add trigger - https://lttng.org/man/1/lttng-add-trigger/v2.13/
[3] Trigger capture descriptor -
https://lttng.org/man/1/lttng-add-trigger/v2.13/#doc-capture-descr
[4] Trigger examples -
https://lttng.org/man/1/lttng-add-trigger/v2.13/#doc-examples
>
> If there isn't such a thing already, then would it be a reasonable
> enhancement request to be able to add such a feature?
>
> Thanks
> Rich