[lttng-dev] Fast way to clear trace logs and continue tracing?

Douglas Graham douglas.graham at ericsson.com
Wed Dec 14 19:47:57 UTC 2016


Thanks for your reply Jonathan,

I don't remember ever having this problem before.  I posted my question
to lttng-dev without being subscribed, so I did not get your reply in an email.
I'm pasting it from the archives into an email this time, but I've just sent a
Subscription request so next time hopefully I won't have this problem.

>> Suppose I have configured and started a session and have collected
>> a bunch of traces in a trace file.  Now I want to clear those
>> traces but then continue saving new traces using the same session.
>> Is there a reasonably fast way to do that?
>
> Normally the typical use case is the other way around, you work in
> flight-recorder mode (not touching disk or network) and take "snapshot"
> when needed.
>
> Could you explain what you want to achieve? What is the bigger picture?

We are using lttng as the backend to a legacy logging system that
previously ran on an RTOS. Application code generates logs by calling
routines that now map to lttng tracepoints. The commands that have
historically been used to view and clear these logs, and which we want
to remain the same, are:

   log read
   log clear

It's that "log clear" command that is problematic with lttng.
It gets used a lot in our automated testing where we want to view
only the logs generated by a particular test. eg:

 while have more tests to run
   run test
   log read  # the test engine captures the output from this command
   log clear

If we can't clear the logs between tests, then we have to dump
the entire huge logfile after each test, and that can be slow and
can cause other issues as well.

>> What we currently do is "lttng save --output-path=x.cfg; lttng stop;
>>        lttng destroy; rm -rf <tracedir>; lttng load --input-path=x.cfg

> Are you saving/loading multiple sessions ? Probably not since you use
> "lttng stop" and "lttng destroy directly".

All of these operations are performed on only one session,
"xcs_trace". The session name is passed to the save, stop, and
destroy commands.  I left if off to try to keep things brief (which
might have been a bad idea). The contents of the xcs_trace.lttng
file generated by the save command are at the end of this email.

> You systematically remove the traces ? When does this get executed
> (cron, user action, trace size)?
> When does the trace data become important? When do you want to keep the
> trace data?

Hopefully, I've answered this above.  Our test engine use "log read" and "log clear"
a lot, but these are also frequently used manually.  For example, a developer might
boot up the system, which generates a lot of logs, but then clear those bootup logs
so that any subsequent logs generated by manual testing are easier to get to without
scrolling through thousands of lines of bootup logs.

> Is it a requirement that the new session have the same name and/or the
> same output path?

Hmmm. I think the only real requirement is that any newly generated
logs become visible via subsequent "log read" commands.  If we
can do that with a different session name and/or output path,
that would probably be sufficient.

>> But this is very slow in some cases.  The lttng load in particular is
>> extremely slow when Linux is running inside a virtual machine.
>
> I will test that. Not sure what would cause the load command to be slow
> inside a vm particularly.
> The " lttng stop" and "rm -rf" are most probably taking more time than
> the load command.

It's definitely the load command that is the killer in our
environment.  I'd send you output from the time command, except that
time inside our VM is pretty much meaningless because it's an event
driven simulation where time runs fast when there is little to do
and slow when there is lots to do.  This is an ARM instruction set
simulator that does run fairly slowly on our x86_64 machines, but the
"log clear" command stands out as running considerably more slowly
than other commands, especially given that conceptually at least,
it's not doing very much.  Overall, our existing implementation of
"log clear" takes about 25 seconds, and 20 seconds of that (give
or take) is from the lttng load command.

> Could you give us more information regarding the configuration you are
> loading? Number of channels, enabled events, type of session (live, to
> disk, streaming).

I hope the answers are all contained within the xcs_trace.lttng file included below.

> So far I do not see any dumb questions. :P

That's a relief :-)

Here's the contents of xcs_trace.lttng, the file saved by the "lttng save"
command:

======== begin xcs_trace.lttng ========
<?xml version="1.0" encoding="UTF-8"?>
<sessions>
    <session>
        <name>xcs_trace</name>
        <shared_memory_path>/var/log/lttng/shm/current/xcs_trace-20000101-000000</shared_memory_path>
        <domains>
            <domain>
                <type>UST</type>
                <buffer_type>PER_UID</buffer_type>
                <channels>
                    <channel>
                        <name>ch1</name>
                        <enabled>true</enabled>
                        <overwrite_mode>OVERWRITE</overwrite_mode>
                        <subbuffer_size>131072</subbuffer_size>
                        <subbuffer_count>16</subbuffer_count>
                        <switch_timer_interval>0</switch_timer_interval>
                        <read_timer_interval>0</read_timer_interval>
                        <output_type>MMAP</output_type>
                        <tracefile_size>0</tracefile_size>
                        <tracefile_count>0</tracefile_count>
                        <live_timer_interval>0</live_timer_interval>
                        <events>
                            <event>
                                <name>*</name>
                                <enabled>true</enabled>
                                <type>TRACEPOINT</type>
                                <loglevel_type>RANGE</loglevel_type>
                                <loglevel>6</loglevel>
                            </event>
                            <event>
                                <name>com_ericsson_tri*</name>
                                <enabled>true</enabled>
                                <type>TRACEPOINT</type>
                                <loglevel_type>RANGE</loglevel_type>
                                <loglevel>14</loglevel>
                            </event>
                        </events>
                        <contexts/>
                    </channel>
                </channels>
                <trackers/>
            </domain>
            <domain>
                <type>JUL</type>
                <buffer_type>PER_UID</buffer_type>
                <channels/>
            </domain>
            <domain>
                <type>LOG4J</type>
                <buffer_type>PER_UID</buffer_type>
                <channels/>
            </domain>
            <domain>
                <type>PYTHON</type>
                <buffer_type>PER_UID</buffer_type>
                <channels/>
            </domain>
        </domains>
        <started>true</started>
        <attributes>
            <snapshot_mode>true</snapshot_mode>
        </attributes>
        <output>
            <snapshot_outputs>
                <output>
                    <name>snapshot-1</name>
                    <max_size>0</max_size>
                    <consumer_output>
                        <enabled>true</enabled>
                        <destination>
                            <path>/pramfs/log/lttng/snapshot</path>
                        </destination>
                    </consumer_output>
                </output>
            </snapshot_outputs>
        </output>
    </session>
</sessions>
========== end xcs_trace.lttng ==============

Regards,
Doug




More information about the lttng-dev mailing list