[lttng-dev] Babeltrace performance issue in live-reading mode

liguang li liguang.lee6 at gmail.com
Thu Sep 21 09:34:34 UTC 2017


Hi,

On Wed, Sep 20, 2017 at 10:58 PM, Jonathan Rajotte Julien <Jonathan.rajotte-julien at efficios.com> wrote:

> Hi,
>
> On 2017-09-20 05:12 AM, liguang li wrote:
> >
> >
> > On Tue, Sep 19, 2017 at 10:57 PM, Jonathan Rajotte-Julien <jonathan.rajotte-julien at efficios.com> wrote:
> >
> >     On Tue, Sep 19, 2017 at 03:53:27PM +0800, liguang li wrote:
> >     >    Hi,
> >     >    On Mon, Sep 18, 2017 at 11:18 PM, Jonathan Rajotte-Julien <jonathan.rajotte-julien at efficios.com> wrote:
> >     >
> >     >      Hi,
> >     >      On Mon, Sep 18, 2017 at 11:32:07AM +0800, liguang li wrote:
> >     >      >    Hi,
> >     >      >
> >     >      >    Create a session in live-reading mode, run an application
> >     >      >    with a very high event throughput, then print the events
> >     >      >    with babeltrace. We found that the live trace viewer is
> >     >      >    showing events from a few seconds ago, and as time
> >     >
> >     >      Could you provide us the versions used for babeltrace,
> >     >      lttng-tools, and lttng-ust?
> >     >
> >     >     Babeltrace: 1.5.1
> >
> >     Update to babeltrace 1.5.3.
> >
> >     >     Lttng-tools: 2.8.6
> >
> >     Update to lttng-tools 2.8.8
> >
> >     >     Lttng-ust: 2.8.2
> >     >
> >     >      >    goes on, the delay gets bigger and bigger.
> >     >
> >     >      A similar issue was observed a couple of months ago, which
> >     >      implicated multiple delayed-ack problems during communication
> >     >      between lttng-relayd and babeltrace.
> >     >
> >     >      The following fixes were merged:
> >     >
> >     >      [1] https://github.com/lttng/lttng-tools/commit/b6025e9476332b75eb8184345c3eb3e924780088
> >     >      [2] https://github.com/efficios/babeltrace/commit/de417d04317ca3bc30f59685a9d19de670e4b11d
> >     >      [3] https://github.com/efficios/babeltrace/commit/4594dbd8f7c2af2446a3e310bee74ba4a2e9d648
> >     >
> >     >      In the event that you are already using an updated version of
> >     >      babeltrace and lttng-tools, it would be pertinent to provide us
> >     >      with a simple reproducer so we can assess the issue.
> >
> >     The version you are using does not include the mentioned fixes.
> >     Please update and redo your experiment.
> >
> >
> > We tested with the versions you listed, and the issue still exists.
>
> Given that previous versions had a major timing problem there, I would
> expect to see some improvement.
>
> In that case, we will need a lot more information on your benchmarking
> strategy. We will need a simple reproducer, your benchmark code (R, gnuplot,
> etc.), and your overall methodology to be able to reproduce the issue
> locally. Otherwise, it will be very hard to come to any conclusion.
>
>
Sorry, we cannot provide detailed information about the benchmark code due to
company regulations.


> >
> >
> >     Cheers
> >
> >     >
> >     >
> >     >     Steps:
> >     >     lttng create session --live -U net://*
> >     >     lttng enable-channel -s session -u ch1
> >     >     lttng enable-event -s session -c ch1 -u -a
> >     >     lttng start
> >     >
> >     >     Run a high-event-throughput application, which is a
> >     >     multithreaded application.
> >
> > In the multithreaded application, each tracepoint carries the wall-clock
> > time of the system, so we can easily reproduce this issue by comparing the
> > time of the recorded event with the system wall-clock time.
> >
> >
> >     >     babeltrace -i lttng-live net://*
> >     >
> >     >     After a while, we found that the timestamp of the event in
> >     >     babeltrace is different from the time on the host which runs the
> >     >     application. And the delay gets bigger and bigger as time goes on.
> >     >
> >     >
> >     >      Cheers
> >     >      >    I checked the source code and found that Babeltrace in
> >     >      >    live-reading mode will read the recorded events from the
> >     >      >    CTF files, and then parse and print them in a single
> >     >      >    thread. The process is a little slow; do you have any
> >     >      >    ideas to improve it?
> >
> >
> > From my understanding of the source code, the process of parsing and
> > printing events will consume a lot of time. For example, the multithreaded
> > application will consume 3 CPUs; in a given time window, 3 sub-buffers will
> > be filled, sent to the lttng-relayd daemon, and recorded into the CTF
> > files. If, in the same time window, babeltrace only handles 2 sub-buffers'
> > worth of events, then the issue will happen.
>
> Did you perform a bisection of where the delay comes from? Reception of
> packets? Formatting of events?
> What is the throughput of the application?
> How many tracepoint definitions?
> Does babeltrace catch up if a quiescent period is given?
>

I think so: after the application stops producing events, babeltrace does
eventually catch up.
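
A rough way to confirm the catch-up behaviour (just a sketch; the relayd URL
and the babeltrace 1.x --clock-date option are assumptions to adapt to your
setup, and "session" is the session name from the steps above):

  # Quiesce the workload, then watch whether the live viewer drains its backlog.
  lttng stop session          # or simply pause the traced application
  date                        # current wall-clock time, for comparison
  babeltrace --clock-date -i lttng-live net://relayd-host/host/tracer-host/session

If babeltrace catches up, the printed timestamps converge towards "now" and
the output then stalls until new events arrive.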


> Could you provide us with statistics, timing data, etc.?
>
> What type of delay are we talking about?
>
>
For example, the system wall-clock time is 5:18 PM now, but Babeltrace is
still printing events whose timestamps are around 5:10 PM. Then, after we
stop running the application, Babeltrace still needs some time to print the
recorded events.
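
To put a number on that lag, something like the following could work (a
sketch: it assumes babeltrace 1.x's --clock-seconds option, GNU awk for
systime(), a placeholder relayd URL, and NTP-synchronized clocks on both
hosts so the difference really is viewer lag):

  # For each event, print how far its timestamp trails the local wall clock.
  babeltrace --clock-seconds -i lttng-live net://relayd-host/host/tracer-host/session | \
    gawk '{ ts = $1; gsub(/[][]/, "", ts); printf "lag: %.3f s\n", systime() - ts }'

A steadily growing value matches the behaviour described above.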

I think the root cause is that the parsing and printing process is a bit
slow, so I want to know whether there is any method to improve the
performance of this process.
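
One way to narrow this down (a sketch; --names and --no-delta are
babeltrace 1.x options, and the trace path and relayd URL below are
placeholders) is to replay the already-recorded CTF files offline, which
measures the pure parse-and-format throughput with the network and the live
protocol out of the picture, and to trim the per-event text output when
reading live:

  # Offline replay of the recorded trace, output discarded:
  time babeltrace ~/lttng-traces/session-*/ > /dev/null

  # Live reading with lighter per-event formatting:
  babeltrace --names none --no-delta -i lttng-live net://relayd-host/host/tracer-host/session

If the offline replay cannot keep up with the event rate either, the
single-threaded parse/format path is indeed the limit; if it can, the delay
is more likely in the live protocol or the network path.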

Regards,
Liguang

> --
> Jonathan R. Julien
> Efficios
>

