[lttng-dev] Babeltrace performance issue in live-reading mode

Jonathan Rajotte-Julien jonathan.rajotte-julien at efficios.com
Wed Oct 11 15:34:26 UTC 2017


Hi,

> > > I think the root cause is that the parsing and printing process is a bit
> > > slow. So I want to know if there
> > > are any methods to improve the performance of this process.
> >
> > A simple experiment would be to run a bounded workload
> > to perform a quick timing evaluation of babeltrace with a streamed trace
> > (on disk) and in live mode.
> > Calculate the time (time babeltrace path_to_streamed_trace) it takes
> > babeltrace to read the on disk trace
> > and perform the same experiment with a live trace. It is important to
> > bound your experiment
> > and have mostly the same workload in both experiments.
> >
> >


> I made a simple example to illustrate the issue I encountered.
> 
> Run the application for about 60 seconds, the trace is stored on disk. The
Could you also provide the time taken by babeltrace to read the same trace (on
disk) with the -o dummy parameter?

e.g.: babeltrace -o dummy /path/to/trace

This will exclude the formatting and outputting delays.
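
The comparison can also be scripted so that both runs are timed the same way.
The following is a minimal sketch, not a definitive benchmark: the helper names
(time_command, compare_outputs) are hypothetical, babeltrace is assumed to be on
the PATH, and the text output is sent to /dev/null, so terminal rendering cost
is excluded while formatting cost is still included.

```python
import shutil
import subprocess
import sys
import time


def time_command(argv):
    """Run argv with its output discarded; return (elapsed_seconds, exit_code)."""
    start = time.perf_counter()
    proc = subprocess.run(
        argv, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    return time.perf_counter() - start, proc.returncode


def compare_outputs(trace_path, babeltrace="babeltrace"):
    """Time the default text output against '-o dummy' for the same trace.

    The difference is a rough estimate of the formatting and output cost.
    """
    text_s, _ = time_command([babeltrace, trace_path])
    dummy_s, _ = time_command([babeltrace, "-o", "dummy", trace_path])
    return {
        "text": text_s,
        "dummy": dummy_s,
        "formatting_cost": text_s - dummy_s,
    }


if __name__ == "__main__":
    # Only runs when a trace path is given and babeltrace is installed.
    if len(sys.argv) == 2 and shutil.which("babeltrace"):
        print(compare_outputs(sys.argv[1]))
```

Running each command a few times and keeping the best time reduces noise from
the page cache and other system activity.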

> time of reading the trace using babeltrace is about 150 seconds. The speed

This indicates that you are currently replicating a high-throughput scenario.
Keep in mind that, since babeltrace needs to serialize the events of a
multi-core system, there will always be some delay in the serialization step of
trace reading, be it locally or in live mode. What we are looking for here are
outliers to those delays.
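
To illustrate what "serializing" means here: events from several per-CPU
streams have to be interleaved into one stream ordered by timestamp. The sketch
below only shows the general idea with a k-way merge. It is not babeltrace's
implementation, and the (timestamp, name) tuples and the merge_streams name are
made up for the example.

```python
import heapq


def merge_streams(streams):
    """Merge per-CPU event lists, each already sorted by timestamp,
    into one list ordered by timestamp (events are (timestamp, name))."""
    return list(heapq.merge(*streams, key=lambda event: event[0]))


# Two hypothetical per-CPU streams.
cpu0 = [(1, "sched_switch"), (4, "sys_read")]
cpu1 = [(2, "sys_write"), (3, "sched_wakeup")]
merged = merge_streams([cpu0, cpu1])
```

The reader cannot emit an event until it knows that no other stream holds an
older one, which is one reason some delay is inherent to this step.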

> of processing is much lower than generating the trace.
> 
> I think the performance of babeltrace in live mode will not be greater than

It is not about what you think, it is about what is actually happening.
Give us more information on the experiment and provide us with your actual
tests so we can validate that they indeed correctly test the scenario at hand.
Then we can go forward from there.

> reading the trace on disk. So in live mode, the timestamps of the events will
> be delayed (events printed with timestamps from a few minutes ago). I would like to

You provide partial answers or no answers at all to most of the questions asked.
This is problematic and does not help at all in troubleshooting the issue.

Did you perform the same experiment in live mode?
How much time does it take in live mode to process the same trace/scenario?
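
One way to put a number on the live-mode delay is to compare each printed event
timestamp with the wall-clock time at which the line is read. The sketch below
assumes every line starts with a timestamp of the form [seconds.nanoseconds]
since the epoch (the --clock-seconds option of babeltrace 1.x should produce
this; check babeltrace --help), and that the clocks of the traced machine and of
the viewer are synchronized. The event_lag_seconds name is hypothetical.

```python
import re
import time

# Assumed line prefix: "[<seconds>.<9-digit nanoseconds>]".
_TIMESTAMP = re.compile(r"^\[(\d+)\.(\d{9})\]")


def event_lag_seconds(line, now=None):
    """Return wall-clock time minus the event timestamp, in seconds.

    Returns None when the line does not start with the assumed prefix.
    """
    match = _TIMESTAMP.match(line)
    if match is None:
        return None
    event_time = int(match.group(1)) + int(match.group(2)) / 1e9
    if now is None:
        now = time.time()
    return now - event_time
```

Feeding it the viewer's output line by line and logging the lag over the
duration of a bounded workload would show whether the delay stays constant or
keeps growing.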

> know if there are any methods to improve the performance of babeltrace in
> live mode.

The most useful thing you could do is profile, perform *concrete performance
analysis* and report your findings to the community so we can work together on
improving the situation.

The problem behind the previously mentioned fixes to lttng-live TCP
communication [1] was brought to us with a comprehensive technical report
including reproducers and metrics. We were more than happy to provide feedback,
time and our expertise to alleviate the problem reported.

If there were a "quick fix" that we were aware of, it would already be
implemented and merged. We do not have any incentive to keep such fixes to
ourselves.

I'm not saying that nothing can be done to improve the performance, but our
efforts at EfficiOS are focused on Babeltrace 2.0 for the time being. Hence we
need comprehensive data to pursue any performance-related investigation
regarding babeltrace 1.X.

Cheers

[1]
https://github.com/efficios/babeltrace/commit/de417d04317ca3bc30f59685a9d19de670e4b11d
https://github.com/efficios/babeltrace/commit/4594dbd8f7c2af2446a3e310bee74ba4a2e9d648

> 
> Regards,
> Liguang
> 
> 
> > This will give us a rough estimate of the disparity between both scenario.
> >
> > Cheers
> >

-- 
Jonathan Rajotte-Julien
EfficiOS

