<div dir="ltr"><div><div>Hi,<br><br></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Thu, Sep 21, 2017 at 11:03 PM, Jonathan Rajotte Julien <span dir="ltr"><<a href="mailto:Jonathan.rajotte-julien@efficios.com" target="_blank">Jonathan.rajotte-julien@efficios.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hi,<br>
<span class="gmail-"><br>
On 2017-09-21 05:34 AM, liguang li wrote:<br>
> Hi,<br>
><br>
</span><span class="gmail-">> On Wed, Sep 20, 2017 at 10:58 PM, Jonathan Rajotte Julien <<a href="mailto:Jonathan.rajotte-julien@efficios.com">Jonathan.rajotte-julien@<wbr>efficios.com</a> <mailto:<a href="mailto:Jonathan.rajotte-julien@efficios.com">Jonathan.rajotte-<wbr>julien@efficios.com</a>>> wrote:<br>
><br>
> Hi,<br>
><br>
> On 2017-09-20 05:12 AM, liguang li wrote:<br>
> ><br>
> ><br>
</span><span class="gmail-">> > On Tue, Sep 19, 2017 at 10:57 PM, Jonathan Rajotte-Julien <<a href="mailto:jonathan.rajotte-julien@efficios.com">jonathan.rajotte-julien@<wbr>efficios.com</a> <mailto:<a href="mailto:jonathan.rajotte-julien@efficios.com">jonathan.rajotte-<wbr>julien@efficios.com</a>> <mailto:<a href="mailto:jonathan.rajotte-julien@efficios.com">jonathan.rajotte-<wbr>julien@efficios.com</a> <mailto:<a href="mailto:jonathan.rajotte-julien@efficios.com">jonathan.rajotte-<wbr>julien@efficios.com</a>>>> wrote:<br>
> ><br>
> > On Tue, Sep 19, 2017 at 03:53:27PM +0800, liguang li wrote:<br>
> > > Hi,<br>
> > > On Mon, Sep 18, 2017 at 11:18 PM, Jonathan Rajotte-Julien<br>
</span><span class="gmail-">> > > <[1]<a href="mailto:jonathan.rajotte-julien@efficios.com">jonathan.rajotte-julien@<wbr>efficios.com</a> <mailto:<a href="mailto:jonathan.rajotte-julien@efficios.com">jonathan.rajotte-<wbr>julien@efficios.com</a>> <mailto:<a href="mailto:jonathan.rajotte-julien@efficios.com">jonathan.rajotte-<wbr>julien@efficios.com</a> <mailto:<a href="mailto:jonathan.rajotte-julien@efficios.com">jonathan.rajotte-<wbr>julien@efficios.com</a>>>> wrote:<br>
> > ><br>
> > > Hi,<br>
> > > On Mon, Sep 18, 2017 at 11:32:07AM +0800, liguang li wrote:<br>
> > > > > Hi,
> > > > >
> > > > > Create a session in live-reading mode, run an application which has a very
> > > > > high event throughput, then print the events with babeltrace. We found the
> > > > > live trace viewer is showing events from a few seconds ago, and as time
> > > >
> > > > Could you provide us the versions used for babeltrace, lttng-tools and
> > > > lttng-ust?
> > > >
> > > > Babeltrace: 1.5.1
> > >
> > > Update to babeltrace 1.5.3.
> > >
> > > > Lttng-tools: 2.8.6
> > >
> > > Update to lttng-tools 2.8.8.
> > >
> > > > Lttng-ust: 2.8.2
> > > >
> > > > > goes on, the delay gets bigger and bigger.
> > > >
> > > > A similar issue was observed a couple of months ago, involving multiple
> > > > delayed-ack problems in the communication between lttng-relayd and babeltrace.
> > > >
> > > > The following fixes were merged:
> > > >
> > > > [1] https://github.com/lttng/lttng-tools/commit/b6025e9476332b75eb8184345c3eb3e924780088
> > > > [2] https://github.com/efficios/babeltrace/commit/de417d04317ca3bc30f59685a9d19de670e4b11d
> > > > [3] https://github.com/efficios/babeltrace/commit/4594dbd8f7c2af2446a3e310bee74ba4a2e9d648
> > > >
> > > > In the event that you are already using updated versions of babeltrace and
> > > > lttng-tools, it would be pertinent to provide us with a simple reproducer so
> > > > we can assess the issue.
> > >
> > > The versions you are using do not include the mentioned fixes. Please update
> > > and redo your experiment.
> > >
> > >
> > > I tested with the versions you listed, and the issue still exists.
> >
> > Given that previous versions had a major timing problem there, I would expect to see some improvement.
> >
> > In that case, we will need a lot more information on your benchmarking strategy.
> > We will need a simple reproducer, your benchmark code (R, gnuplot, etc.), and your
> > overall methodology to be able to reproduce the issue locally. Otherwise, it will
> > be very hard to come to any conclusion.
> >
> > Sorry, we cannot provide detailed information about the benchmark code due to company regulations.
> >
> > >
> > > Cheers
> > >
> > > >
> > > > Steps:
> > > > lttng create session --live -U net://*
> > > > lttng enable-channel -s session -u ch1
> > > > lttng enable-event -s session -c ch1 -u -a
> > > > lttng start
> > > >
> > > > Run a high event throughput application, which is a multithreaded application.
> > >
> > > In the multithreaded application, each tracepoint records the system wall
> > > time, so we can easily reproduce this issue by comparing the time of the
> > > recorded event with the system wall time.
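> > >
> > > For instance, a rough way to watch this lag from the viewer side (a sketch
> > > only; it assumes babeltrace's default text output, where each line starts
> > > with a bracketed [HH:MM:SS.nnnnnnnnn] timestamp, and the live URL below is
> > > a placeholder, not our real setup):
> > >
> > > babeltrace -i lttng-live net://relayhost/host/target/mysession | \
> > > while IFS= read -r line; do
> > >     ev=${line%%]*}; ev=${ev#[}    # event timestamp, e.g. 17:10:42.123456789
> > >     printf 'event %s    wall clock %s\n' "$ev" "$(date +%H:%M:%S)"
> > > done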
> > >
> > > > babeltrace -i lttng-live net://*
> > > >
> > > > After a while, we found that the timestamps of the events printed by
> > > > babeltrace differ from the time on the host which runs the application,
> > > > and the delay gets bigger and bigger as time goes on.
> > > >
> > > > Cheers
> > > > > I checked the source code and found that Babeltrace in live-reading mode
> > > > > will read the recorded events in the CTF files, and then parse and print
> > > > > them in a single thread. The process is a little slow; do you have any
> > > > > ideas to improve it?
> > >
> > > From my understanding of the source code, the process of parsing and printing
> > > events consumes a lot of time. For example, the multithreaded application uses
> > > 3 CPUs, so in a given time interval 3 sub-buffers are filled, sent to the
> > > lttng-relayd daemon, and recorded into the CTF files. If, in that same interval,
> > > babeltrace only handles the events of 2 sub-buffers, then the issue happens.
> >
> > Did you perform a bisection of where the delay comes from? Reception of packets? Formatting of events?
>
> ?
>
> > What is the throughput of the application?
>
> ?
> This is a key element if anyone wants to test at home. It can be expressed in events/s or events/ms.
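>
> For example, with the trace recorded on disk, a rough estimate could be obtained
> along these lines (the path is a placeholder; this assumes the default text
> output, which prints one event per line):
>
> n=$(babeltrace /path/to/trace | wc -l)     # total number of events
> babeltrace /path/to/trace | head -n 1      # timestamp of the first event
> babeltrace /path/to/trace | tail -n 1      # timestamp of the last event
> # events per second ~= n / (last timestamp - first timestamp)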
>
> > How many tracepoint definitions?
> > Does babeltrace catch up if a quiescent period is given?
> >
> > I think so.
> >
> > Could you provide us with statistics, timing data, etc.?
> >
> > What type of delay are we talking about?
> >
> > For example, the system wall time is now 5:18 PM, but Babeltrace is still printing
> > events timestamped around 5:10 PM. And after stopping the application, Babeltrace
> > still needs some time to print the recorded events.
> >
> > I think the root cause is that the parsing and printing process is a bit slow. So I
> > want to know whether there is any method to improve the performance of this process.
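> >
> > One way to check where the time actually goes (a sketch; it assumes the trace is
> > also recorded on disk and that this babeltrace build includes the dummy output
> > format) would be to compare reading with and without text formatting:
> >
> > time babeltrace /path/to/trace > /dev/null              # read + parse + format, no terminal output
> > time babeltrace --output-format dummy /path/to/trace    # read + parse only, skip text formatting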
>
> A simple experiment would be to run a bounded workload to perform a quick timing
> evaluation of babeltrace with a streamed trace (on disk) and in live mode.
> Calculate the time it takes babeltrace to read the on-disk trace
> (time babeltrace path_to_streamed_trace) and perform the same experiment with a
> live trace. It is important to bound your experiment and have mostly the same
> workload in both experiments.
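>
> Something along these lines (the path and the live URL are placeholders; the
> workload has to stop after a fixed time so that both runs are bounded, and the
> live run will only terminate once the tracing session is destroyed):
>
> # 1) bounded workload, trace streamed to lttng-relayd and stored on disk
> time babeltrace /path/to/recorded/trace > /dev/null
>
> # 2) same bounded workload, read live while it runs
> time babeltrace -i lttng-live net://relayhost/host/target/mysession > /dev/null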

I made a simple example to illustrate the issue I have met.

I ran the application for about 60 seconds, and the trace was stored on disk.
Reading that trace with babeltrace took about 150 seconds, so the processing
speed is much lower than the speed at which the trace is generated.
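
In other words, babeltrace consumes the trace at roughly 60 / 150 = 0.4x the speed
at which it is produced, so the backlog grows by about 0.6 seconds for every second
of tracing, i.e. roughly 36 seconds of additional delay per minute the application
runs, which matches the ever-growing lag we see in live mode.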
I think the performance of babeltrace in live mode will not be greater than when
reading the trace on disk. So in live mode the timestamps of the events will lag
(babeltrace prints events with timestamps from a few minutes ago). I wonder
whether there are any methods to improve the performance of babeltrace in live
mode.

Regards,
Liguang

> This will give us a rough estimate of the disparity between both scenarios.
>
> Cheers