[lttng-dev] Relayd trace drops

Aravind HT aravind.ht at gmail.com
Fri Dec 4 23:06:53 EST 2015


I am using 2.6.0. I will try to share the code that I'm using here
shortly. If there are any specific fixes that are relevant to this issue,
could you provide a link to them? I would ideally like to try them out
before attempting a full upgrade to the latest version.

On Fri, Dec 4, 2015 at 6:11 PM, Jérémie Galarneau <
jeremie.galarneau at efficios.com> wrote:

> Hi Aravind,
>
> Can't say I have looked at everything you sent yet, but as a
> preemptive question, which version are we talking about here? 2.6.0 or
> 2.6.1? 2.6.1 contains a lot of relay daemon fixes.
>
> Thanks,
> Jérémie
>
> On Thu, Dec 3, 2015 at 7:01 AM, Aravind HT <aravind.ht at gmail.com> wrote:
> > Hi,
> >
> > I am trying to characterize the performance of lttng using test
> > applications. Traces are produced on a local node and delivered to a
> > relayd running on a separate node for storage.
> >
> > An lttng session, with the test applications producing at an initial bit
> > rate of 10 kb/s, is started and run for about 30 seconds. The initial
> > sub-buffer size is 128 kb and the sub-buffer count is 4. The session is
> > then stopped and destroyed, and the traces are analyzed for drops. This
> > is repeated in a loop, with each subsequent session raising the bit rate
> > by 2 kb/s as long as there are no drops. If there are drops, I double the
> > buffer size without incrementing the bit rate.
> >
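
The ramp-up procedure described above can be sketched roughly as follows.
This is only an illustration, not the actual test harness: `has_drops` is a
hypothetical callback standing in for "run one ~30 second session at this
bit rate and sub-buffer size, then check the traces for drops", and the
`max_rate_kb` / `max_subbuf_kb` stop conditions are assumptions added so
that the loop terminates.

```python
from typing import Callable, List, Tuple


def ramp_test(has_drops: Callable[[int, int], bool],
              start_rate_kb: int = 10,
              rate_step_kb: int = 2,
              start_subbuf_kb: int = 128,
              max_rate_kb: int = 40,       # assumed stop condition
              max_subbuf_kb: int = 1024    # assumed stop condition
              ) -> List[Tuple[int, int, bool]]:
    """Return (rate_kb, subbuf_kb, dropped) for every session that was run."""
    rate, subbuf = start_rate_kb, start_subbuf_kb
    history = []
    while rate <= max_rate_kb and subbuf <= max_subbuf_kb:
        # Hypothetical: run one session and report whether events were lost.
        dropped = has_drops(rate, subbuf)
        history.append((rate, subbuf, dropped))
        if dropped:
            # Drops seen: double the sub-buffer size, keep the same bit rate.
            subbuf *= 2
        else:
            # No drops: raise the bit rate for the next session.
            rate += rate_step_kb
    return history
```

For example, `ramp_test(lambda rate, subbuf: rate >= 14 and subbuf < 256,
max_rate_kb=16)` simulates a setup that starts dropping at 14 kb/s until the
sub-buffers reach 256 kb.
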
> > I see trace drops happening consistently with test apps producing traces
> > at less than 40 kb/s. It doesn't seem to help even if I start with
> > 1 MB x 4 sub-buffers.
> >
> > Analysis:
> >
> > I have attached the lttng_relayd and lttng_consumerd_64 logs and the
> > entire trace directory; I hope you will be able to view them.
> > I have modified the lttng_relayd code to dump the traces being captured
> > into the lttng_relayd logs along with debug info.
> >
> > Each test app produces log lines of the form:
> > "TraceApp PID - 31940 THID - 31970 @threadRate - 1032 b/s appRate - 2079 b/s
> > threadTraceNum - 9 appTraceNum - 18  sleepTime - 192120"
> >
> > The test application PID, thread ID, thread bit rate, application bit
> > rate, thread trace number and application trace number are part of each
> > trace. So in the above trace, the thread is producing at about 1 kb/s
> > and the whole test app at about 2 kb/s.
> >
> > If we look at the babeltrace output, we see that the trace with TraceApp
> > PID - 31940 and appTraceNum 2 is missing, while 1, 3, 4, 5 and so on are
> > captured successfully.
> > I looked at the lttng_relayd logs and found that the trace with
> > "appTraceNum 2" is not delivered/generated by the consumerd to the relayd
> > in sequence with the other traces. To rule out a test application
> > problem, you can look at line 12778 of the lttng_relayd log and see that
> > traces appTraceNum - 1 to appTraceNum - 18, including appTraceNum 2, are
> > "re-delivered" by the consumerd to the relayd.
> > Essentially, I see appTraceNum 1 through appTraceNum 18 being delivered
> > twice: once individually, where appTraceNum 2 is missing, and once as a
> > group at line 12778, where it is present.
> >
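
The gap check described above can be done mechanically on the text output.
The following is a minimal sketch, assuming each record sits on a single
line of the babeltrace (or relayd log) text and contains the "PID - N" and
"appTraceNum - N" fields exactly as in the sample log line, and that
appTraceNum starts at 1 for every application. The function name and the
regular expression are illustrative only.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List

# Matches the PID and appTraceNum fields of the test application's message.
RECORD = re.compile(r"PID - (\d+).*?appTraceNum - (\d+)")


def missing_trace_numbers(lines: Iterable[str]) -> Dict[int, List[int]]:
    """Map each PID to the appTraceNum values absent from 1..max seen."""
    seen = defaultdict(set)
    for line in lines:
        match = RECORD.search(line)
        if match:
            seen[int(match.group(1))].add(int(match.group(2)))
    # Only gaps below the highest number seen are reported; records lost
    # at the very end of a run cannot be detected this way.
    return {pid: sorted(set(range(1, max(nums) + 1)) - nums)
            for pid, nums in seen.items()}
```

Feeding it the text output line by line, for example
`missing_trace_numbers(open("trace.txt"))` with a hypothetical dump file,
returns a dictionary keyed by PID; an empty list means no gap was found for
that application.
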
> >
> > I would like help with:
> > 1. Why are traces delivered twice? Is this by design or a genuine problem?
> > 2. How can I avoid traces being dropped even though the buffers are
> > sufficiently large?
> >
> >
> > Regards,
> > Aravind.
> >
> > _______________________________________________
> > lttng-dev mailing list
> > lttng-dev at lists.lttng.org
> > http://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev
> >
>
>
>
> --
> Jérémie Galarneau
> EfficiOS Inc.
> http://www.efficios.com
>