[lttng-dev] Relayd trace drops

Jérémie Galarneau jeremie.galarneau at efficios.com
Fri Dec 4 07:41:18 EST 2015


Hi Aravind,

Can't say I have looked at everything you sent yet, but as a
preemptive question, which version are we talking about here? 2.6.0 or
2.6.1? 2.6.1 contains a lot of relay daemon fixes.

Thanks,
Jérémie

On Thu, Dec 3, 2015 at 7:01 AM, Aravind HT <aravind.ht at gmail.com> wrote:
> Hi,
>
> I am trying to obtain the performance characteristics of lttng with the use
> of test applications. Traces are being produced on a local node and
> delivered to relayd that is running on a separate node for storage.
>
> An lttng session with the test applications producing an initial bit rate of
> 10 kb/s is started and run for about 30 seconds. The starting sub-buffer
> size is kept at 128 kb and sub-buf count at 4. The session is then stopped
> and destroyed and traces are analyzed to see if there are any drops. This is
> being done in a loop with every subsequent session having an increment of 2
> kb/s as long as there are no drops. If there are drops, I double the
> buffer size without incrementing the bit rate.
>
> I see trace drops happening consistently with test apps producing traces at
> less than 40 kb/s. It doesn't seem to help even if I start with 1 MB x 4
> sub-buffers.
>
> Analysis:
>
> I have attached the lttng_relayd and lttng_consumerd_64 logs and the entire
> trace directory; I hope you will be able to view them.
> I have modified the lttng_relayd code to dump the traces being captured into
> the lttng_relayd logs along with debug info.
>
> Each test app is producing logs in the form:
> "TraceApp PID - 31940 THID - 31970 @threadRate - 1032 b/s appRate - 2079 b/s
> threadTraceNum - 9 appTraceNum - 18  sleepTime - 192120"
>
> The test application PID, test application thread ID, thread bit rate, test
> app bit rate, thread trace number and application trace number are part of
> the trace. So in the above trace, the thread is producing at 1 kb/s and the
> whole test app is producing at 2 kb/s.
>
> If we look at the babeltrace output, we see that the trace with TraceApp
> PID - 31940 appTraceNum 2 is missing, with 1, 3, 4, 5 and so on being
> successfully captured.
> I looked at the lttng_relayd logs and found that the trace with
> "appTraceNum 2" is not delivered by the consumerd to the relayd in sequence
> with the other traces. To rule out a test application problem, you can look
> at line 12778 of the lttng_relayd log and see that traces from appTraceNum -
> 1 to appTraceNum - 18, including appTraceNum 2, are "re-delivered" by the
> consumerd to the relayd.
> Essentially, I see appTraceNum 1 through appTraceNum 18 being delivered
> twice: once individually, where appTraceNum 2 is missing, and once as a group
> at line 12778, where it is present.
>
>
> I would like help with:
> 1. Why are traces delivered twice? Is it by design or a genuine problem?
> 2. How can I avoid traces being dropped even though the buffers are
> sufficiently large?
>
>
> Regards,
> Aravind.
>
> _______________________________________________
> lttng-dev mailing list
> lttng-dev at lists.lttng.org
> http://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev
>
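In case it helps when comparing runs, the per-application check described in
the quoted message (is every appTraceNum present, and does any appear more
than once?) could be automated roughly as below. This is only a sketch under
assumptions: the payload format is taken from the example line quoted above,
each event is assumed to appear on a single line of text output, numbering is
assumed to start at 1, and audit_trace_numbers is a made-up helper name, not
part of LTTng or babeltrace.

```python
import re
from collections import Counter, defaultdict

# Matches the payload format quoted in the message above. Nothing is assumed
# about the rest of the line, only that each event sits on one line.
EVENT_RE = re.compile(r"TraceApp PID - (\d+).*?appTraceNum - (\d+)")


def audit_trace_numbers(lines):
    """Return {pid: (missing, duplicated)} for each application's appTraceNum."""
    seen = defaultdict(Counter)
    for line in lines:
        match = EVENT_RE.search(line)
        if match:
            pid, trace_num = int(match.group(1)), int(match.group(2))
            seen[pid][trace_num] += 1

    report = {}
    for pid, counts in seen.items():
        # Assumption: numbering starts at 1. Events lost after the highest
        # number seen cannot be detected this way.
        expected = range(1, max(counts) + 1)
        missing = [n for n in expected if n not in counts]
        duplicated = sorted(n for n, c in counts.items() if c > 1)
        report[pid] = (missing, duplicated)
    return report
```

Feeding it the text output of the trace (any iterable of lines, such as an
open file) would list appTraceNum 2 as missing for PID 31940. Running it on
the event lines dumped in the modified relayd log should instead report the
numbers that were delivered twice.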



-- 
Jérémie Galarneau
EfficiOS Inc.
http://www.efficios.com


