[lttng-dev] Relayd trace drops

Aravind HT aravind.ht at gmail.com
Thu Dec 3 07:01:52 EST 2015


Hi,

I am trying to characterize the performance of LTTng using test
applications. Traces are produced on a local node and delivered to a relayd
running on a separate node for storage.

An lttng session is started with the test applications producing an initial
bit rate of 10 kb/s and is run for about 30 seconds. The starting sub-buffer
size is 128 kB and the sub-buffer count is 4. The session is then stopped
and destroyed, and the traces are analyzed to see if there are any drops.
This is done in a loop, with every subsequent session increasing the bit
rate by 2 kb/s as long as there are no drops. If there are drops, I double
the buffer size without incrementing the bit rate.
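As a rough sketch of the loop described above: the function names are hypothetical, and the actual session (create, start, run for 30 seconds, stop, destroy, check for drops) is abstracted as a callable that returns True when drops were seen.

```python
from typing import Callable, List, Tuple

RATE_STEP_KB_S = 2   # rate increment after a clean run
SUBBUF_GROWTH = 2    # buffer size multiplier after a run with drops


def next_step(rate_kb_s: int, subbuf_kb: int, had_drops: bool) -> Tuple[int, int]:
    """Return the (rate, sub-buffer size) to use for the next session."""
    if had_drops:
        # Drops: keep the same rate, double the buffer size.
        return rate_kb_s, subbuf_kb * SUBBUF_GROWTH
    # No drops: keep the buffer size, raise the rate.
    return rate_kb_s + RATE_STEP_KB_S, subbuf_kb


def ramp(run_session: Callable[[int, int], bool],
         rate_kb_s: int = 10, subbuf_kb: int = 128,
         max_sessions: int = 20) -> List[Tuple[int, int, bool]]:
    """Run up to max_sessions sessions and record (rate, size, had_drops)."""
    history = []
    for _ in range(max_sessions):
        had_drops = run_session(rate_kb_s, subbuf_kb)
        history.append((rate_kb_s, subbuf_kb, had_drops))
        rate_kb_s, subbuf_kb = next_step(rate_kb_s, subbuf_kb, had_drops)
    return history
```

Here run_session is a placeholder for whatever drives the real session; a stand-in such as `lambda rate, size: rate > 12 and size < 256` is enough to exercise the stepping logic.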

I see trace drops happening consistently with test apps producing traces at
less than 40 kb/s. It does not seem to help even if I start with 1 MB x 4
sub-buffers.
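For a sense of scale, here is a back-of-the-envelope calculation of how long one ring buffer would take to fill if nothing were draining it. It assumes that "kb" means kilobytes (1000 bytes for rates, 1024 bytes for sizes) and ignores per-CPU buffer multiplicity; the function name is hypothetical.

```python
def seconds_to_fill(subbuf_size_bytes: int, subbuf_count: int,
                    rate_bytes_per_s: float) -> float:
    """Time for a producer at the given rate to fill every sub-buffer,
    assuming the consumer reads nothing in the meantime."""
    return subbuf_size_bytes * subbuf_count / rate_bytes_per_s


# Assumed units: sizes in bytes, rate of 40 kb/s taken as 40 000 bytes/s.
small = seconds_to_fill(128 * 1024, 4, 40_000)    # 128 kB x 4
large = seconds_to_fill(1024 * 1024, 4, 40_000)   # 1 MB x 4
print(f"128 kB x 4: {small:.1f} s, 1 MB x 4: {large:.1f} s")
```

Under these assumptions even the smaller configuration holds several seconds of data at the highest rate tested.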

Analysis:

I have attached the lttng_relayd and lttng_consumerd_64 logs and the entire
trace directory; I hope you will be able to view them.
I have modified the lttng_relayd code to dump the traces being captured into
the lttng_relayd logs along with debug info.

Each test app produces log lines of the form:
"TraceApp PID - 31940 THID - 31970 @threadRate - 1032 b/s appRate - 2079
b/s threadTraceNum - 9 appTraceNum - 18  sleepTime - 192120"

The test application PID, test application thread ID, thread bit rate, test
app bit rate, thread trace number and application trace number are part
of the trace. So in the above trace, the thread is producing at about 1 kb/s
and the whole test app is producing at about 2 kb/s.
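A minimal sketch of pulling these fields out of one such line, assuming every field follows the "name - number" pattern seen in the sample above (the helper name is hypothetical):

```python
import re
from typing import Dict, Optional

# Matches "name - 123"; the optional "@" covers "@threadRate".
FIELD_RE = re.compile(r"@?(\w+) - (\d+)")


def parse_trace_line(line: str) -> Optional[Dict[str, int]]:
    """Return the numeric fields of a TraceApp line, or None for other lines."""
    if "TraceApp" not in line:
        return None
    return {name: int(value) for name, value in FIELD_RE.findall(line)}


sample = ("TraceApp PID - 31940 THID - 31970 @threadRate - 1032 b/s "
          "appRate - 2079 b/s threadTraceNum - 9 appTraceNum - 18  "
          "sleepTime - 192120")
fields = parse_trace_line(sample)
print(fields["PID"], fields["appTraceNum"])
```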

If we look at the babeltrace output, we see that the trace with TraceApp
PID - 31940 appTraceNum 2 is missing, with 1, 3, 4, 5 and so on being
successfully captured.
I looked at the lttng_relayd logs and found that the trace with
"appTraceNum 2" is not delivered by the consumerd to the relayd in sequence
with the other traces. To rule out a test application problem, you can look
at line 12778 of the lttng_relayd log and see that the traces from
appTraceNum - 1 to appTraceNum - 18, including appTraceNum 2, are
"re-delivered" by the consumerd to the relayd.
Essentially, I see appTraceNum 1 through appTraceNum 18 being delivered
twice: once individually, where appTraceNum 2 is missing, and once as a
group at line 12778, where it is present.
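The kind of check I am doing by hand can be sketched as follows, assuming appTraceNum starts at 1 for each PID and increases by 1 per trace (the helper names are hypothetical):

```python
from collections import Counter
from typing import Iterable, List


def missing_numbers(seen: Iterable[int]) -> List[int]:
    """Return sequence numbers absent between 1 and the highest one seen."""
    present = set(seen)
    if not present:
        return []
    return [n for n in range(1, max(present) + 1) if n not in present]


def delivered_more_than_once(seen: Iterable[int]) -> List[int]:
    """Return sequence numbers that appear more than once, in sorted order."""
    counts = Counter(seen)
    return sorted(n for n, count in counts.items() if count > 1)


# Illustrative data shaped like the case above: the individual deliveries
# lack number 2, and the later group delivery contains 1 through 18.
individual = [1] + list(range(3, 19))
group = list(range(1, 19))
print(missing_numbers(individual))                  # numbers never seen individually
print(delivered_more_than_once(individual + group)) # numbers seen in both passes
```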


I would like help with the following:
1. Why are traces delivered twice? Is it by design or a genuine problem?
2. How can I avoid traces being dropped even though the buffers are
sufficiently large?


Regards,
Aravind.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: lttng_relayd.log-20151203-1449134101.gz
Type: application/x-gzip
Size: 252135 bytes
Desc: not available
URL: <http://lists.lttng.org/pipermail/lttng-dev/attachments/20151203/4b80aa11/attachment-0002.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: lttng_consumerd_64.rar
Type: application/rar
Size: 7446727 bytes
Desc: not available
URL: <http://lists.lttng.org/pipermail/lttng-dev/attachments/20151203/4b80aa11/attachment-0001.rar>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: profiling_traces.tar.gz
Type: application/x-gzip
Size: 53800 bytes
Desc: not available
URL: <http://lists.lttng.org/pipermail/lttng-dev/attachments/20151203/4b80aa11/attachment-0003.bin>

