[lttng-dev] Sessions are disconnected when disk is full on remote host

Mikael Beckius mikael.beckius at ericsson.com
Wed Sep 23 05:48:54 EDT 2015


Hello!

I am trying to verify the expected behaviour when a machine hosting an lttng relay daemon runs out of disk space.

From testing and from inspecting the source code, it currently appears that whenever the lttng relay daemon hits an error caused by a lack of disk space, all failing sessions are disconnected. It also appears that, once disk space has been restored, the only way to recover is to recreate all the disconnected sessions.

I have also experimented with different combinations of the following lttng enable-channel options:
--discard
--overwrite
--tracefile-count <trace file count>
--tracefile-size <trace file size>

The initial channel configuration used the options --tracefile-count and --tracefile-size, but both were abandoned because Babeltrace faced constant streaming errors during heavy live tracing. As a result, the trace files now grow continuously until there is no available disk space.

I assume that --overwrite would not make any difference compared to --discard with regard to a disk full scenario. Even if you specify --tracefile-count and --tracefile-size to limit the amount of trace data from a single session, you may still face the same issue with disconnected sessions if you run out of disk space for some reason.
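
To make the sizing argument above concrete, here is a minimal Python sketch of the arithmetic. It assumes that each stream of a channel can hold at most --tracefile-count files of --tracefile-size bytes each, and it ignores metadata and live index files. The function names, the stream count and the figures are hypothetical and chosen only for illustration; none of this is part of LTTng.

```python
import shutil


def worst_case_bytes(tracefile_size, tracefile_count, streams):
    """Upper bound on trace data for one channel.

    Assumption: each stream keeps at most `tracefile_count` files of
    `tracefile_size` bytes. Metadata and index files are not counted.
    """
    return tracefile_size * tracefile_count * streams


def fits_on_disk(path, channels, reserve_bytes=0):
    """Compare the summed worst case of all channels with the free space at `path`.

    `channels` is a list of (tracefile_size, tracefile_count, streams) tuples.
    Returns (fits, needed_bytes, free_bytes).
    """
    needed = sum(worst_case_bytes(size, count, streams)
                 for size, count, streams in channels)
    free = shutil.disk_usage(path).free
    return needed + reserve_bytes <= free, needed, free


if __name__ == "__main__":
    # Hypothetical setup: one channel, 64 MiB files, 8 files per stream, 4 streams.
    channels = [(64 * 1024 * 1024, 8, 4)]
    fits, needed, free = fits_on_disk(".", channels)
    print(f"worst case: {needed} bytes, free: {free} bytes, fits: {fits}")
```

Even when such a check passes, other data written to the same file system can still fill the disk, which is the case described above.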

Testing was carried out on the latest 2.6 releases and all sessions were live sessions.

So my questions are:
- Is there any way to keep sessions alive even if you run out of disk space?
- How can you clean up old traces once you have collected them without recreating sessions?

Micke