[lttng-dev] lttng snapshots and running traces

Julien Desfossez jdesfossez at efficios.com
Thu Sep 26 11:18:19 EDT 2013


On 13-09-25 05:18 PM, Thibault, Daniel wrote:
>    How does the 'lttng snapshot record' command affect a running (active) trace?  I presume very little, only to the extent that the consumer daemon servicing the request "steals" CPU cycles from the session daemon busily shoving records into the buffer.
Correct, but the session daemon is not responsible for writing records
into the buffers; the tracer is (either in the application or in the
kernel).

>    Say we have a trace running in flight recorder mode and a heavy flow of events into its buffers.  When the 'lttng snapshot record' command is issued, a consumer starts at the earliest (oldest) sub-buffer and starts dumping the records to a trace directory.  As its "cursor" advances around the ring of buffers, tracing continues.  By the time it wraps around the ring, the tracer may very well have re-used a number of sub-buffers, so the consumer keeps going, trying to catch up.  Am I right in supposing that if the reading and writing speeds are well matched, this could theoretically go on nearly forever, generating a huge trace?  (That's what the snapshot --max-size option is for, along with the enable-channel --tracefile-size and --tracefile-count options)
When we issue the "snapshot record" command, the consumer takes the
current reading and writing positions in the ring buffer and only
tries to consume the data between these boundaries.
It starts at the reading position and consumes subbuffer by subbuffer
up to the writing position. Unlike in a normal tracing session, the
reading position is not pushed by the consumer but by the tracer: it
corresponds to the oldest subbuffer not yet reused (or 0 before the
first ring-buffer wrap-around).
If the tracer is quickly filling up subbuffers and we don't have enough
subbuffers (or they are too small) to give us enough time to extract
them, they will be overwritten (overwrite is the default mode for
snapshot sessions) and we will skip these subbuffers.
So there is no race between the consumer and the tracer: we rely on the
absolute position (a free-running counter) to detect and skip any
subbuffer that has been reused.
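
To make the bounded walk concrete, here is a toy model in Python. It is
only a sketch under stated assumptions, not the LTTng implementation:
positions are counted in whole subbuffers rather than bytes, and the
names Ring, produce, snapshot and after_each are hypothetical. It shows
the boundaries being sampled once, and a reused subbuffer being detected
by comparing absolute positions.

```python
class Ring:
    """Toy overwrite-mode ring buffer; positions count whole subbuffers."""

    def __init__(self, n_subbuf):
        self.n = n_subbuf
        self.slots = [None] * n_subbuf  # each slot: (absolute_position, payload)
        self.write_pos = 0              # free-running counter, never wraps

    @property
    def read_pos(self):
        # Oldest subbuffer not yet reused; 0 until the first wrap-around.
        return max(0, self.write_pos - self.n)

    def produce(self, payload):
        # The tracer fills the next subbuffer, overwriting the oldest one.
        self.slots[self.write_pos % self.n] = (self.write_pos, payload)
        self.write_pos += 1


def snapshot(ring, after_each=None):
    # Boundaries are sampled once; the walk never chases the writer.
    start, end = ring.read_pos, ring.write_pos
    copied, skipped = [], []
    for pos in range(start, end):
        seq, payload = ring.slots[pos % ring.n]
        if seq == pos:
            copied.append(payload)   # slot still holds the expected data
        else:
            skipped.append(pos)      # slot was reused by the tracer: skip it
        if after_each is not None:
            after_each()             # hook to simulate a concurrent tracer
    return copied, skipped
```

With 4 subbuffers and 6 produced, a quiet snapshot copies positions 2
to 5. If the hook produces two subbuffers per step, only position 2 is
copied, the rest are skipped, and the walk still stops at position 6.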

The --max-size option lets you limit the size of the snapshot if the
"gap" between the reading and writing positions is too big.
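
As a rough illustration of such a limit, the sketch below trims the
range of positions to a byte budget. The function name limit_snapshot
and its parameters are hypothetical, positions are counted in whole
subbuffers, and keeping the newest data when trimming is an assumption
made for the example; nothing above states which end LTTng drops.

```python
def limit_snapshot(start, end, subbuf_size, max_size=None):
    """Return (start, end) positions trimmed to fit max_size bytes."""
    if max_size is None:
        return start, end                # no limit requested
    budget = max_size // subbuf_size     # whole subbuffers that fit
    if end - start <= budget:
        return start, end                # the gap already fits
    # ASSUMPTION: drop the oldest subbuffers and keep the newest ones.
    return end - budget, end
```

For example, a gap of 8 subbuffers of 4096 bytes with a 16384-byte
limit is reduced to the 4 subbuffers nearest the writing position.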

I hope this clarifies the situation,


> Daniel U. Thibault
> Protection des systèmes et contremesures (PSC) | Systems Protection & Countermeasures (SPC)
> Cyber sécurité pour les missions essentielles (CME) | Mission Critical Cyber Security (MCCS)
> R & D pour la défense Canada - Valcartier (RDDC Valcartier) | Defence R&D Canada - Valcartier (DRDC Valcartier)
> 2459 route de la Bravoure
> Québec QC  G3J 1X5
> Vox : (418) 844-4000 x4245
> Fax : (418) 844-4538
> NAC : 918V QSDJ <http://www.travelgis.com/map.asp?addr=918V%20QSDJ>
> Gouvernement du Canada | Government of Canada
> <http://www.valcartier.drdc-rddc.gc.ca/>
