[lttng-dev] lttng snapshots and running traces
Thibault, Daniel
Daniel.Thibault at drdc-rddc.gc.ca
Thu Sep 26 15:36:27 EDT 2013
Sent: September 26, 2013 14:33
>> Is the following scenario possible? The consumer tries to get_subbuf, and is denied access (because tracers
>> are writing into it). It then tries to get the next sub-buffer, but the luck of task scheduling is such that by the
>> time it actually calls the request routine, the tracers have already overwritten the sub-buffer in question and
>> gone on to the next one. The get_subbuf succeeds and the consumer keeps doing its thing (the timestamps
>> remain monotonically increasing, so there is no corruption). The resulting snapshot on disk has a large jump
>> in its timestamps, having lost a whole buffer cycle's worth of event records.
>>
>> To complicate things, the tracers then stop (for lack of event occurrences) in some sub-buffer ahead of the
>> consumer's current position, but still short of the consumer's stop goal. When the consumer skips over the
>> busy sub-buffer (where the tracers are "stuck"), it would read events that apparently jump back in time (a
>> whole buffer cycle's worth of time), so I guess it would stop there and close the snapshot?
>
> One important design note to understand this situation: even though we write in a ring-buffer, the positions
> are free-running counters, they "never" wrap-around (well, we are dealing with 64-bit counters). In order to
> map a position to an actual subbuffer, we use a sort of modulo. This operation gives us the actual subbuffer
> to use and also allows us to detect if the tracer has already reused this subbuffer.
Good to know, but it doesn't answer the question. :-)
If the tracers pass the consumer, and then the consumer passes the tracers, does the consumer stop copying
the buffer contents to the snapshot trace?
* consumer reads a number of "pass n" sub-buffers (where n is the number of times the tracers have gone around)
* tracers catch up with the consumer, skip ahead, write at least one whole "pass n+1" sub-buffer
* tracers then stop or slow down
* consumer reads the "pass n+1" sub-buffer(s), then overtakes the tracers
* consumer skips over the sub-buffer where the tracers are, finds a "pass n" sub-buffer
If the consumer does not close the trace at that point, it will corrupt the snapshot: "pass n" event records would follow "pass n+1" records.
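
To make the scenario concrete, here is a minimal sketch of how free-running positions and the "sort of modulo" mentioned in the quoted reply could let a reader tell the three cases apart. It is not LTTng code: the names RingBuffer, subbuf_index, pass_number and classify are hypothetical, as is the assumption that each sub-buffer remembers the pass number it was last written with.

```python
from dataclasses import dataclass


@dataclass
class RingBuffer:
    """Hypothetical ring of equally sized sub-buffers (not the LTTng structure)."""
    num_subbuf: int   # number of sub-buffers in the ring
    subbuf_size: int  # bytes per sub-buffer

    def subbuf_index(self, position: int) -> int:
        # Map a free-running byte position to the sub-buffer slot it falls in.
        return (position // self.subbuf_size) % self.num_subbuf

    def pass_number(self, position: int) -> int:
        # Number of complete trips around the ring made before this position.
        return position // (self.subbuf_size * self.num_subbuf)


def classify(ring: RingBuffer, consumer_pos: int, slot_pass: int) -> str:
    """Compare the pass the consumer expects with the pass stored in the slot."""
    expected = ring.pass_number(consumer_pos)
    if slot_pass == expected:
        return "current"      # safe to copy into the snapshot
    if slot_pass > expected:
        return "overwritten"  # tracers lapped the consumer; data was lost
    return "stale"            # slot still holds the previous pass


ring = RingBuffer(num_subbuf=4, subbuf_size=4096)
cycle = ring.num_subbuf * ring.subbuf_size

# Consumer wants slot 0 of pass 2, but the tracers already rewrote it in pass 3.
print(classify(ring, 2 * cycle, slot_pass=3))

# Consumer, now in pass 3, skips the busy slot 2 and lands on slot 3,
# which the tracers have not reached yet and which still holds pass 2.
pos = 3 * cycle + 3 * ring.subbuf_size
print(ring.subbuf_index(pos), classify(ring, pos, slot_pass=2))
```

Under these assumptions the last bullet above shows up as a "stale" result, which is the point where the consumer would have to stop to keep the snapshot consistent. Whether the actual consumer performs an equivalent check is exactly the question being asked.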
Daniel U. Thibault
Protection des systèmes et contremesures (PSC) | Systems Protection & Countermeasures (SPC)
Cyber sécurité pour les missions essentielles (CME) | Mission Critical Cyber Security (MCCS)
R & D pour la défense Canada - Valcartier (RDDC Valcartier) | Defence R&D Canada - Valcartier (DRDC Valcartier)
2459 route de la Bravoure
Québec QC G3J 1X5
CANADA
Vox : (418) 844-4000 x4245
Fax : (418) 844-4538
NAC : 918V QSDJ <http://www.travelgis.com/map.asp?addr=918V%20QSDJ>
Gouvernement du Canada | Government of Canada
<http://www.valcartier.drdc-rddc.gc.ca/>