[lttng-dev] background information about LTTng timestamps
Mathieu Desnoyers
compudj at krystal.dyndns.org
Thu Jan 26 14:51:51 EST 2012
* Sébastien Barthélémy (barthelemy at crans.org) wrote:
> 2012/1/26 Mathieu Desnoyers <compudj at krystal.dyndns.org>:
[...]
> > In LTTng 2.0/LTTng-UST 2.0, the scheme differs. It does not require a
> > timer anymore, so this whole problem goes away. Let me explain the 2.0
> > scheme a bit more:
> >
> > * On 64-bit architectures:
> >
> > - we keep a per-stream 64-bit last_tsc value (see lttng-ust
> > libringbuffer/frontend_internal.h save_last_tsc() surroundings)
> > - at each event, we store the full 64-bit timestamp within this
> > variable. Before we store it, we check if it has a delta of more than
> > a single N bits (e.g. 27 bits) overflow from the previous value.
> > - if more than one N-bit overflow is detected, we use a full 64-bit
> > timestamp for the event.
> >
> > --> actually the check in the code is probably not as strict as it
> > could be; we have:
> >
> >     if (caa_unlikely((tsc - v_read(config, &buf->last_tsc))
> >                      >> config->tsc_bits))
> >         return 1;
> >     else
> >         return 0;
> >
> > I think I could change it to:
> >
> >     if (caa_unlikely((tsc - v_read(config, &buf->last_tsc))
> >                      >> (config->tsc_bits + 1)))
> >
> > Because the current incarnation will require a full 64-bit timestamp
> > storage for single-bit overflows (which can be detected by the trace
> > reader). Thoughts ?
>
> I don't think it would be correct.
>
> Let's assume
>
> N == tsc_bits == 2
> last_tsc == 0b0000
> tsc == 0b0101 // we had an N-bit overflow, and more than 2**N ns elapsed
>
> we get
>
> (tsc - last_tsc) == 0b0101
> ((tsc - last_tsc) >> tsc_bits) == 0b0001 // the current scheme forces a full timestamp
> ((tsc - last_tsc) >> (tsc_bits+1)) == 0b0000 // the suggested change does not
>
> With the current scheme, we store 0b0000, 0b0101.
> With the suggested change, we would store 0b0000, 0b01. Time does not
> appear to rewind, so there is no way babeltrace could notice the
> overflow.
>
> Another example
>
> N == tsc_bits == 2
> last_tsc == 0b0011
> tsc == 0b0110 // we had an N-bit overflow, but less than 2**N ns elapsed
>
> we get
>
> (tsc - last_tsc) == 0b0011
> ((tsc - last_tsc) >> tsc_bits) == 0b0000 // the current scheme does not force a full timestamp
> ((tsc - last_tsc) >> (tsc_bits+1)) == 0b0000 // the suggested change does not either
>
> With both schemes, we store 0b0011, 0b10, and let babeltrace detect
> the overflow.
>
> I therefore think the current implementation is correct and optimal.
Ah, yes, given we check for the overflow on the delta between last_tsc
and current tsc, we need to switch to the full 64-bit container as soon
as N-bit overflow of the _delta_ is detected. Thanks for the counter
argument. There are some moments like this when I think that last year's
me was more clever than present me. ;-) Well in fact I guess it's really
just a result of being swapped out of this particular part of the
code-base for too long.
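
The two counter-examples can be replayed in a few lines of C. This is
only a sketch: needs_full_tsc() mirrors the quoted check,
needs_full_tsc_relaxed() is the withdrawn variant, and reader_rebuild()
is a hypothetical stand-in for what a trace reader such as babeltrace
might do, assuming it can only detect a single wrap of the compact
field. None of these names come from lttng-ust.

```c
#include <assert.h>
#include <stdint.h>

/* Writer-side check as quoted in this thread: nonzero when the delta
 * since the last event does not fit in tsc_bits bits, so the event
 * needs a full 64-bit timestamp. */
int needs_full_tsc(uint64_t tsc, uint64_t last_tsc, unsigned tsc_bits)
{
    assert(tsc_bits > 0 && tsc_bits < 64);
    return ((tsc - last_tsc) >> tsc_bits) != 0;
}

/* The variant that was suggested and then withdrawn: it only fires
 * once the delta no longer fits in tsc_bits + 1 bits. */
int needs_full_tsc_relaxed(uint64_t tsc, uint64_t last_tsc, unsigned tsc_bits)
{
    assert(tsc_bits > 0 && tsc_bits + 1 < 64);
    return ((tsc - last_tsc) >> (tsc_bits + 1)) != 0;
}

/* Hypothetical reader: rebuild a full timestamp from the previous full
 * value and the tsc_bits low-order bits stored with a compact event.
 * Assumption: the compact field wrapped at most once. */
uint64_t reader_rebuild(uint64_t prev, uint64_t low, unsigned tsc_bits)
{
    assert(tsc_bits > 0 && tsc_bits < 64);
    uint64_t mask = (UINT64_C(1) << tsc_bits) - 1;
    uint64_t cur = (prev & ~mask) | (low & mask);

    if (cur < prev)      /* compact field went backwards: one wrap */
        cur += mask + 1;
    return cur;
}
```

With tsc_bits == 2, last_tsc == 0b0000 and tsc == 0b0101,
needs_full_tsc() returns 1 while needs_full_tsc_relaxed() returns 0, and
reader_rebuild(0, 0b01, 2) yields 0b0001 rather than 0b0101. With
last_tsc == 0b0011 and tsc == 0b0110, neither check fires and
reader_rebuild(3, 0b10, 2) recovers 0b0110.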
>
> > * On 32-bit architectures:
> >
> > - idea very similar to the 64-bit architecture case, but we cannot
> > do a fast 64-bit atomic read nor write. Therefore, we save only the
> > high-order bits that are needed to detect the overflow (we can
> > discard the N low-order bits).
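
One plausible reading of the 32-bit case, as a minimal sketch: keep
only (tsc >> tsc_bits) in a single 32-bit word, and request a full
timestamp whenever those high-order bits differ from the saved ones.
The struct and function names below are hypothetical, and the "any
change" policy is an assumption rather than a description of the
lttng-ust code. It is conservative: a delta of 2**tsc_bits or more
always changes the high-order bits, so it is never missed, at the cost
of an occasional full timestamp for a small delta that happens to cross
a wrap boundary.

```c
#include <stdint.h>

/* Hypothetical per-stream state on a 32-bit build: one machine word
 * that can be read and written atomically. */
struct stream_state {
    uint32_t last_tsc_high;   /* (tsc >> tsc_bits), truncated to 32 bits */
};

/* Save only the high-order bits; the tsc_bits low-order bits are
 * discarded because the compact event field already carries them. */
void save_last_tsc_high(struct stream_state *s, uint64_t tsc,
                        unsigned tsc_bits)
{
    s->last_tsc_high = (uint32_t)(tsc >> tsc_bits);
}

/* Nonzero when the high-order bits changed since the last saved event.
 * Aliasing after 2**(tsc_bits + 32) ticks is ignored in this sketch. */
int tsc_high_changed(const struct stream_state *s, uint64_t tsc,
                     unsigned tsc_bits)
{
    return (uint32_t)(tsc >> tsc_bits) != s->last_tsc_high;
}
```

For example, with tsc_bits == 2, saving tsc == 0b0011 and then checking
tsc == 0b0110 reports a change, even though the 64-bit delta check would
have let the reader handle that single wrap.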
>
> I think there was an error here: we discarded the 32 low-order bits instead of
> the N == 27 ones. See below.
I don't think the error is specific to the 32-bit architecture support,
but yes, you spotted the same thing I spotted about an hour ago (and
would have completed the fix and tested it were it not for the meeting I
had). I mistakenly specified "32" instead of "27" in the ring buffer
configuration. I'll try with 27 ASAP (for both kernel and UST tracers).
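
To illustrate the effect of that misconfiguration, here is a small
sketch. It assumes the compact timestamp field is 27 bits wide (the
value discussed in this thread) and uses hypothetical function names.
It compares the overflow check run with tsc_bits == 32 against the same
check with tsc_bits == 27, and counts how many times the 27-bit field
wrapped in between.

```c
#include <stdint.h>

#define FIELD_BITS 27u   /* assumed width of the compact timestamp field */

/* Same delta check as quoted in the thread, with the shift width
 * passed in so both configurations can be compared. */
int needs_full_tsc_cfg(uint64_t tsc, uint64_t last_tsc, unsigned tsc_bits)
{
    return ((tsc - last_tsc) >> tsc_bits) != 0;
}

/* Number of times the FIELD_BITS-wide field wrapped between two
 * timestamps. A reader can infer at most one wrap by itself. */
uint64_t field_wraps(uint64_t tsc, uint64_t last_tsc)
{
    return (tsc >> FIELD_BITS) - (last_tsc >> FIELD_BITS);
}
```

With last_tsc == 0 and tsc == 3 << 27, the check configured with 32
returns 0 although the field wrapped three times, whereas the check
configured with 27 returns 1. Any delta in [2**27, 2**32) slips through
the 32-bit configuration; if timestamps are in nanoseconds, that is
roughly any gap between 0.134 s and 4.29 s.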
>
> > All this being said, I need to play with the code a little more to
> > understand what is going on in LTTng-UST.
>
> In your aae88c703374f4b1fbb8a5e7e95591bf8ce3e837 commit, you changed
>
> tsc_bits from 32 to 27 in liblttng-ust/ltt-ring-buffer-client.h
>
> Would not that alone have fixed the problem?
> As explained in a previous email, I think, we were storing the bits
> 32..64 instead of 27..59 in last_tsc.
>
> Thus missing 2**5 overflows.
>
> What do you think?
I think it's almost certainly it. I'll let you know after some more
testing.
Best regards,
Mathieu
>
> -- Sébastien
>
> _______________________________________________
> lttng-dev mailing list
> lttng-dev at lists.lttng.org
> http://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev
>
--
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com