[ltt-dev] LTTng specialized probes

Mathieu Desnoyers compudj at krystal.dyndns.org
Thu Oct 9 12:55:50 EDT 2008


* Martin Bligh (mbligh at google.com) wrote:
> > Yes, there is the problem of separation of the two layers, but there is
> > more to it than that: if we use a separate event to write the extended
> > TSC value, we will have to do this when writing an event in a lockless
> > scheme:
> >
> > - Do
> >  - read write offset
> >  - Read TSC, compare to per-buffer "last_tsc"
> >  - If 27-bit overflow, write large TSC event
> >    - Do
> >      - read write offset
> >      - read TSC
> >      - compute event size
> >    - while cmpxchg write offset fails
> >    - write per-buffer "last_tsc"
> >    - continue (restart loop)
> >  - compute event size
> > - while cmpxchg write offset fails
> > - write per-buffer "last_tsc"
> >
> > By doing this, we actually have to do 3 TSC reads when we detect an
> > overflow instead of a single one.
> 
> OK. TSC read is cheap
> 
> > Also, we open the window for an
> > infinite loop if for some weird reason the TSC read becomes slower than
> > a 27-bit overflow (with virtualization you never know what will bite
> > you...).
> 
> Umm, if it takes that long to read the TSC, I'd say your machine
> is utterly useless.  I think you're worrying too much about REALLY
> obscure corner cases. Either tracing is not supported on those
> platforms, or we eventually shift the TSC right by 10 bits and forgo
> any hope of resolution (since your timing is *completely* screwed
> at this point anyway).
> 

Ok, if the fact that the algorithm might loop in some obscure case does
not convince you, consider this other argument: by separating the large
TSC event from the actual event that would suffer from TSC overflow, we
tie the timestamping to space reservation. In a perfect world,
reservation would never fail, but the world being imperfect, we
sometimes lose events and end up with corrupted subbuffers, which means
we can actually lose data. It becomes problematic if we lose the large
TSC event but not the event following it: rather than just having one
event "lost", we end up with completely unreliable timings. This is why
making timestamping depend on space reservation is not such a good
idea.
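
To make the coupling concrete, here is a rough C sketch of the
reservation path quoted above, with the extended TSC written as a
separate event. This is not LTTng code: read_tsc(), try_reserve() and
the buffer fields are made-up stand-ins for the real primitives. It
shows the three TSC reads on overflow and the second, independent
reservation that the timestamp now depends on:

#include <stdint.h>
#include <stdbool.h>
#include <stddef.h>

#define TSC_BITS 27

struct buffer {
    uint64_t last_tsc;  /* TSC recorded by the previous event */
    /* ... write offset, subbuffer state, ... */
};

/* Hypothetical helpers standing in for the real primitives. */
extern uint64_t read_tsc(void);
/* Reserve 'size' bytes at the current write offset; returns false when
   the cmpxchg on the write offset loses the race and the attempt must
   be retried from the start. */
extern bool try_reserve(struct buffer *buf, size_t size, uint64_t tsc);

static void write_event(struct buffer *buf, size_t payload_size)
{
    uint64_t tsc;

    for (;;) {
        tsc = read_tsc();                       /* TSC read #1 */
        if ((tsc >> TSC_BITS) != (buf->last_tsc >> TSC_BITS)) {
            /* 27-bit overflow: reserve and write a separate
               "large TSC" event first, with its own retry loop
               and its own TSC read. */
            do {
                tsc = read_tsc();               /* TSC read #2 */
            } while (!try_reserve(buf, sizeof(uint64_t), tsc));
            buf->last_tsc = tsc;
            /* Restart the outer loop: TSC read #3. If reading the
               TSC ever takes longer than a 27-bit overflow period,
               this never terminates. */
            continue;
        }
        if (try_reserve(buf, payload_size, tsc))
            break;
        /* cmpxchg on the write offset failed: retry from the top. */
    }
    buf->last_tsc = tsc;
    /* ... copy payload, commit the slot ... */
}

If the "large TSC" event ends up in a subbuffer that is later dropped,
the surviving events that follow it refer to a time base the reader
never sees.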

Putting the large timestamp in the event header itself, as I propose
below, ensures that if the event which should contain the large TSC is
lost, the following event will itself carry a header with a large TSC.
Note that it adds no extra space cost in the high-throughput scenario
compared to the solution you propose, because your solution also
requires an event ID for the large timestamp event.
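
As a contrasting sketch (reusing the same hypothetical struct buffer,
read_tsc() and try_reserve() from above), the write path then needs a
single reservation and, per attempt, a single TSC read; the extended
timestamp cannot be lost independently of the event it belongs to:

static void write_event_ext_header(struct buffer *buf, size_t payload_size)
{
    uint64_t tsc;
    size_t header_size;
    bool ext_tsc;

    do {
        tsc = read_tsc();       /* one TSC read per attempt */
        ext_tsc = (tsc >> TSC_BITS) != (buf->last_tsc >> TSC_BITS);
        /* Compact header: one 32-bit word (27-bit TSC + 5-bit ID).
           On 27-bit overflow, the same header also carries the full
           64-bit TSC; both are covered by one reservation. */
        header_size = sizeof(uint32_t)
                      + (ext_tsc ? sizeof(uint64_t) : 0);
    } while (!try_reserve(buf, header_size + payload_size, tsc));
    buf->last_tsc = tsc;
    /* Because the full TSC travels in the same reserved slot as the
       event it timestamps, a surviving event is always self-describing:
       there is no separate time-base event that can be lost while this
       one survives. */
    /* ... write header (ext. form uses a reserved event ID), payload,
       commit ... */
}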

Given that it is just as compact and more robust, I don't see where the
problem is?

Mathieu

> > However, I like your idea of using this "ext. TSC" bit as an event bit.
> > We could do :
> >
> > (32-bit alignment)
> > 27-bit TSC
> > 5-bit event ID
> >  ID #31 reserved to specify extended event ID
> >  ID #30 reserved to specify both ext. event ID and event size
> >  ID #29 reserved to specify ext. event ID, event size and ext. TSC
> > <ext.>
> > 16-bit event ID (opt)
> > 16-bit event size (opt); size = 65535 specifies a large event size
> > 64-bit TSC (opt) (aligned on sizeof(void *))
> > 32-bit large event size (opt) (aligned on 32 bits)
> > (event payload aligned on sizeof(void *))
> >
> > This lets us put the extended TSC in the same event header, using the
> > ext. event ID in that case to encode the event ID. It would leave us
> > 29 IDs available (0-28) and would not require any 3-TSC-read algorithm
> > (which could cause an infinite loop) to deal with overflow.
> >
> > Mathieu
> >
> > --
> > Mathieu Desnoyers
> > OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68
> >
> 
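
For illustration, here is one way the compact header word quoted above
could be encoded in C. The field widths come from the proposal; the bit
ordering and the helper names are my own guesses:

#include <stdint.h>

#define HDR_TSC_BITS    27
#define HDR_TSC_MASK    ((1u << HDR_TSC_BITS) - 1)
#define HDR_ID_MASK     0x1fu   /* 5-bit event ID */

/* Reserved 5-bit IDs announcing which extension fields follow. */
#define HDR_ID_EXT      31      /* + 16-bit event ID */
#define HDR_ID_EXT_SIZE 30      /* + 16-bit event ID + 16-bit size */
#define HDR_ID_EXT_TSC  29      /* + ext. ID, size and full 64-bit TSC */

/* One possible packing: event ID in the low 5 bits, the 27 low-order
   TSC bits above it. Normal events use IDs 0-28 and fit in this word. */
static inline uint32_t hdr_pack(uint64_t tsc, unsigned int id)
{
    return ((uint32_t)(tsc & HDR_TSC_MASK) << 5) | (id & HDR_ID_MASK);
}

static inline unsigned int hdr_id(uint32_t word)
{
    return word & HDR_ID_MASK;
}

static inline uint32_t hdr_tsc27(uint32_t word)
{
    return word >> 5;
}

A writer that detects the 27-bit overflow would emit ID #29 here and
append the extended event ID, the event size and the full 64-bit TSC
fields listed above, all within the same event header.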

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68



