[ltt-dev] lttng development plan

Mathieu Desnoyers compudj at krystal.dyndns.org
Tue Feb 3 00:11:44 EST 2009


* Gui Jianfeng (guijianfeng at cn.fujitsu.com) wrote:
> Mathieu Desnoyers wrote:
> > Hi Gui,
> ...
> > 
> > Other nice-to-have, but not a priority :
> > 
> >   - Put back support for kernel and userspace stack dump so it can be
> >     connected to any given tracepoint.
> >   - Linux ABI for fast userspace tracing.
> >     - Then, add NPTL instrumentation (mutexes, pthreads).
> >   - Create an in-kernel event filtering module which connects on
> >     LTTng.
> 
> Hi Mathieu,
> 
> Could you elaborate these three todos a bit?
> 

Hi Gui,

Yes, I'll be happy to.

  - Put back support for kernel and userspace stack dump so it can be
    connected to any given tracepoint.

Kernel stack dump :

It was available with older LTTng versions for a few architectures. The
idea is to walk the stack like arch/x86/kernel/stacktrace.c does, but to
output the information as trace events rather than sending it through
printk.

A good starting point is the current arch/x86/kernel/stacktrace.c
implementation (and each architecture's implementation). We should
probably hook into this code which is now much more generic than it
previously was. We should probably add the "sequence" type back into
LTTng to support events with the following field layout :

{
  u32 size;
  data_type data[size];
}

Basically, it's a variable-length array which starts by indicating its
size. We would have to reserve a format-string identifier which could
look like #a#4u#p%d%p (#a would stand for variable-length sequence). This
example would expect an integer (the number of elements) and a pointer
from C, and write :

{
  u32 size; (expressed by #4u)
  void *data[size]; (expressed by #p)
}

in the trace.

The parameters would mean :

#a : sequence
#4u : field expressing the number of elements is a u32
#p : elements are of type void *
%d : number of elements
%p : pointer to an array containing the data to record
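
To make the layout concrete, here is a minimal userspace sketch of what
writing such a sequence could look like. seq_write_ptrs() is a
hypothetical helper, not an existing LTTng function, and it ignores
alignment padding between the size field and the elements, which the
real serializer would have to handle.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not part of LTTng): serialize a sequence of
 * pointers as { u32 size; void *data[size]; } into buf, with no
 * alignment padding between the two fields.
 * Returns the number of bytes written, or 0 if buf is too small.
 */
static size_t seq_write_ptrs(unsigned char *buf, size_t buf_len,
                             uint32_t nr_elems, void *const *elems)
{
    size_t data_len = (size_t)nr_elems * sizeof(void *);
    size_t need = sizeof(nr_elems) + data_len;

    if (need > buf_len)
        return 0;

    /* First the element count (the #4u part)... */
    memcpy(buf, &nr_elems, sizeof(nr_elems));
    /* ...then the elements themselves (the #p part). */
    if (data_len > 0)
        memcpy(buf + sizeof(nr_elems), elems, data_len);
    return need;
}
```

Here the caller already holds the pointers in an array, which is
exactly the intermediate copy discussed next.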

Note that this basic type would not completely fulfill what we need,
because it would imply copying all the input data (the pointers on the
stack) to an intermediate buffer before the tracer code is called. This
kind of supplementary copy is just unacceptable for performance.

Given that we don't know in advance how many pointers we will have to
save, we have to calculate the length we need with ltt_get_data_size()
and then copy the data in the ltt_write_event_data() call. Note that we
have to make this solid and deal with concurrent source data
modification by padding missing data with zeroes and by making sure we
never go over the allocated event space.

The way I figure we could use the "sequence" field without requiring an
intermediate copy of the data is to create a variation of the sequence
which we could call "callback sequence". Basically, in the trace data,
the layout is exactly the same as the sequence. However, the C
parameters are not the same. It would receive a function pointer and a
pointer to pass as parameter to this callback. Those would be called
from both ltt_get_data_size() (to get the sequence size) and from
ltt_write_event_data() to copy the data to the trace buffer. This could
be represented as  #A#4u#p%p%p for instance. Those would respectively
mean :

#A : callback sequence
#4u : field expressing the number of elements is a u32
#p : elements are of type void *
%p : callback function pointer
%p : data passed as parameter to the function pointer

This could all be added to ltt-serialize.c. (note that my large comment
before parse_trace_type() should be updated according to what I just
wrote here)
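
As a rough illustration of the callback sequence idea, the userspace
sketch below calls the same callback twice: once to get the element
count, once to copy the data straight into the event space. All names
(seq_cb_t, cb_seq_get_size, cb_seq_write, fake_stack_cb) are made up for
this example and are not existing ltt-serialize.c functions. It pads
with zeroes if the source shrank between the two calls and never writes
more than the reserved number of elements.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical callback type: with dst == NULL, return the number of
 * elements currently available; otherwise copy at most max_elems
 * elements to dst and return how many were copied.
 */
typedef size_t (*seq_cb_t)(void *priv, unsigned char *dst, size_t max_elems);

/* Pass 1: compute the event payload size and remember the count. */
static size_t cb_seq_get_size(seq_cb_t cb, void *priv, uint32_t *nr_elems)
{
    *nr_elems = (uint32_t)cb(priv, NULL, 0);
    return sizeof(uint32_t) + (size_t)*nr_elems * sizeof(void *);
}

/* Pass 2: write { u32 size; void *data[size]; } into reserved space. */
static void cb_seq_write(unsigned char *out, seq_cb_t cb, void *priv,
                         uint32_t nr_elems)
{
    unsigned char *data = out + sizeof(nr_elems);
    size_t copied;

    memcpy(out, &nr_elems, sizeof(nr_elems));
    copied = cb(priv, data, nr_elems);
    if (copied > nr_elems)      /* defensive: never trust the callback */
        copied = nr_elems;
    /* Source shrank since pass 1: pad the missing elements with zeroes. */
    memset(data + copied * sizeof(void *), 0,
           (nr_elems - copied) * sizeof(void *));
}

/* Example data source standing in for a stack walker. */
struct fake_stack {
    void *const *frames;
    size_t nr;
};

static size_t fake_stack_cb(void *priv, unsigned char *dst, size_t max_elems)
{
    const struct fake_stack *s = priv;
    size_t n = s->nr;

    if (!dst)
        return n;
    if (n > max_elems)          /* source grew: truncate to reserved space */
        n = max_elems;
    if (n > 0)
        memcpy(dst, s->frames, n * sizeof(void *));
    return n;
}
```

A real stack-walking callback would walk the frames in both passes
instead of reading from a prepared array, which is what avoids the
intermediate buffer.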


Userspace stack dump :

It was also available with older LTTng versions, and is also
architecture-specific. It provides the basic ability to save as trace
events what looks like instruction pointers on the userspace stack. It's
very useful when done at system call entry to find out which call stack
is issuing a given syscall. It used the "text addresses" of the process
to guess what would be a function pointer or not. It did not support
listing instruction pointers within libraries either. Multithreading had
to be taken care of appropriately, since thread stacks are close to one
another (and we don't know when we skip from one stack to the next). I
had a couple of heuristics for this. Supporting stacks both with and
without frame pointers would also be good.
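
The "text addresses" heuristic can be sketched in plain C as below. This
is only an illustration under simplifying assumptions: a single text
range (so no libraries, like the old implementation), a stack already
available as an array of words, and no handling of thread stack
boundaries. struct text_range and scan_stack_for_text() are hypothetical
names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed description of the process text segment: [start, end). */
struct text_range {
    uintptr_t start;
    uintptr_t end;
};

/*
 * Scan nr_words stack words and keep those that fall within the text
 * range, i.e. those that look like instruction pointers.
 * Returns the number of candidates stored in out (at most max_out).
 */
static size_t scan_stack_for_text(const uintptr_t *stack, size_t nr_words,
                                  struct text_range text,
                                  uintptr_t *out, size_t max_out)
{
    size_t i, found = 0;

    for (i = 0; i < nr_words && found < max_out; i++) {
        if (stack[i] >= text.start && stack[i] < text.end)
            out[found++] = stack[i];
    }
    return found;
}
```

Such a scan can report false positives (any stack word that happens to
fall within the text range), which is acceptable for a best-effort dump.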

We will also need the sequence type to record this type of event.


  - Linux ABI for fast userspace tracing.
    - Then, add NPTL instrumentation (mutexes, pthreads).

We are currently proposing a design document for this. Pierre-Marc
Fournier will likely start working on it this winter. Comments,
feedback and help are welcome. Please see :

http://www.lttng.org/svn/trunk/lttv/doc/developer/ust.html

  - Create an in-kernel event filtering module which connects on
    LTTng.

Well, I think this last item would be a generalization of the filtering
module I created for ext4 and jbd2 recently. The one I did provided
basic filtering for inode and device number.

The idea would be to extend this and to connect it as a filter callback
with 
ltt_module_register(LTT_FUNCTION_RUN_FILTER, filter_callback,
                    THIS_MODULE);

This module would be called by ltt_vtrace and _ltt_specialized_trace
with this test :

                if (unlikely(!ltt_run_filter(trace, eID)))
                        continue;

for each active trace.

This filter would receive the trace information and the event ID.
We may have to add or change some parameters to this to support
filtering by fields. For the filter called from ltt_vtrace, passing :

const struct marker *mdata
and
const char *fmt, va_list *args

Should be more than enough to filter generically by
- channel name
- event name
- field name -> typed field data.

Filtering should be pre-computed as much as possible and be O(1) when
executed. Creating callbacks for each expected data type to filter will
probably be required. A technique similar to what we have done in lttv
filter.c should be considered.
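
As one possible shape for the pre-computed part, here is a small
userspace sketch of a per-event-ID filter: the filter is set up ahead of
time as a bitmap, so the test done at trace time is a single O(1)
lookup. struct event_filter, FILTER_MAX_EVENTS and the filter_*()
functions are hypothetical and only cover filtering by event ID, not by
field.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed upper bound on event IDs, for this sketch only. */
#define FILTER_MAX_EVENTS 1024u

struct event_filter {
    uint64_t allowed[FILTER_MAX_EVENTS / 64];   /* one bit per event ID */
};

/* Start with every event filtered out. */
static void filter_init(struct event_filter *f)
{
    memset(f, 0, sizeof(*f));
}

/* Pre-computation step: mark an event ID as accepted. */
static int filter_allow(struct event_filter *f, uint16_t eid)
{
    if (eid >= FILTER_MAX_EVENTS)
        return -1;
    f->allowed[eid / 64] |= (uint64_t)1 << (eid % 64);
    return 0;
}

/* Trace-time test: returns 1 to record the event, 0 to skip it. */
static int filter_run(const struct event_filter *f, uint16_t eid)
{
    if (eid >= FILTER_MAX_EVENTS)
        return 0;
    return (int)((f->allowed[eid / 64] >> (eid % 64)) & 1);
}
```

Filtering by field would need more than this, typically one comparison
callback per data type selected when the filter is installed.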

For specialized probes, it might be more difficult to do this
generically, because _ltt_specialized_trace has no knowledge of the
event fields. I guess we would have to create "specialized" filtering
callbacks for those custom trace points. It's their nature anyway.

Please ask if anything is unclear. Comments/ideas are welcome.

Mathieu

> > 
> > Please ask if you need more information on specific items.
> > 
> > Best regards, and many thanks to Fujitsu for the good work,
> > 
> > Mathieu
> > 
> > 
> > * Gui Jianfeng (guijianfeng at cn.fujitsu.com) wrote:
> >> Hi Mathieu,
> >>
> >> I'd like to know whether you have a plan or roadmap 
> >> for lttng's further development.
> >> If you have one, would you share it?
> >>
> >> -- 
> >> Regards
> >> Gui Jianfeng
> >>
> > 
> 
> -- 
> Regards
> Gui Jianfeng
> 
> 
> _______________________________________________
> ltt-dev mailing list
> ltt-dev at lists.casi.polymtl.ca
> http://lists.casi.polymtl.ca/cgi-bin/mailman/listinfo/ltt-dev
> 

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68



