[ltt-dev] lttng development plan

Tue Mar 3 14:25:32 EST 2009

* Zhaolei (zhaolei at cn.fujitsu.com) wrote:
> Hello, Mathieu
> 
> Sorry to trouble you, but what do you think of my design for ltt-filter?
> If there is no objection, I'd like to start on it.
> 
> B.R.
> Zhaolei
> 
> * Zhaolei Wrote:
> > * Mathieu Desnoyers wrote:
> >> * Zhaolei (zhaolei at cn.fujitsu.com) wrote:
> >>> * Mathieu Desnoyers Wrote:
> >>>> * Zhaolei (zhaolei at cn.fujitsu.com) wrote:
> >>>>> * Mathieu Desnoyers Wrote:
> >>>>>>   - Create an in-kernel event filtering module which connects on
> >>>>>>     LTTng.
> >>>>>>
> >>>>>> Well, I think this last item would be a generalization of the filtering
> >>>>>> module I created for ext4 and jbd2 recently. The one I did provided
> >>>>>> basic filtering for inode and device number.
> >>>>> Hello, Mathieu
> >>>>>
> >>>>> I read the filter code of the ext4 and jbd2 modules,
> >>>>> and I have some questions about implementing a generalized filter.
> >>>>>
> >>>>>> The idea would be to extend this and to connect it as a filter callback
> >>>>>> with 
> >>>>>> ltt_module_register(LTT_FUNCTION_RUN_FILTER, filter_callback,
> >>>>>>                     THIS_MODULE);
> >>>>> Currently ltt_module_register can register only one filter_callback.
> >>>>> But I think a single filter covering all types of tracepoints would be
> >>>>> hard to maintain.
> >>>>>
> >>>> Not exactly. That would be one filter for all types of markers. And it
> >>>> would iterate on the marker format string to filter by field name. So I
> >>>> don't see where the maintenance burden is?
> >>>>
> >>> Hello, Mathieu
> >>>
> >>> Thanks for your response.
> >>>
> >>> Do you mean we need only one filter callback function in the whole LTTng
> >>> source, and that this callback is used for every event in a trace?
> >>>
> >>> And the programmers of ltt/probes/* don't need to write filter functions
> >>> as was done for ext4 and jbd2?
> >>>
> >> Exactly.
> >>
> >> Mathieu
> >>
> > Hello, Mathieu
> > 
> > I'm planning to write an in-kernel filter for LTTng.
> > Here is my outline of the functionality and schedule:
> > 
> > Function:
> >    Build a configuration directory structure in debugfs.
> > 
> >    Directories are updated dynamically when a module with markers is
> >    added or removed.
> > 
> >    Each data name is a symlink to __argX, because some data has no name,
> >    some names are not valid file names, and some names are duplicated, e.g.:
> >      Normal: "name1 %d name2 %d" -> __arg1(name1) __arg2(name2)
> >      No name: "%d %d" -> __arg1() __arg2()
> >      Not a file name: "* %d" -> __arg1()
> >      Same name: "name %d name %d" -> __arg1(name) __arg2()
> >      
> >    __arg0 holds the format string, to make it easy for the user to read:
> >    # cat [debugfs]/ltt/filter/allocate_blocks/__arg0
> >      dev %s block %llu flags %u len %u ino %lu logical %llu goal %llu lleft %llu lright %llu pleft %llu pright %llu
> > 
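
The naming rules above can be sketched in plain userspace C. This is only
an illustration, not LTTng code: parse_marker_format(), struct arg_names,
and the rule that a usable file name consists of letters, digits and
underscores are all assumptions made for the example.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

#define MAX_ARGS 16
#define MAX_NAME 32

/* Hypothetical result type: name[i] is the symlink name for __arg(i+1),
 * or "" when the argument gets no symlink. */
struct arg_names {
	int count;                      /* number of %-conversions found */
	char name[MAX_ARGS][MAX_NAME];
};

/* Assumption: a token can be a file name if it is non-empty and made
 * only of letters, digits and underscores. */
static int is_valid_name(const char *s, size_t len)
{
	size_t i;

	if (len == 0 || len >= MAX_NAME)
		return 0;
	for (i = 0; i < len; i++)
		if (!isalnum((unsigned char)s[i]) && s[i] != '_')
			return 0;
	return 1;
}

/* True if an earlier argument already owns this name. */
static int name_taken(const struct arg_names *out, const char *s, size_t len)
{
	int i;

	for (i = 0; i < out->count; i++)
		if (strlen(out->name[i]) == len &&
		    memcmp(out->name[i], s, len) == 0)
			return 1;
	return 0;
}

/* Walk the space-separated tokens of a marker format string. The token
 * right before each %-conversion is the candidate name for that argument. */
static void parse_marker_format(const char *fmt, struct arg_names *out)
{
	const char *prev = NULL;        /* last non-conversion token */
	size_t prev_len = 0;

	assert(fmt && out);
	memset(out, 0, sizeof(*out));
	while (*fmt) {
		const char *tok;
		size_t len;

		while (*fmt == ' ')
			fmt++;
		if (!*fmt)
			break;
		tok = fmt;
		while (*fmt && *fmt != ' ')
			fmt++;
		len = (size_t)(fmt - tok);

		if (tok[0] != '%') {
			prev = tok;
			prev_len = len;
			continue;
		}
		if (out->count == MAX_ARGS)
			break;
		/* Keep the name only if it is valid and not yet used. */
		if (prev && is_valid_name(prev, prev_len) &&
		    !name_taken(out, prev, prev_len))
			memcpy(out->name[out->count], prev, prev_len);
		out->count++;
		prev = NULL;
		prev_len = 0;
	}
}
```

With this sketch, "name %d name %d" yields two arguments where only the
first one keeps the name, matching the "Same name" case.
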
> >    The directory structure looks like this:
> >    [debugfs]/ltt/filter
> >        |-- allocate_blocks
> >        |   |-- __arg0
> >        |   |-- __arg1
> >        |   |-- __arg2
> >        |   |-- __arg3
> >        |   |-- __arg4
> >        |   |-- __arg5
> >        |   |-- __arg6
> >        |   |-- __arg7
> >        |   |-- __arg8
> >        |   |-- __arg9
> >        |   |-- __arg10
> >        |   |-- __arg11
> >        |   |-- dev -> __arg1
> >        |   |-- block -> __arg2
> >        |   |-- flags -> __arg3
> >        |   |-- len -> __arg4
> >        |   |-- ino -> __arg5
> >        |   |-- logical -> __arg6
> >        |   |-- goal -> __arg7
> >        |   |-- lleft -> __arg8
> >        |   |-- lright -> __arg9
> >        |   |-- pleft -> __arg10
> >        |   `-- pright -> __arg11
> >        |-- allocate_inode
> >        |   |-- __arg0
> >        |   |-- __arg1
> >        |   |-- __arg2
> >        |   |-- __arg3
> >        |   |-- __arg4
> >        |   |-- dev -> __arg1
> >        |   |-- dir -> __arg3
> >        |   |-- ino -> __arg2
> >        |   `-- mode -> __arg4
> >        |-- bio_backmerge
> >        |   `-- ...
> >        |-- bio_bounce
> >        |   `-- ...
> >        |-- bio_complete
> >        |   `-- ...
> >        |-- bio_frontmerge
> >        |   `-- ...
> >        `-- ...
> > 
> >    The user configures a filter by writing to the following debugfs files:
> >    Ex:
> >      echo "*" > [debugfs]/ltt/filter/allocate_blocks/len # enable all len
> >      echo ">100" > [debugfs]/ltt/filter/allocate_blocks/len # enable len>100
> > 
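
A minimal sketch of how the strings written above might be turned into a
precomputed predicate, in userspace C. The names predicate_parse() and
predicate_match() are hypothetical, and support for "<" is an assumption
borrowed from item 4 of the schedule; only "*" and ">100" come from the
proposal itself.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

enum pred_op { PRED_ALL, PRED_GT, PRED_LT };

/* Hypothetical parsed form of one filter expression. */
struct predicate {
	enum pred_op op;
	unsigned long long value;
};

/* Parse "*", ">N" or "<N". A single trailing newline is accepted because
 * echo appends one. Returns 0 on success, -1 if the string is not understood. */
static int predicate_parse(const char *s, struct predicate *p)
{
	char *end;
	unsigned long long v;

	assert(s && p);
	if (s[0] == '*' && (s[1] == '\0' || (s[1] == '\n' && s[2] == '\0'))) {
		p->op = PRED_ALL;
		p->value = 0;
		return 0;
	}
	/* Require a digit right after the operator so that strtoull()
	 * cannot silently accept whitespace or a sign. */
	if ((s[0] != '>' && s[0] != '<') || !isdigit((unsigned char)s[1]))
		return -1;
	errno = 0;
	v = strtoull(s + 1, &end, 10);
	if (errno != 0)
		return -1;
	if (*end == '\n')
		end++;
	if (*end != '\0')
		return -1;
	p->op = (s[0] == '>') ? PRED_GT : PRED_LT;
	p->value = v;
	return 0;
}

/* Evaluate the predicate against one field value; nonzero means "keep". */
static int predicate_match(const struct predicate *p, unsigned long long field)
{
	switch (p->op) {
	case PRED_ALL:
		return 1;
	case PRED_GT:
		return field > p->value;
	case PRED_LT:
		return field < p->value;
	}
	return 0;
}
```

All parsing happens once, at write time, so the per-event check is a
single comparison.
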
> > Schedule:
> >   1: Get the basic functionality working
> >   2: Add support for per-trace filters
> >      Change the directory structure from:
> >      [debugfs]/ltt/filter/[marker_name]/[arg_name]
> >      to
> >      [debugfs]/ltt/filter/[trace]/[marker_name]/[arg_name]
> >      Add and use callbacks for trace addition/deletion.
> >      Update the filter control directory when a trace is added/deleted.
> >   3: Optimize for speed (use hlist, RCU, ...)
> >   4: (Maybe) Support echo "(>0 && <10) || (>50 && <100)" > len
> >      and echo "eth*" > name
> > 
> > Do you have any suggestions?
> > 

This proposal sounds great. I look forward to seeing the implementation. :)
The patches from Lai I am currently merging (making marker id permanent
for the whole trace session) will be useful to you when you try to
express your filter data structures.

Mathieu

> > B.R.
> > Zhaolei
> > 
> >>> B.R.
> >>> Zhaolei
> >>>
> >>>>> Maybe it is better to write one filter per type of event, e.g.:
> >>>>> ltt-ext4-filter for ext4, ltt-jbd2-filter for jbd2, ...
> >>>>>
> >>>>> We could either add the filter code to ltt/probes/XXX_trace.c and call
> >>>>> ltt_module_register in module_init, or write the filter code as a
> >>>>> standalone module. Either way works, but integrating the filter code into
> >>>>> ltt/probes/XXX_trace.c seems more readable.
> >>>>>
> >>>>> Another thing to consider is which filters to call for an event.
> >>>>> 1: Call every filter for each event.
> >>>>>    If any filter says "no", the event is ignored.
> >>>>>    It is simple to write, and each filter can do anything (flexible).
> >>>>>    But this way is inefficient because it needs more CPU cycles.
> >>>>> 2: Call only the filter for the current event type.
> >>>>>    A filter registers an event type in ltt_module_register, and is only
> >>>>>    used to process that type.
> >>>>> I think 2 is better because it uses less CPU.
> >>>>>
> >>>> Yes, filters definitely have to be called only for their own event ID
> >>>> for performance reasons.
> >>>>
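
As a rough userspace sketch of option 2, a filter can be looked up
directly by event ID so that an event never pays for filters that belong
to other event types. Everything here (filter_table,
register_event_filter(), run_event_filter(), the example filter) is
hypothetical and is not the ltt_module_register interface.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_EVENT_ID 64

/* Hypothetical filter signature: nonzero means "record this event". */
typedef int (*event_filter_fn)(const void *event_data);

/* One slot per event ID; NULL means no filter is installed. */
static event_filter_fn filter_table[MAX_EVENT_ID];

/* Install a filter for exactly one event ID.
 * Returns 0 on success, -1 if the ID is out of range or already taken. */
static int register_event_filter(unsigned int eid, event_filter_fn fn)
{
	assert(fn != NULL);
	if (eid >= MAX_EVENT_ID || filter_table[eid] != NULL)
		return -1;
	filter_table[eid] = fn;
	return 0;
}

/* O(1) dispatch: only the filter registered for this ID is called.
 * Events without a filter (or with an unknown ID) pass through. */
static int run_event_filter(unsigned int eid, const void *event_data)
{
	event_filter_fn fn;

	if (eid >= MAX_EVENT_ID)
		return 1;
	fn = filter_table[eid];
	return fn ? fn(event_data) : 1;
}

/* Example filter: keep events whose unsigned int payload exceeds 100. */
static int len_over_100(const void *event_data)
{
	return *(const unsigned int *)event_data > 100;
}
```

The cost per event is one bounds check, one table load and at most one
indirect call, regardless of how many filters are registered.
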
> >>>>>> This module would be called by ltt_vtrace and _ltt_specialized_trace
> >>>>>> with this test :
> >>>>>>
> >>>>>>                 if (unlikely(!ltt_run_filter(trace, eID)))
> >>>>>>                         continue;
> >>>>>>
> >>>>>> for each active trace.
> >>>>> I think calling the filter in the tracepoint's callback, as in
> >>>>> probe_jbd2_checkpoint, is more efficient than this.
> >>>>> For an event that is filtered out (not sent to relay), more unnecessary
> >>>>> processing is done before ltt_vtrace if we call the filter in ltt_vtrace.
> >>>>>
> >>>>> So, for a generalized filter:
> >>>>> 1: call the filter in ltt_vtrace (_ltt_specialized_trace)
> >>>>> 2: call the filter in probe_XXX_checkpoint()
> >>>>> I think 2 is better.
> >>>>>
> >>>>> What is your opinion on this?
> >>>>>
> >>>> There is a feature of LTTng which would benefit from (1): LTTng can have
> >>>> multiple trace sessions active at once. Therefore, if one trace session
> >>>> needs a subset of events and another trace session needs a different set,
> >>>> then they could each have their own filter structure associated with
> >>>> them. I think having this level of flexibility is more important than
> >>>> the few cycles we could save by doing (2). Also, we already have the
> >>>> tracepoints and markers to deactivate the event source at the kernel
> >>>> level when we need to have nearly-zero overhead.
> >>>>
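
To illustrate the per-session point, here is a small userspace model in
which each trace session carries its own filter state and the event is
offered to every session in turn. struct trace_session,
session_run_filter() and emit_event() are made-up names for this sketch;
the real per-trace check is the ltt_run_filter(trace, eID) test quoted
elsewhere in this thread.

```c
#include <assert.h>
#include <stddef.h>

#define NR_EVENTS 8

/* Hypothetical trace session with its own filter state. */
struct trace_session {
	const char *name;
	unsigned char enabled[NR_EVENTS]; /* 1 = this session wants the ID */
	unsigned int recorded;            /* events written so far */
};

/* Per-session filter check; nonzero means the session accepts the event. */
static int session_run_filter(const struct trace_session *t, unsigned int eid)
{
	return eid < NR_EVENTS && t->enabled[eid];
}

/* Offer one event to every session and skip those whose filter rejects
 * it. Returns how many sessions recorded the event. */
static unsigned int emit_event(struct trace_session *traces, size_t n,
			       unsigned int eid)
{
	unsigned int hits = 0;
	size_t i;

	assert(traces != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (!session_run_filter(&traces[i], eid))
			continue;
		traces[i].recorded++;
		hits++;
	}
	return hits;
}
```

Because the check sits inside the per-session loop, the same event can be
kept by one session and dropped by another, which a filter placed in the
probe callback could not express.
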
> >>>> Mathieu
> >>>>
> >>>>> B.R.
> >>>>> Zhaolei
> >>>>>
> >>>>>> This filter would receive the trace information and the event ID.
> >>>>>> We may have to add or change some parameters to this to support
> >>>>>> filtering by fields. For the filter called from ltt_vtrace, passing :
> >>>>>>
> >>>>>> const struct marker *mdata
> >>>>>> and
> >>>>>> const char *fmt, va_list *args
> >>>>>>
> >>>>>> Should be more than enough to filter generically by
> >>>>>> - channel name
> >>>>>> - event name
> >>>>>> - field name -> typed field data.
> >>>>>>
> >>>>>> Filtering should be pre-computed as much as possible and be O(1) when
> >>>>>> executed. Creating callbacks for each expected data type to filter will
> >>>>>> probably be required. A technique similar to what we have done in lttv
> >>>>>> filter.c should be considered.
> >>>>>>
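
One way to picture "a callback per data type, chosen ahead of time" is
the userspace sketch below. The comparison function is selected once
from the conversion specifier, so the run-time path is a single indirect
call that pulls the value from the va_list with the right type. All names
(struct field_filter, select_match_fn(), run_filter()) are hypothetical,
and for brevity the sketch only filters on the first argument.

```c
#include <assert.h>
#include <stdarg.h>
#include <string.h>

struct field_filter;
typedef int (*match_fn)(const struct field_filter *f, va_list *args);

/* Hypothetical precomputed filter for one field. */
struct field_filter {
	match_fn match;          /* chosen once, at configuration time */
	unsigned long long min;  /* numeric fields: keep values > min */
	const char *str;         /* string fields: keep exact matches */
};

/* One callback per expected data type. */
static int match_uint(const struct field_filter *f, va_list *args)
{
	return va_arg(*args, unsigned int) > f->min;
}

static int match_ulong(const struct field_filter *f, va_list *args)
{
	return va_arg(*args, unsigned long) > f->min;
}

static int match_ullong(const struct field_filter *f, va_list *args)
{
	return va_arg(*args, unsigned long long) > f->min;
}

static int match_str(const struct field_filter *f, va_list *args)
{
	return strcmp(va_arg(*args, const char *), f->str) == 0;
}

/* Configuration-time step: map a conversion specifier to its callback.
 * Returns NULL for types this sketch does not handle. */
static match_fn select_match_fn(const char *conv)
{
	if (strcmp(conv, "%u") == 0)
		return match_uint;
	if (strcmp(conv, "%lu") == 0)
		return match_ulong;
	if (strcmp(conv, "%llu") == 0)
		return match_ullong;
	if (strcmp(conv, "%s") == 0)
		return match_str;
	return NULL;
}

/* Run-time step: no parsing, just one call through the stored pointer. */
static int run_filter(const struct field_filter *f, ...)
{
	va_list args;
	int keep;

	assert(f->match != NULL);
	va_start(args, f);
	keep = f->match(f, &args);
	va_end(args);
	return keep;
}
```

A full implementation would also have to skip over the arguments that
precede the filtered field, each with its correct type.
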
> >>>>>> For specialized probes, it might be more difficult to do this
> >>>>>> generically, because _ltt_specialized_trace has no knowledge of the
> >>>>>> event fields. I guess we would have to create "specialized" filtering
> >>>>>> callbacks for those custom trace points. It's their nature anyway.
> >>>>>>
> >>>>>> Please ask if anything is unclear. Comments/ideas are welcome.
> >>>>>>
> >>>>>> Mathieu
> >>>>>>
> >>>>> _______________________________________________
> >>>>> ltt-dev mailing list
> >>>>> ltt-dev at lists.casi.polymtl.ca
> >>>>> http://lists.casi.polymtl.ca/cgi-bin/mailman/listinfo/ltt-dev
> >>>>>

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68