[ltt-dev] lttng development plan

Thu Feb 19 03:17:45 EST 2009

* Mathieu Desnoyers wrote:
> * Zhaolei (zhaolei at cn.fujitsu.com) wrote:
>> * Mathieu Desnoyers Wrote:
>>> * Zhaolei (zhaolei at cn.fujitsu.com) wrote:
>>>> * Mathieu Desnoyers Wrote:
>>>>>   - Create an in-kernel event filtering module which connects on
>>>>>     LTTng.
>>>>>
>>>>> Well, I think this last item would be a generalization of the filtering
>>>>> module I created for ext4 and jbd2 recently. The one I did provided
>>>>> basic filtering for inode and device number.
>>>> Hello, Mathieu
>>>>
>>>> I read the filter code of the ext4 and jbd2 modules,
>>>> and I have some questions about implementing a generalized filter.
>>>>
>>>>> The idea would be to extend this and to connect it as a filter callback
>>>>> with 
>>>>> ltt_module_register(LTT_FUNCTION_RUN_FILTER, filter_callback,
>>>>>                     THIS_MODULE);
>>>> Currently ltt_module_register can register only one filter_callback.
>>>> But I think that if we make one filter for all types of tracepoints, it
>>>> will be hard to maintain.
>>>>
>>> Not exactly. That would be one filter for all types of markers. And it
>>> would iterate on the marker format string to filter by field name. So I
>>> don't see where the maintenance burden is?
>>>
>> Hello, Mathieu
>>
>> Thanks for your response.
>>
>> Do you mean we need only one filter callback function in all of the LTTng
>> source, and that this callback is used for every event in a trace?
>>
>> And programmers of ltt/probes/* don't need to think about filter functions,
>> as was done for ext4 and jbd2?
>>
> 
> Exactly.
> 
> Mathieu
> 
Hello, Mathieu

I'm planning to make an in-kernel filter for LTTng.
Here is my idea of the functionality and the schedule:

Function:
   Build a configuration directory structure in debugfs.

   The directories are updated dynamically when a module with markers is
   added or removed.

   Each field name is a symlink to __argX, because some fields have no name,
   some names are not valid file names, and some fields share the same name.
   For example:
     Normal: "name1 %d name2 %d" -> __arg1(name1) __arg2(name2)
     No name: "%d %d" -> __arg1() __arg2()
     Not a file name: "* %d" -> __arg1()
     Same name: "name %d name %d" -> __arg1(name) __arg2()
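
The naming rules above can be sketched in userspace C as follows. This is
only an illustration, not LTTng code: parse_marker_format, struct arg_names
and the size limits are hypothetical names, and it assumes that the format
string is a list of space-separated tokens, that a usable file name contains
only letters, digits and '_', and that the first field wins when a name is
repeated. It does not handle "%%".

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define MAX_ARGS 16   /* hypothetical limit on fields per marker */
#define MAX_NAME 32   /* hypothetical limit on a field name's length */

/* names[i] is the symlink name for __arg(i+1), or "" when there is none. */
struct arg_names {
    int count;
    char names[MAX_ARGS][MAX_NAME];
};

/* Assumption: a usable file name has only letters, digits and '_'. */
static int is_valid_name(const char *s, size_t len)
{
    if (len == 0 || len >= MAX_NAME)
        return 0;
    for (size_t i = 0; i < len; i++)
        if (!isalnum((unsigned char)s[i]) && s[i] != '_')
            return 0;
    return 1;
}

/* Has an earlier field already claimed this name? */
static int name_taken(const struct arg_names *out, const char *s, size_t len)
{
    for (int i = 0; i < out->count; i++)
        if (strlen(out->names[i]) == len &&
            memcmp(out->names[i], s, len) == 0)
            return 1;
    return 0;
}

/*
 * Walk the space-separated tokens of fmt. A token starting with '%' is a
 * field; the token right before it, if any, is its candidate name.
 */
static void parse_marker_format(const char *fmt, struct arg_names *out)
{
    const char *prev = NULL;
    size_t prev_len = 0;

    out->count = 0;
    while (*fmt) {
        while (*fmt == ' ')
            fmt++;
        if (!*fmt)
            break;
        const char *tok = fmt;
        while (*fmt && *fmt != ' ')
            fmt++;
        size_t len = (size_t)(fmt - tok);

        if (tok[0] != '%') {
            prev = tok;
            prev_len = len;
            continue;
        }
        if (out->count < MAX_ARGS) {
            char *dst = out->names[out->count];

            dst[0] = '\0';
            if (prev && is_valid_name(prev, prev_len) &&
                !name_taken(out, prev, prev_len)) {
                memcpy(dst, prev, prev_len);
                dst[prev_len] = '\0';
            }
            out->count++;
        }
        prev = NULL;
    }
}
```

With this sketch, "name %d name %d" yields two fields of which only the
first gets the symlink "name", and "* %d" yields one field without a symlink.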

   __arg0 holds the format string, so that the user can read it easily:
   # cat [debugfs]/ltt/filter/allocate_blocks/__arg0
     dev %s block %llu flags %u len %u ino %lu logical %llu goal %llu lleft %llu lright %llu pleft %llu pright %llu

   The directory structure looks like this:
   [debugfs]/ltt/filter
       |-- allocate_blocks
       |   |-- __arg0
       |   |-- __arg1
       |   |-- __arg2
       |   |-- __arg3
       |   |-- __arg4
       |   |-- __arg5
       |   |-- __arg6
       |   |-- __arg7
       |   |-- __arg8
       |   |-- __arg9
       |   |-- __arg10
       |   |-- __arg11
       |   |-- dev -> __arg1
       |   |-- block -> __arg2
       |   |-- flags -> __arg3
       |   |-- len -> __arg4
       |   |-- ino -> __arg5
       |   |-- logical -> __arg6
       |   |-- goal -> __arg7
       |   |-- lleft -> __arg8
       |   |-- lright -> __arg9
       |   |-- pleft -> __arg10
       |   `-- pright -> __arg11
       |-- allocate_inode
       |   |-- __arg0
       |   |-- __arg1
       |   |-- __arg2
       |   |-- __arg3
       |   |-- __arg4
       |   |-- dev -> __arg1
       |   |-- dir -> __arg3
       |   |-- ino -> __arg2
       |   `-- mode -> __arg4
       |-- bio_backmerge
       |   `-- ...
       |-- bio_bounce
       |   `-- ...
       |-- bio_complete
       |   `-- ...
       |-- bio_frontmerge
       |   `-- ...
       `-- ...

   The user configures a filter by writing to the following debugfs files.
   For example:
     echo "*" > [debugfs]/ltt/filter/allocate_blocks/len # enable all len
     echo ">100" > [debugfs]/ltt/filter/allocate_blocks/len # enable len>100
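
As a rough illustration of how such a written string could be turned into a
precomputed check, here is a minimal userspace C sketch. It is not the
planned kernel code: struct field_filter, parse_field_filter and
field_filter_match are hypothetical names, only "*" and ">N" come from the
examples above, and the "<N" and "=N" forms are my own assumption. The
string is parsed once at write time, so the per-event check is a single
comparison.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical operators; only FILTER_ALL and FILTER_GT are in the plan. */
enum filter_op { FILTER_ALL, FILTER_GT, FILTER_LT, FILTER_EQ };

/* Precomputed form of one field's filter. */
struct field_filter {
    enum filter_op op;
    unsigned long long value;
};

/*
 * Parse what the user wrote to the field's file. One trailing newline is
 * accepted because "echo" appends it. Returns 0 on success, -1 on error.
 */
static int parse_field_filter(const char *expr, struct field_filter *f)
{
    size_t len = strlen(expr);
    char *end;

    if (len > 0 && expr[len - 1] == '\n')
        len--;
    if (len == 1 && expr[0] == '*') {
        f->op = FILTER_ALL;
        f->value = 0;
        return 0;
    }
    if (len < 2)
        return -1;
    switch (expr[0]) {
    case '>': f->op = FILTER_GT; break;
    case '<': f->op = FILTER_LT; break;
    case '=': f->op = FILTER_EQ; break;
    default:  return -1;
    }
    /* Require a digit so that strtoull cannot skip spaces or a sign. */
    if (!isdigit((unsigned char)expr[1]))
        return -1;
    f->value = strtoull(expr + 1, &end, 10);
    if (end != expr + len)
        return -1;
    return 0;
}

/* Per-event check: returns 1 when the event should be kept. */
static int field_filter_match(const struct field_filter *f,
                              unsigned long long v)
{
    switch (f->op) {
    case FILTER_ALL: return 1;
    case FILTER_GT:  return v > f->value;
    case FILTER_LT:  return v < f->value;
    case FILTER_EQ:  return v == f->value;
    }
    return 0;
}
```

For instance, after parsing ">100" a len of 101 is kept and a len of 100 is
dropped, while "*" keeps every value.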

Schedule:
  1: Get the basic functionality working
  2: Add support for per-trace filters
     Change the directory structure from:
     [debugfs]/ltt/filter/[marker_name]/[arg_name]
     to
     [debugfs]/ltt/filter/[trace]/[marker_name]/[arg_name]
     Add and use a callback for when a trace is added or deleted.
     Update the filter control directory when a trace is added or deleted.
  3: Optimize for speed (use hlist, RCU, ...)
  4: (Maybe) Support echo "(>0 && <10) || (>50 && <100)" > len
     and echo "eth*" > name

Do you have any suggestions?

B.R.
Zhaolei

>> B.R.
>> Zhaolei
>>
>>>> Maybe it is better to make a separate filter for each type of event, for
>>>> example: ltt-ext4-filter for ext4, ltt-jbd2-filter for jbd2, ...
>>>>
>>>> We can either add the filter code to ltt/probes/XXX_trace.c and call
>>>> ltt_module_register in module_init, or write the filter code in a
>>>> separate module. Either way is OK, but integrating the filter code into
>>>> ltt/probes/XXX_trace.c seems more readable.
>>>>
>>>> Another thing to consider is which filters to call for an event.
>>>> 1: Call every filter for each event.
>>>>    If one filter says "no", the event is ignored.
>>>>    It is simple to write, and each filter can do anything (flexible).
>>>>    But this way is inefficient because it needs more CPU cycles.
>>>> 2: Call only the filter for the current event type.
>>>>    A filter registers an event type in ltt_module_register, and is used
>>>>    only to process that type.
>>>> I think 2 is better because it uses less CPU.
>>>>
>>> Yes, filters definitely have to be called only for their own event ID
>>> for performance reasons.
>>>
>>>>> This module would be called by ltt_vtrace and _ltt_specialized_trace
>>>>> with this test :
>>>>>
>>>>>                 if (unlikely(!ltt_run_filter(trace, eID)))
>>>>>                         continue;
>>>>>
>>>>> for each active trace.
>>>> I think calling the filter in the tracepoint's callback, as
>>>> probe_jbd2_checkpoint does, is more efficient than this way.
>>>> Consider an event that is filtered out (not sent to relay): if we call
>>>> the filter in ltt_vtrace, more unnecessary processing is done before
>>>> ltt_vtrace is reached.
>>>>
>>>> So, for a generalized filter:
>>>> 1: call the filter in ltt_vtrace (_ltt_specialized_trace)
>>>> 2: call the filter in probe_XXX_checkpoint()
>>>> I think 2 is better.
>>>>
>>>> What is your opinion about this?
>>>>
>>> There is a feature of LTTng which would benefit from 1: LTTng can have
>>> multiple trace sessions active at once. Therefore, if one trace session
>>> needs a subset of events and another trace session needs a different set,
>>> then they could each have their own filter structure associated with
>>> them. I think having this level of flexibility is more important than
>>> the few cycles we could save by doing (2). Also, we already have the
>>> tracepoints and markers to deactivate the event source at the kernel
>>> level when we need to have nearly-zero overhead.
>>>
>>> Mathieu
>>>
>>>> B.R.
>>>> Zhaolei
>>>>
>>>>> This filter would receive the trace information and the event ID.
>>>>> We may have to add or change some parameters to this to support
>>>>> filtering by fields. For the filter called from ltt_vtrace, passing :
>>>>>
>>>>> const struct marker *mdata
>>>>> and
>>>>> const char *fmt, va_list *args
>>>>>
>>>>> Should be more than enough to filter generically by
>>>>> - channel name
>>>>> - event name
>>>>> - field name -> typed field data.
>>>>>
>>>>> Filtering should be pre-computed as much as possible and be O(1) when
>>>>> executed. Creating callbacks for each expected data type to filter will
>>>>> probably be required. A technique similar to what we have done in lttv
>>>>> filter.c should be considered.
>>>>>
>>>>> For specialized probes, it might be more difficult to do this
>>>>> generically, because _ltt_specialized_trace has no knowledge of the
>>>>> event fields. I guess we would have to create "specialized" filtering
>>>>> callbacks for those custom trace points. It's their nature anyway.
>>>>>
>>>>> Please ask if anything is unclear. Comments/ideas are welcome.
>>>>>
>>>>> Mathieu
>>>>>
>>>>
>>
>