[ltt-dev] lttng development plan

Zhaolei zhaolei at cn.fujitsu.com
Mon Mar 2 02:01:25 EST 2009


Hello, Mathieu

Sorry to trouble you. What do you think of my ltt-filter design?
If there is no objection, I'd like to start on it.

B.R.
Zhaolei

* Zhaolei Wrote:
> * Mathieu Desnoyers wrote:
>> * Zhaolei (zhaolei at cn.fujitsu.com) wrote:
>>> * Mathieu Desnoyers Wrote:
>>>> * Zhaolei (zhaolei at cn.fujitsu.com) wrote:
>>>>> * Mathieu Desnoyers Wrote:
>>>>>>   - Create an in-kernel event filtering module which connects on
>>>>>>     LTTng.
>>>>>>
>>>>>> Well, I think this last item would be a generalization of the filtering
>>>>>> module I created for ext4 and jbd2 recently. The one I did provided
>>>>>> basic filtering for inode and device number.
>>>>> Hello, Mathieu
>>>>>
>>>>> I read the filter code of the ext4 and jbd2 modules,
>>>>> and I have some questions about implementing a generalized filter.
>>>>>
>>>>>> The idea would be to extend this and to connect it as a filter callback
>>>>>> with 
>>>>>> ltt_module_register(LTT_FUNCTION_RUN_FILTER, filter_callback,
>>>>>>                     THIS_MODULE);
>>>>> Currently ltt_module_register can register only one filter_callback.
>>>>> But I think a single filter for all types of tracepoints would be hard to
>>>>> maintain.
>>>>>
>>>> Not exactly. That would be one filter for all types of markers. And it
>>>> would iterate on the marker format string to filter by field name. So I
>>>> don't see where the maintenance burden is?
>>>>
>>> Hello, Mathieu
>>>
>>> Thanks for your response.
>>>
>>> Do you mean we need only one filter callback function in the whole LTTng
>>> source, and that this callback is used for every event in a trace?
>>>
>>> And the authors of ltt/probes/* don't need to think about a filter function,
>>> as was done for ext4 and jbd2?
>>>
>> Exactly.
>>
>> Mathieu
>>
> Hello, Mathieu
> 
> I'm planning to make an in-kernel filter for LTTng.
> Here is my picture of its functions and schedule:
> 
> Function:
>    Build a config directory structure in debugfs.
> 
>    The directories are updated dynamically when a module with markers is
>    added or removed.
> 
>    Each data name is a symlink to __argX, because some data have no name,
>    some names are not valid filenames, and some data share the same name.
>    For example:
>      Normal: "name1 %d name2 %d" -> __arg1(name1) __arg2(name2)
>      No name: "%d %d" -> __arg1() __arg2()
>      Not a filename: "* %d" -> __arg1()
>      Same name: "name %d name %d" -> __arg1(name) __arg2()
> 
>    __arg0 is the format string, to make it easy for the user to read:
>    # cat [debugfs]/ltt/filter/allocate_blocks/__arg0
>      dev %s block %llu flags %u len %u ino %lu logical %llu goal %llu lleft %llu lright %llu pleft %llu pright %llu
> 
>    The directory structure looks like this:
>    [debugfs]/ltt/filter
>        |-- allocate_blocks
>        |   |-- __arg0
>        |   |-- __arg1
>        |   |-- __arg2
>        |   |-- __arg3
>        |   |-- __arg4
>        |   |-- __arg5
>        |   |-- __arg6
>        |   |-- __arg7
>        |   |-- __arg8
>        |   |-- __arg9
>        |   |-- __arg10
>        |   |-- __arg11
>        |   |-- dev -> __arg1
>        |   |-- block -> __arg2
>        |   |-- flags -> __arg3
>        |   |-- len -> __arg4
>        |   |-- ino -> __arg5
>        |   |-- logical -> __arg6
>        |   |-- goal -> __arg7
>        |   |-- lleft -> __arg8
>        |   |-- lright -> __arg9
>        |   |-- pleft -> __arg10
>        |   `-- pright -> __arg11
>        |-- allocate_inode
>        |   |-- __arg0
>        |   |-- __arg1
>        |   |-- __arg2
>        |   |-- __arg3
>        |   |-- __arg4
>        |   |-- dev -> __arg1
>        |   |-- dir -> __arg3
>        |   |-- ino -> __arg2
>        |   `-- mode -> __arg4
>        |-- bio_backmerge
>        |   `-- ...
>        |-- bio_bounce
>        |   `-- ...
>        |-- bio_complete
>        |   `-- ...
>        |-- bio_frontmerge
>        |   `-- ...
>        `-- ...
> 
>    The user configures the filter by writing to the debugfs files below.
>    For example:
>      echo "*" > [debugfs]/ltt/filter/allocate_blocks/len # enable all len
>      echo ">100" > [debugfs]/ltt/filter/allocate_blocks/len # enable len>100
> 
> Schedule:
>   1: Make the basic function work
>   2: Add support for per-trace filters
>      Change the directory structure from:
>      [debugfs]/ltt/filter/[marker_name]/[arg_name]
>      to
>      [debugfs]/ltt/filter/[trace]/[marker_name]/[arg_name]
>      Add and use a callback when a trace is added/deleted.
>      Update the filter control directory when a trace is added/deleted.
>   3: Optimize for speed (use hlist, RCU, ...)
>   4: (Maybe) Support echo "(>0 && <10) || (>50 && <100)" > len
>      and echo "eth*" > name
> 
> Do you have any suggestions?
> 
> B.R.
> Zhaolei
> 
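A minimal userspace sketch of the naming rule proposed above: walk the marker format string, take the token just before each conversion as the candidate symlink name, and leave the argument unnamed when that token is missing, is not a plain filename, or was already used. The struct, the function names, and the "letters, digits and underscore only" filename test are assumptions for illustration, not LTTng code.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

#define MAX_ARGS 16
#define MAX_NAME 32

/* Hypothetical container: names[i] is the symlink name for __arg(i+1),
 * or "" when the argument gets no symlink. */
struct arg_names {
	int count;
	char names[MAX_ARGS][MAX_NAME];
};

/* Assumption: a usable filename is non-empty and made of [A-Za-z0-9_]. */
static int valid_name(const char *s, size_t len)
{
	if (len == 0 || len >= MAX_NAME)
		return 0;
	for (size_t i = 0; i < len; i++)
		if (!isalnum((unsigned char)s[i]) && s[i] != '_')
			return 0;
	return 1;
}

/* True if an earlier argument already claimed this name. */
static int already_used(const struct arg_names *out, const char *s, size_t len)
{
	for (int i = 0; i < out->count; i++)
		if (strlen(out->names[i]) == len &&
		    strncmp(out->names[i], s, len) == 0)
			return 1;
	return 0;
}

static void parse_marker_format(const char *fmt, struct arg_names *out)
{
	const char *tok = NULL;	/* last token seen before a conversion */
	size_t tok_len = 0;
	const char *p = fmt;

	out->count = 0;
	while (*p && out->count < MAX_ARGS) {
		if (*p == '%' && p[1] == '%') {
			/* literal percent sign: not an argument */
			p += 2;
			tok = NULL;
			tok_len = 0;
		} else if (*p == '%') {
			char *dst = out->names[out->count];

			dst[0] = '\0';
			if (tok && valid_name(tok, tok_len) &&
			    !already_used(out, tok, tok_len)) {
				memcpy(dst, tok, tok_len);
				dst[tok_len] = '\0';
			}
			out->count++;
			tok = NULL;
			tok_len = 0;
			p++;
			/* skip flags, width, precision and length modifiers */
			while (*p && strchr("#0- +.123456789hlLqjzt", *p))
				p++;
			if (*p)
				p++;	/* the conversion character itself */
		} else if (isspace((unsigned char)*p)) {
			p++;
		} else {
			tok = p;
			while (*p && !isspace((unsigned char)*p) && *p != '%')
				p++;
			tok_len = (size_t)(p - tok);
		}
	}
}
```

With this rule, "name %d name %d" yields the name "name" for the first argument and no name for the second, matching the "Same name" case in the proposal.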
>>> B.R.
>>> Zhaolei
>>>
>>>>> Maybe it is better to make a separate filter for each type of event, e.g.
>>>>> ltt-ext4-filter for ext4, ltt-jbd2-filter for jbd2, ...
>>>>>
>>>>> We can either add the filter code to ltt/probes/XXX_trace.c and call
>>>>> ltt_module_register in module_init, or put the filter code in a separate
>>>>> module. Either way works, but integrating the filter code into
>>>>> ltt/probes/XXX_trace.c seems more readable.
>>>>>
>>>>> Another thing to consider is which filters to call for an event.
>>>>> 1: Call every filter for each event.
>>>>>    If one filter says "no", the event is ignored.
>>>>>    It is simple to write, and each filter can do anything (flexible).
>>>>>    But this way is inefficient because it needs more CPU cycles.
>>>>> 2: Call only the filter for the current event type.
>>>>>    A filter registers an event type in ltt_module_register, and is used
>>>>>    only to process that type.
>>>>> I think 2 is better because it uses less CPU.
>>>>>
>>>> Yes, filters definitely have to be called only for their own event ID
>>>> for performance reasons.
>>>>
>>>>>> This module would be called by ltt_vtrace and _ltt_specialized_trace
>>>>>> with this test :
>>>>>>
>>>>>>                 if (unlikely(!ltt_run_filter(trace, eID)))
>>>>>>                         continue;
>>>>>>
>>>>>> for each active trace.
>>>>> I think calling the filter in the tracepoint's callback, as
>>>>> probe_jbd2_checkpoint does, is more efficient than this.
>>>>> For an event that is filtered out (not sent to relay), more unnecessary
>>>>> processing is done before ltt_vtrace if we call the filter in ltt_vtrace.
>>>>>
>>>>> So, for a generalized filter:
>>>>> 1: call the filter in ltt_vtrace (_ltt_specialized_trace)
>>>>> 2: call the filter in probe_XXX_checkpoint()
>>>>> I think 2 is better.
>>>>>
>>>>> What is your opinion on this?
>>>>>
>>>> There is a feature of LTTng which would benefit from (1): LTTng can have
>>>> multiple trace sessions active at once. Therefore, if one trace session
>>>> needs a subset of events and another trace session needs a different set,
>>>> then they could each have their own filter structure associated with
>>>> them. I think having this level of flexibility is more important than
>>>> the few cycles we could save by doing (2). Also, we already have the
>>>> tracepoints and markers to deactivate the event source at the kernel
>>>> level when we need to have nearly-zero overhead.
>>>>
>>>> Mathieu
>>>>
>>>>> B.R.
>>>>> Zhaolei
>>>>>
>>>>>> This filter would receive the trace information and the event ID.
>>>>>> We may have to add or change some parameters to this to support
>>>>>> filtering by fields. For the filter called from ltt_vtrace, passing :
>>>>>>
>>>>>> const struct marker *mdata
>>>>>> and
>>>>>> const char *fmt, va_list *args
>>>>>>
>>>>>> Should be more than enough to filter generically by
>>>>>> - channel name
>>>>>> - event name
>>>>>> - field name -> typed field data.
>>>>>>
>>>>>> Filtering should be pre-computed as much as possible and be O(1) when
>>>>>> executed. Creating callbacks for each expected data type to filter will
>>>>>> probably be required. A technique similar to what we have done in lttv
>>>>>> filter.c should be considered.
>>>>>>
>>>>>> For specialized probes, it might be more difficult to do this
>>>>>> generically, because _ltt_specialized_trace has no knowledge of the
>>>>>> event fields. I guess we would have to create "specialized" filtering
>>>>>> callbacks for those custom trace points. It's their nature anyway.
>>>>>>
>>>>>> Please ask if anything is unclear. Comments/ideas are welcome.
>>>>>>
>>>>>> Mathieu
>>>>>>
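One way to read "pre-computed as much as possible and O(1) when executed" is to parse a rule string once, when it is written, into an operator and a constant, so that the per-event check is a single comparison. The sketch below does this in userspace C for the `*`, `>N` and `<N` forms that appear in the proposal. The type and function names are hypothetical, and it handles unsigned integer fields only.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical compiled form of one per-field rule. */
enum filter_op { OP_ANY, OP_GT, OP_LT };

struct field_rule {
	enum filter_op op;
	unsigned long long value;
};

/* Parse "*", ">N" or "<N" once, at configuration time.
 * Returns 0 on success, -1 if the text is not understood;
 * *rule is left untouched on failure. */
static int compile_rule(const char *text, struct field_rule *rule)
{
	struct field_rule tmp = { OP_ANY, 0 };
	char *end;

	if (strcmp(text, "*") == 0) {
		*rule = tmp;
		return 0;
	}
	if (text[0] == '>')
		tmp.op = OP_GT;
	else if (text[0] == '<')
		tmp.op = OP_LT;
	else
		return -1;
	if (!isdigit((unsigned char)text[1]))
		return -1;
	tmp.value = strtoull(text + 1, &end, 10);
	if (*end != '\0')
		return -1;
	*rule = tmp;
	return 0;
}

/* Constant-time check executed for each event. */
static int rule_matches(const struct field_rule *rule, unsigned long long v)
{
	switch (rule->op) {
	case OP_GT:
		return v > rule->value;
	case OP_LT:
		return v < rule->value;
	default:
		return 1;	/* OP_ANY accepts everything */
	}
}
```

Compound expressions such as "(>0 && <10) || (>50 && <100)" would need a small expression tree instead of a single rule, and are left out here.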
>>>>> _______________________________________________
>>>>> ltt-dev mailing list
>>>>> ltt-dev at lists.casi.polymtl.ca
>>>>> http://lists.casi.polymtl.ca/cgi-bin/mailman/listinfo/ltt-dev
>>>>>
> 
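The per-trace design discussed in this thread (each active trace session owns its filter, and a filter runs only for its own event ID) can be sketched in userspace C as a table of callbacks indexed by event ID, consulted once per active trace. All names here (`sketch_trace`, `sketch_run_filter`, `sketch_deliver`, `MAX_EVENT_ID`) are hypothetical and only mirror the shape of the `ltt_run_filter(trace, eID)` test quoted above. They are not the LTTng implementation.

```c
#include <stddef.h>

#define MAX_EVENT_ID 64

/* Hypothetical per-event filter: returns nonzero to keep the event. */
typedef int (*event_filter_fn)(unsigned long long field_value);

/* Hypothetical trace session with its own filter table. */
struct sketch_trace {
	const char *name;
	event_filter_fn filters[MAX_EVENT_ID];	/* NULL means "accept" */
};

/* O(1) lookup by event ID; only that event's filter is called. */
static int sketch_run_filter(const struct sketch_trace *trace,
			     unsigned int eid, unsigned long long value)
{
	event_filter_fn fn;

	if (eid >= MAX_EVENT_ID)
		return 1;
	fn = trace->filters[eid];
	return fn ? fn(value) : 1;
}

/* Example filter: keep events whose field is greater than 100. */
static int len_over_100(unsigned long long len)
{
	return len > 100;
}

/* Offer one event to every active trace; returns how many record it. */
static int sketch_deliver(const struct sketch_trace *traces, size_t ntraces,
			  unsigned int eid, unsigned long long value)
{
	int recorded = 0;

	for (size_t i = 0; i < ntraces; i++) {
		if (!sketch_run_filter(&traces[i], eid, value))
			continue;	/* this trace filtered the event out */
		recorded++;	/* a real tracer would write the event here */
	}
	return recorded;
}
```

Because each trace carries its own table, one session can drop an event that another session still records, which is the flexibility argued for in option (1).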