[ltt-dev] [PATCH] LTTNG: Make marker enabled in advance

Mathieu Desnoyers compudj at krystal.dyndns.org
Fri Mar 6 13:21:51 EST 2009


* Gui Jianfeng (guijianfeng at cn.fujitsu.com) wrote:
> Mathieu Desnoyers wrote:
> > * Gui Jianfeng (guijianfeng at cn.fujitsu.com) wrote:
> >> Mathieu Desnoyers wrote:
> >>> * Gui Jianfeng (guijianfeng at cn.fujitsu.com) wrote:
> >>>> Mathieu Desnoyers wrote:
> >>>>> * Gui Jianfeng (guijianfeng at cn.fujitsu.com) wrote:
> >>>>>> Hi Mathieu,
> >>>>>>
> >>>>>> This patch makes marker enabled in advance.
> >>>>>> IOW, even if the marker isn't present for the moment,
> >>>>>> you can still enable it for future use.
> >>>>>> As soon as the marker is inserted into kernel, tracing 
> >>>>>> work can be started immediately.
> >>>>>>
> >>>>>> Here is an example for using the user interface:
> >>>>>> This patch assumes the marker control patch is applied.
> >>>>>>
> >>>>>> [root at localhost markers]# cd /mnt/debugfs/ltt/markers
> >>>>>> [root at localhost markers]# mkdir fs
> >>>>>> [root at localhost markers]# cd fs
> >>>>>> [root at localhost fs]# mkdir close
> >>>>>> [root at localhost fs]# cd close/
> >>>>>> [root at localhost close]# ls
> >>>>>> enable  info
> >>>>>> [root at localhost close]# echo 1 > enable
> >>>>>> [root at localhost close]# cat enable
> >>>>>> 2
> >>>>>> [root at localhost close]# cat info
> >>>>>> marker is not present now!
> >>>>>> [root at localhost close]# modprobe fs_trace
> >>>>>> [root at localhost close]# cat enable
> >>>>>> 1
> >>>>>> [root at localhost close]# cat info
> >>>>>> format: "fd %u"
> >>>>>> state: 1
> >>>>>> event_id: 1
> >>>>>> call: 0xc0468167
> >>>>>> probe single : 0xc0520cdc
> >>>>>>
> >>>>>> Signed-off-by: Gui Jianfeng <guijianfeng at cn.fujitsu.com>
> >>>>>> ---
> >>>>>>  ltt/ltt-trace-control.c |  357 ++++++++++++++++++++++++++++++++---------------
> >>>>>>  1 files changed, 247 insertions(+), 110 deletions(-)
> >>>>>>
> >>>>>> diff --git a/ltt/ltt-trace-control.c b/ltt/ltt-trace-control.c
> >>>>>> index 128db4e..7e1c32e 100644
> >>>>>> --- a/ltt/ltt-trace-control.c
> >>>>>> +++ b/ltt/ltt-trace-control.c
> >>>>>> @@ -19,6 +19,7 @@
> >>>>>>  #include <linux/debugfs.h>
> >>>>>>  #include <linux/ltt-tracer.h>
> >>>>>>  #include <linux/notifier.h>
> >>>>>> +#include <linux/mutex.h>
> >>>>>>  
> >>>>>>  #define LTT_CONTROL_DIR "control"
> >>>>>>  #define MARKERS_CONTROL_DIR "markers"
> >>>>>> @@ -27,9 +28,240 @@
> >>>>>>  
> >>>>>>  #define LTT_WRITE_MAXLEN	(128)
> >>>>>>  
> >>>>>> +#define MARKER_ENABLE_MASK 0x7UL
> >>>>> What is this doing ?
> >>>>   marker is 8 bytes aligned, the last 3 bit must 0.
> >>>>   I use these bits to identify whether the marker is enabled in advance.
> >>>>
> >>> Is there a clear gain to encode such information in the pointer bits ?
> >>> Can we simply add a "char" somewhere which would be easier for the code
> >>> reviewers ? I rarely found out that this addition of complexity was
> >>> worth it, except when some updates need to be done atomically.
> >>   Hi Mathieu,
> >>
> >>   I make use of inode->i_private of the file "enable" to store the corresponding 
> >>   marker address. If marker isn't present in kernel, but someone want to enable 
> >>   it in advance, inode->i_private will be 0x00000001. We don't need additional 
> >>   space to store *Pre-enable* status. 
> >>   If you have a better solution, could you give me some suggestions?
> >>
> > 
> > Oh ! I see, you keep a pointer to struct marker as i_private of the
> > marker files in the debugfs tree. This is buggy, here is why :
> > 
> > If the follow happens :
> > 
> > module A has marker1
> > module B has marker1 (too)
> > 
> > We have :
> > 
> > insmod module A
> >   -> marker1 debugfs files created, pointing to module A marker1
> > insmod module B
> >   -> marker1 already exists, no update done, but "marker1" i_private
> >      still points to module A's marker1
> > rmmod module A
> >   -> removing marker1, but module B still has one which should show up
> >      in the directory tree.
> 
>   marker1 will not show up any more, it will gone when removing module A.
>   So this won't cause serious problem. but this is a problem for sure, 
>   beacuse current implementation can't handle same name marker....
> 
> > 
> > How to fix this ? Simply by _not_ keeping a direct reference to
> > struct marker within ltt-trace-control.c.
> > 
> > We have a hash table in kernel/marker.c which takes care of exactly
> > this: it keeps a reference could of each marker loaded. Therefore, if
> > you need to get information about the state of such marker, you should
> > really use primitives like include/linux/marker.h:is_marker_enabled(),
> > and create your own primitives to export the information you need from
> > marker.c.
> > 
> > This will solve you "is enabled" problem, because this hash table keeps
> > track of _activated_ markers. Therefore, for each marker you encounter
> > in the marker section (what you are iterating on upon module load
> > notification), you just have to query the is_marker_enabled() state for
> > the channel and marker name to get its state.
> > 
> > You will however need to be called also when the marker state changes.
> > One way to do this would be to add a marker update notifier chain to
> > kernel/marker.c which would call your notifier callback.
> > 
> > Looking at the register_module_notifier() use you are doing in
> > ltt_trace_control makes think of the following : kernel/marker.c should
> > use register_module_notifier(), but your module should not be called by
> > module.c. It should rather be called by marker.c, which is the module
> > which modifies the data structures your are interested to follow.
> > 
> > Therefore, once you do that, keeping track of the marker state changes
> > will become _much_ easier, and you won't need any of these weird pointer
> > handling stuff.
> 

Hi Gui,

>   buffering a marker pointer can save search-time. IOW, when read "info" or "enable"
>   to query information for a given marker, it has to iterate all the marker section
>   in kernel and each modules to find out the marker.

How often to you need the marker info ?

>  and this also save the time for 
>   calling is_marker_enabled() to search the hash table.
> 
>   My solution to solve the "same name marker" problem is that buffering the marker
>   pointer in i_private if a marker doesn't have another same name marker. If there are 
>   two same name markers available, the 2nd lowest bit of i_private will indicate this situation
>   and we need to iterate all the markers in kernel and modules to find out the corresponding
>   markers.
> 

You are optimizing a slow path. This is non-sense. Please use a solution
that is clean and works (and implies a hash table lookup) rather than to
do caches that will have no observable performance effects. And I cannot
stress enough the fact that the current situation is broken and will
leave marker modules unavailable in some module load/unload sequence.
We do not care about performance for marker connection/disconnection,
because this operation is rarely done, but we care a great deal about
correctness.


> > 
> >>>>>> +
> >>>>>> +static DEFINE_MUTEX(unload_mutex);
> >>>>> Can you tell me exactly what this protects, where it is nested ?
> >>>>   when doing mkdir to create a new dir by debugfs_create_dir(), I have to drop
> >>>>   the parent's i_mutex, because debugfs_create_dir() need it. To prevent the 
> >>>>   parent to be removed, this mutex is introduced.
> >>>>
> >>> So you want to protect against racy removal ? Which module is
> >>> responsible for such removal, and which mutex protects addition/removal?
> >>> I really doubt that anything near this unload_mutex is the right
> >>> solution.
> >>   Suppose we are creating a new channel dir, meanwhile we remove the ltt-trace-control
> >>   module, all debugfs file will be removed. but debugfs_create_dir() is still in 
> >>   processing, but the ltt/marker is gone. Will this scenario happen?
> >>
> > 
> > Hrm, upon entry in
> > 
> > fs/namei.c: mkdirat, we have :
> > 
> > dentry = lookup_create(&nd, 1);
> > 
> > Which looks up for the parent path. This will hold the dentry counter on
> > the parent. It will execute the vfs_mkdir with this reference count
> > held. Therefore, the mkdir functions you created are called with the
> > parent's dentry refcount held, is that enough to insure the kind of
> > protection you want ?
> 
>   Is it possible that all other files are remove except the one we protects
>   and new created files. If this can happen, i don't think it's ok.
>   The goal of unload_mutex is protecting the whole tree. 
> 

Do we need to protect more than the parent when we do this ? Upon exit
from mkdir, the parent refcount will be released, and then if the parent
had to be deleted, it will be deleted... Unless you call more
specifically the data structure races, I won't be convinced that there
is a problem there.

Mathieu

> > 
> > Mathieu
> > 
> > 
> >>> Mathieu
> >>>
> >>>>> Mathieu
> >>>>>
> >>>>>> +
> >>>> -- 
> >>>> Regards
> >>>> Gui Jianfeng
> >>>>
> >> -- 
> >> Regards
> >> Gui Jianfeng
> >>
> > 
> 
> -- 
> Regards
> Gui Jianfeng
> 

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68




More information about the lttng-dev mailing list