[ltt-dev] Unable to stop trace when started with -a option

Mathieu Desnoyers compudj at krystal.dyndns.org
Wed Jul 29 14:21:15 EDT 2009


Basically, what I am saying is that the newest ltt-control should not
hang in trace destroy and correctly support the -a (append) option. Do
you still experience the same problem ?

As a side-note, lttd did not seem to correctly handle the splice error
codes. But given ltt-control 0.69 does not use O_APPEND anymore, no
error code should be returned, and it should just work.

Mathieu

* Dany Madden (dany.r.madden at gmail.com) wrote:
> Hi Mathieu,
> 
> I tested your change and trace destruction still hang as you said. I
> looked at lttd and and it is handling error from splice. But the
> problem is that splice is not returning for lttd to handle the return
> code, right? So, I am not sure how lttd's error handling is relavant
> here.
> 
> Maybe I am not understanding what you are saying.
> 
> Dany
> 
> On 7/23/09, Mathieu Desnoyers <compudj at krystal.dyndns.org> wrote:
> > * Mathieu Desnoyers (compudj at krystal.dyndns.org) wrote:
> >> * Dany Madden (dany.r.madden at gmail.com) wrote:
> >> [...]
> >>
> >
> > Can you have a look at ltt-control 0.69 ? I fixed the O_APPEND issue by
> > using lseek SEEK_END instead. I did not modify the error handling, so
> > other splice errors could possibly still cause trace destruction to hang.
> >
> > Thanks,
> >
> > Mathieu
> >
> >> OK, part of the problem is lttd which should not use O_APPEND, because
> >> it is not supported by splice(). The other part of the problem is that
> >> lttd seems to exit from the splice error leaving a reference to the
> >> subbuffers. e.g.
> >>
> >> ltt-armall
> >>
> >> #First trace
> >> lttctl -C trace
> >> lttd -t /tmp/trace -c /mnt/debugfs/ltt/trace -v &
> >> #works ok
> >> sleep 5
> >> lttctl -D trace
> >>
> >> #Second trace
> >> lttctl -C trace
> >> lttd -t /tmp/trace -c /mnt/debugfs/ltt/trace -v -a
> >>
> >> Gives:
> >>
> >> Appending to file /tmp/trace/metadata_1 as requested
> >> Channel file : /mnt/debugfs/ltt/trace/metadata_0
> >> Opening file.
> >> Appending to file /tmp/trace/metadata_0 as requested
> >> Data received
> >> Normal read on fd 5
> >> cookie : 0
> >> splice chan to pipe offset 0
> >> splice chan to pipe ret 65536
> >> splice pipe to file -1
> >> Error in file splice: Invalid argument
> >> Data received
> >> Normal read on fd 325
> >> cookie : 0
> >> splice chan to pipe offset 0
> >>
> >> Which matches:
> >>
> >> fs/splice.c:
> >>
> >> static long do_splice_from(struct pipe_inode_info *pipe, struct file *out,
> >>                            loff_t *ppos, size_t len, unsigned int flags)
> >> ...
> >>         if (unlikely(out->f_flags & O_APPEND))
> >>                 return -EINVAL;
> >>
> >> Can you have a closer look at the lttd error handling for splice errors?
> >> Once we fix this issue to handle it gracefully, we'll have to find
> >> another way to support file append without O_APPEND.
> >>
> >> Thanks,
> >>
> >> Mathieu
> >>
> >>
> >> > Dany
> >> >
> >> > On Tue, Jul 21, 2009 at 5:07 PM, Mathieu Desnoyers <
> >> > compudj at krystal.dyndns.org> wrote:
> >> >
> >> > > * Dany Madden (dany.r.madden at gmail.com) wrote:
> >> > > > Hello,
> >> > > >
> >> > > > I wanted to follow up on my post from last week. Is this a bug in
> >> > > > lttctl
> >> > > or
> >> > > > is there something wrong with the way I am calling lttctl? Any
> >> > > suggestions
> >> > > > on how to get to the bottom of this issue is much appreciated.
> >> > > >
> >> > >
> >> > > Hrm, this problem still applies to the current LTTng.
> >> > >
> >> > > I'd bet splice() has problems with O_APPEND. Can you have a look into
> >> > > this ? It could be caused by the splice actor implementation found in
> >> > > LTTng.
> >> > >
> >> > > Thanks,
> >> > >
> >> > > Mathieu
> >> > >
> >> > > > Thanks,
> >> > > > Dany
> >> > > >
> >> > > > On Thu, Jul 16, 2009 at 8:25 AM, Dany Madden
> >> > > > <dany.r.madden at gmail.com
> >> > > >wrote:
> >> > > >
> >> > > > > Hi,
> >> > > > >
> >> > > > > When I started a trace with the -a option, destroying the trace
> >> > > > > later
> >> > > would
> >> > > > > hang. These are the ltt versions that I am using:
> >> > > > >
> >> > > > > lttv: 0.12.7
> >> > > > > lttng: 0.72 (patched in sles11 2.6.27.19 kernel)
> >> > > > > lttcontrol: 0.64
> >> > > > >
> >> > > > > It appears that lttctl is waiting on lttd to die, but lttd is
> >> > > > > waiting
> >> > > for
> >> > > > > something else to finish. Please see sysrq-t output below.
> >> > > > >
> >> > > > > Restart tracing with append option -a
> >> > > > >
> >> > > > > ------------------------------------------------------
> >> > > > >
> >> > > > > punetest3:~ # lttctl -C -w /tmp/trace1 trace2 -a
> >> > > > > Linux Trace Toolkit Trace Control 0.64-14012009
> >> > > > >
> >> > > > > Controlling trace : trace2
> >> > > > >
> >> > > > > lttctl: Creating trace
> >> > > > >
> >> > > > > lttctl: Forking lttd
> >> > > > > Linux Trace Toolkit Trace Daemon 0.64-14012009
> >> > > > >
> >> > > > > Reading from debugfs directory : /sys/kernel/debug/ltt/trace2
> >> > > > > Writing to trace directory : /tmp/trace1
> >> > > > >
> >> > > > > lttctl: Starting trace
> >> > > > > punetest3:~ # echo $?
> >> > > > >
> >> > > > > 0
> >> > > > >
> >> > > > > Stop tracing hangs
> >> > > > > ----------------------------------------
> >> > > > > punetest3:~ # lttctl -D trace1
> >> > > > > Linux Trace Toolkit Trace Control 0.64-14012009
> >> > > > >
> >> > > > > Controlling trace : trace1
> >> > > > >
> >> > > > > lttctl: Pausing trace
> >> > > > >
> >> > > > > lttctl: Destroying trace
> >> > > > > ^C
> >> > > > >
> >> > > > > ---> Ran it thru gdb
> >> > > > >
> >> > > > > Program received signal SIGINT, Interrupt.
> >> > > > > 0xffffe430 in __kernel_vsyscall ()
> >> > > > > (gdb) bt
> >> > > > > #0  0xffffe430 in __kernel_vsyscall ()
> >> > > > > #1  0xb7f34bb3 in write () from /lib/libc.so.6
> >> > > > >
> >> > > > > #2  0xb7fd013c in lttctl_sendop (fname=0xbfffe5cc
> >> > > > > "/sys/kernel/debug/ltt/destroy_trace", op=0xbffff843 "trace3") at
> >> > > > > liblttctl.c:120
> >> > > > > #3  0xb7fd0ecb in lttctl_destroy_trace (name=0xbffff843 "trace3")
> >> > > > > at
> >> > > > >
> >> > > > > liblttctl.c:284
> >> > > > > #4  0x080498d6 in main (argc=1667330676, argv=0x4c003365) at
> >> > > lttctl.c:823
> >> > > > >
> >> > > > > --> It hangs in lttctl_sendop() while doing write.
> >> > > > >
> >> > > > > lttctl_sendop (fname=0xbfffe5bc
> >> > > > > "/sys/kernel/debug/ltt/destroy_trace",
> >> > > > >
> >> > > > > op=0xbffff83c "trace7")
> >> > > > >     at liblttctl.c:108
> >> > > > > 108             if (!fname) {
> >> > > > > (gdb)
> >> > > > > 105     {
> >> > > > > (gdb)
> >> > > > > 108             if (!fname) {
> >> > > > > (gdb)
> >> > > > > 113             fd = open(fname, O_WRONLY);
> >> > > > > (gdb)
> >> > > > >
> >> > > > > 114             if (fd == -1) {
> >> > > > > (gdb)
> >> > > > > 113             fd = open(fname, O_WRONLY);
> >> > > > > (gdb)
> >> > > > > 114             if (fd == -1) {
> >> > > > > (gdb)
> >> > > > > 120             if (write(fd, op, strlen(op)) == -1) {
> >> > > > >  (gdb) p fd
> >> > > > > $3 = <value optimized out>
> >> > > > >
> >> > > > > (gdb) p op
> >> > > > > $4 = 0xbffff83c "trace7"
> >> > > > > (gdb) p strlen(op)
> >> > > > > $5 = 6
> >> > > > > (gdb) c
> >> > > > > Continuing.
> >> > > > >
> >> > > > > ----> Output from sysrq-t
> >> > > > >
> >> > > > > Jul 15 17:47:27 elm3b36 kernel: lttd          S 00000000     0
> >> > > > > 3979
> >> > >    1
> >> > > > >
> >> > > > > Jul 15 17:47:27 elm3b36 kernel:        f6373e20 00000086 0000000c
> >> > > 00000000
> >> > > > > 00000000 00013e36 c05bd67c c05c0380
> >> > > > > Jul 15 17:47:27 elm3b36 kernel:        c05c0380 c05c0380 c05c0380
> >> > > f785f2c0
> >> > > > > f785f534 c2225380 00000002 c017f8b0
> >> > > > >
> >> > > > > Jul 15 17:47:27 elm3b36 kernel:        b7e8b000 b7e8bbd8 f6373ea8
> >> > > f785f534
> >> > > > > ffffef5d c05521d8 00000001 00000000
> >> > > > > Jul 15 17:47:27 elm3b36 kernel: Call Trace:
> >> > > > > Jul 15 17:47:27 elm3b36 kernel:  [<c0147cd9>]
> >> > > > > futex_wait+0x207/0x3d0
> >> > > > >
> >> > > > > Jul 15 17:47:27 elm3b36 kernel:  [<c01490e5>] do_futex+0x7a/0x8c9
> >> > > > > Jul 15 17:47:27 elm3b36 kernel:  [<c0149a33>] sys_futex+0xff/0x112
> >> > > > > Jul 15 17:47:27 elm3b36 kernel:  [<c0104de2>] syscall_call+0x7/0xb
> >> > > > >
> >> > > > > Jul 15 17:47:27 elm3b36 kernel:  [<ffffe430>] 0xffffe430
> >> > > > > Jul 15 17:47:27 elm3b36 kernel:  =======================
> >> > > > > Jul 15 17:47:27 elm3b36 kernel: lttd          S c1000000     0
> >> > > > > 3980
> >> > >    1
> >> > > > > Jul 15 17:47:27 elm3b36 kernel:        f6375dc0 00000086 0007f4a0
> >> > > c1000000
> >> > > > >
> >> > > > > b7ff1c40 00011668 c05bd67c c05c0380
> >> > > > > Jul 15 17:47:27 elm3b36 kernel:        c05c0380 c05c0380 c05c0380
> >> > > f7bb62b0
> >> > > > > f7bb6524 c221b380 00000001 00000000
> >> > > > > Jul 15 17:47:27 elm3b36 kernel:        f6f60848 f6354b7c 00000000
> >> > > f7bb6524
> >> > > > >
> >> > > > > 000005b7 00000004 00000004 00000000
> >> > > > > Jul 15 17:47:27 elm3b36 kernel: Call Trace:
> >> > > > > Jul 15 17:47:27 elm3b36 kernel:  [<c01989e3>] pipe_wait+0x50/0x6e
> >> > > > > Jul 15 17:47:27 elm3b36 kernel:  [<c01afe80>]
> >> > > splice_to_pipe+0x164/0x1e9
> >> > > > >
> >> > > > > Jul 15 17:47:27 elm3b36 kernel:  [<c023db08>]
> >> > > subbuf_splice_actor+0x167/0x172
> >> > > > > Jul 15 17:47:27 elm3b36 kernel:  [<c023db40>]
> >> > > > > ltt_relay_file_splice_read+0x2d/0x6b
> >> > > > > Jul 15 17:47:27 elm3b36 kernel:  [<c01af28c>]
> >> > > > > do_splice_to+0x56/0x64
> >> > > > >
> >> > > > > Jul 15 17:47:27 elm3b36 kernel:  [<c01af419>]
> >> > > > > sys_splice+0x17f/0x1de
> >> > > > > Jul 15 17:47:27 elm3b36 kernel:  [<c0104ceb>]
> >> > > sysenter_do_call+0x12/0x2f
> >> > > > > Jul 15 17:47:27 elm3b36 kernel:  [<ffffe430>] 0xffffe430
> >> > > > >
> >> > > > > Jul 15 17:47:27 elm3b36 kernel:  =======================
> >> > > > > Jul 15 17:47:27 elm3b36 kernel: lttctl        S 00000086     0
> >> > > > > 3981
> >> > > 3869
> >> > > > > Jul 15 17:47:27 elm3b36 kernel:        f6c03d10 00000082 00000001
> >> > > 00000086
> >> > > > > f6375c7c 0001f77c c05bd67c c05c0380
> >> > > > >
> >> > > > > Jul 15 17:47:27 elm3b36 kernel:        c05c0380 c05c0380 c05c0380
> >> > > f74dc3b0
> >> > > > > f74dc624 c222f380 00000003 f624f790
> >> > > > > Jul 15 17:47:27 elm3b36 kernel:        00000282 00000003 f624f790
> >> > > f74dc624
> >> > > > > 000005b3 c05ce964 f79b4c00 00000000
> >> > > > >
> >> > > > > Jul 15 17:47:27 elm3b36 kernel: Call Trace:
> >> > > > > Jul 15 17:47:27 elm3b36 kernel:  [<c023bbdc>]
> >> > > __ltt_trace_destroy+0xdc/0xfb
> >> > > > > Jul 15 17:47:27 elm3b36 kernel:  [<c023be4d>]
> >> > > ltt_trace_destroy+0x2d/0x77
> >> > > > > Jul 15 17:47:27 elm3b36 kernel:  [<f89cd070>]
> >> > > destroy_trace_write+0x70/0x13b
> >> > > > >
> >> > > > > [ltt_trace_control]
> >> > > > > Jul 15 17:47:27 elm3b36 kernel:  [<c019337c>] vfs_write+0x8b/0x13d
> >> > > > > Jul 15 17:47:27 elm3b36 kernel:  [<c0193526>] sys_write+0x45/0x94
> >> > > > > Jul 15 17:47:27 elm3b36 kernel:  [<c0104de2>] syscall_call+0x7/0xb
> >> > > > >
> >> > > > > Jul 15 17:47:27 elm3b36 kernel:  [<ffffe430>] 0xffffe430
> >> > > > >
> >> > > > >
> >> > > > >
> >> > > > > --
> >> > > > > -----
> >> > > > > ~ Dany Madden
> >> > > > >
> >> > > >
> >> > > >
> >> > > >
> >> > > > --
> >> > > > -----
> >> > > > ~ Dany Madden
> >> > >
> >> > > > _______________________________________________
> >> > > > ltt-dev mailing list
> >> > > > ltt-dev at lists.casi.polymtl.ca
> >> > > > http://lists.casi.polymtl.ca/cgi-bin/mailman/listinfo/ltt-dev
> >> > >
> >> > >
> >> > > --
> >> > > Mathieu Desnoyers
> >> > > OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE
> >> > > 9A68
> >> > >
> >> >
> >> >
> >> >
> >> > --
> >> > -----
> >> > ~ Dany Madden
> >>
> >> --
> >> Mathieu Desnoyers
> >> OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE
> >> 9A68
> >>
> >> _______________________________________________
> >> ltt-dev mailing list
> >> ltt-dev at lists.casi.polymtl.ca
> >> http://lists.casi.polymtl.ca/cgi-bin/mailman/listinfo/ltt-dev
> >>
> >
> > --
> > Mathieu Desnoyers
> > OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68
> >
> 
> 
> -- 
> -----
> ~ Dany Madden
> 

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68




More information about the lttng-dev mailing list