[ltt-dev] lttv crashed in tracefile.c:1582

Dany Madden dany4madden at gmail.com
Fri Apr 2 16:18:59 EDT 2010


On Fri, Apr 2, 2010 at 7:34 AM, Mathieu Desnoyers <
compudj at krystal.dyndns.org> wrote:

> * Dany Madden (dany4madden at gmail.com) wrote:
> > On Wed, Mar 31, 2010 at 1:56 PM, Mathieu Desnoyers <
> > compudj at krystal.dyndns.org> wrote:
> >
> > > * Mathieu Desnoyers (compudj at krystal.dyndns.org) wrote:
> > > > * Dany Madden (dany4madden at gmail.com) wrote:
> > > > > On Wed, Mar 31, 2010 at 10:24 AM, Mathieu Desnoyers <
> > > > > compudj at krystal.dyndns.org> wrote:
> > > > >
> > > > > > * Dany Madden (dany4madden at gmail.com) wrote:
> > > > > > > Hello,
> > > > > > >
> > > > > > > Please take a look at this lttv error with the mainline kernel
> > > > > > > 2.6.32.9 on ppc64.
> > > > > > >
> > > > > > > lttng: patch-2.6.32.9-lttng-0.198
> > > > > > > lttv-0.12.30-02102010
> > > > > > > ltt-control-0.79-01022010
> > > > > > >
> > > > > > > LTTV crashed with this error:
> > > > > > >
> > > > > > > ERROR:tracefile.c:1582:ltt_update_event_size: assertion failed:
> > > (info !=
> > > > > > NULL)
> > > > > > > /usr/bin/lttv: line 15:  2696 Aborted                 $0.real
> $*
> > > > > > >
> > > > > > >
> > > > > > > CONFIG_LTT=y
> > > > > > > CONFIG_LTT_FILTER=m
> > > > > > > # CONFIG_HAVE_LTT_DUMP_TABLES is not set
> > > > > > > CONFIG_LTT_RELAY=y
> > > > > > > CONFIG_LTT_RELAY_LOCKLESS=y
> > > > > > > # CONFIG_LTT_RELAY_IRQOFF is not set
> > > > > > > # CONFIG_LTT_RELAY_LOCKED is not set
> > > > > > > CONFIG_LTT_SERIALIZE=y
> > > > > > > CONFIG_LTT_FAST_SERIALIZE=y
> > > > > > > CONFIG_LTT_TRACEPROBES=m
> > > > > > > CONFIG_LTT_TRACE_CONTROL=m
> > > > > > > CONFIG_LTT_TRACER=y
> > > > > > > # CONFIG_LTT_ALIGNMENT is not set
> > > > > > > CONFIG_LTT_CHECK_ARCH_EFFICIENT_UNALIGNED_ACCESS=y
> > > > > > > # CONFIG_LTT_DEBUG_EVENT_SIZE is not set
> > > > > > > CONFIG_LTT_USERSPACE_EVENT=m
> > > > > > > CONFIG_LTT_VMCORE=y
> > > > > > > CONFIG_LTT_STATEDUMP=m
> > > > > > > CONFIG_LTT_KPROBES=m
> > > > > > > # CONFIG_LTT_ASCII is not set
> > > > > > >
> > > > > > >
> > > > > > > > modprobe ltt-trace-control
> > > > > > > > modprobe ipc-trace
> > > > > > > > modprobe kernel-trace
> > > > > > > > modprobe mm-trace
> > > > > > > > modprobe fs-trace
> > > > > > > > modprobe syscall-trace
> > > > > > > > modprobe trap-trace
> > > > > > >
> > > > > > > > ltt-armall
> > > > > > > > lttctl -C -w /tmp/trace1 trace1
> > > > > > > > lttctl -D trace1
> > > > > > > > lttv -m textDump -t /tmp/trace1
> > > > > > > **
> > > > > > > ERROR:tracefile.c:1582:ltt_update_event_size: assertion failed:
> > > (info !=
> > > > > > NULL)
> > > > > > > /usr/bin/lttv: line 15:  2696 Aborted                 $0.real
> $*
> > > > > > >
> > > > > > > This post,
> > > > > >
> > >
> http://lists.casi.polymtl.ca/pipermail/ltt-dev/2010-February/002189.html,
> > > > > > > describes the same error on 32 bit system. By not having the
> > > > > > > LTT_ALIGNMENT set, shouldn't it just work in my case?
> > > > > > >
> > > > > > > Thanks in advance for your help!
> > > > > >
> > > > > > A few questions:
> > > > > >
> > > > > > - the trace is taken on a ppc64 machine. Is is viewed on a ppc64
> too,
> > > or
> > > > > >  on a different arch size/endianness ?
> > > > > >
> > > > >
> > > > > Trace collected and viewed on the same ppc64 system.
> > > > >
> > > > >
> > > > > >
> > > > > > - Can you provide some information about the content of the trace
> > > > > >  directory ? (files, file size)
> > > > > >
> > > > >
> > > > > The files in the trace directory are all the same size??
> > > > >
> > > > > > /tmp/trace3 # ls -l
> > > > > total 2448
> > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 fs_0
> > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 fs_1
> > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 fs_2
> > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 fs_3
> > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 input_0
> > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 input_1
> > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 input_2
> > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 input_3
> > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 ipc_0
> > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 ipc_1
> > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 ipc_2
> > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 ipc_3
> > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 kernel_0
> > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 kernel_1
> > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 kernel_2
> > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 kernel_3
> > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 metadata_0
> > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 metadata_1
> > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 metadata_2
> > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 metadata_3
> > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 mm_0
> > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 mm_1
> > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 mm_2
> > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 mm_3
> > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 module_state_0
> > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 module_state_1
> > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 module_state_2
> > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 module_state_3
> > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 softirq_state_0
> > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 softirq_state_1
> > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 softirq_state_2
> > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 softirq_state_3
> > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 swap_state_0
> > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 swap_state_1
> > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 swap_state_2
> > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 swap_state_3
> > > > >
> > > >
> > > > Oh, that's a pretty good hint. Although I tried to make everything as
> > > > generic as possible, I think LTTV assumes that the page size is 4k.
> > > > However, recent ppc64 kernels have a 64k page size. So I'm fairly
> sure
> > > > that this is where the problem comes from.
> > > >
> > > > Can you send me (privately) a tarball of your trace so I can see what
> I
> > > > can do with LTTV to support it ?
> > >
> > > OK, got it. Well, your metadata channels are all empty (looked with
> > > hexedit). So clearly LTTV cannot make sense of the data in the other
> > > channels.
> > >
> > > So let's try the simple solutions first: try to load all the
> > > LTTng-related modules you have (I see that a few were missing from your
> > > list), and only then run ltt-armall. If that does not work, try to
> > > compile LTTng as "built in".
> > >
> >
> > Tried building ltt into the kernel and still seeing the assertion
> failed...
> > > lttv -m textDump -t /tmp/trace7
> > **
> > ERROR:tracefile.c:1582:ltt_update_event_size: assertion failed: (info !=
> > NULL)
> > /usr/bin/lttv: line 15:  2677 Aborted                 $0.real $*
>
>
> I just looked at your trace5, and it looks like this one has empty
> "metadata" channels too. So I am pretty confident that there is a
> problem with 4k vs 64k alignment, but I cannot test the fix until we
> find out why the metadata channel is empty.
>
> Can you give me the output of:
>
> # find /mnt/debugfs/ltt
>
 I attached the output mnt_debugfs_ltt_content because it is too long to
list here


>  ? (or with the appropriate location of debugfs on your system)
>
> Normally, we should have the metadata generated by events in:
>
> kernel/marker.c: ltt_dump_marker_state()
>
> So double-checking that this symbol is present in your kernel might
> help. Also making sure that CONFIG_MARKERS is set would be good. To
> confirm that ltt_dump_marker_state() is called, you could add some
> printk() in that function to see if LTTng is calling it at it should.
>

CONFIG_MARKERS is set. I added a printk to ltt_dump_marker_state and saw
that the function is being called. This is the log from /var/log/messages:

Apr  2 20:06:37 elm17d191 kernel: LTT dump marker  state is called ### My
printk statement #####
Apr  2 20:06:37 elm17d191 kernel:
Apr  2 20:06:37 elm17d191 kernel: LTT state dump begin
Apr  2 20:06:37 elm17d191 kernel: LTT state dump thread start
Apr  2 20:06:37 elm17d191 kernel: LTT state dump end
Apr  2 20:07:34 elm17d191 kernel: LTT: 32 events written in channel
softirq_state (cpu 0, index 0)
Apr  2 20:07:34 elm17d191 kernel: LTT: 1 events written in channel
global_state (cpu 0, index 0)
Apr  2 20:07:34 elm17d191 kernel: LTT: 1890 events written in channel kernel
(cpu 0, index 0)
Apr  2 20:07:34 elm17d191 kernel: LTT: 525 events written in channel kernel
(cpu 1, index 0)
Apr  2 20:07:34 elm17d191 kernel: LTT: 32 events written in channel kernel
(cpu 2, index 0)
Apr  2 20:07:34 elm17d191 kernel: LTT: 48 events written in channel kernel
(cpu 3, index 0)
Apr  2 20:07:34 elm17d191 kernel: LTT: 7 events written in channel fs (cpu
1, index 0)
Apr  2 20:07:34 elm17d191 kernel: LTT: 32 events written in channel block
(cpu 0, index 0)
Apr  2 20:07:34 elm17d191 kernel: LTT: 14 events written in channel block
(cpu 2, index 0)



>
> Thanks,
>
> Mathieu
>
> >
> > dany
> >
> >
> > >
> > > Thanks,
> > >
> > > Mathieu
> > >
> > > >
> > > > Thanks,
> > > >
> > > > Mathieu
> > > >
> > > > >
> > > > > >
> > > > > > Also, trying to enable "CONFIG_LTT_DEBUG_EVENT_SIZE" might help
> by
> > > > > > providing extra self-check on trace metadata consistency.
> > > > > >
> > > > > > Rebuilt the kernel with CONFIG_LTT_DEBUG_EVENT_SIZE=y ,
> recollected
> > > the
> > > > > trace and viewed it. I still see the same error and did notice
> anything
> > > else
> > > > > different than it was before.
> > > > >
> > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Mathieu
> > > > > >
> > > > > > >
> > > > > > > Dany Madden
> > > > > >
> > > > > > > _______________________________________________
> > > > > > > ltt-dev mailing list
> > > > > > > ltt-dev at lists.casi.polymtl.ca
> > > > > > > http://lists.casi.polymtl.ca/cgi-bin/mailman/listinfo/ltt-dev
> > > > > >
> > > > > >
> > > > > > --
> > > > > > Mathieu Desnoyers
> > > > > > Operating System Efficiency R&D Consultant
> > > > > > EfficiOS Inc.
> > > > > > http://www.efficios.com
> > > > > >
> > > >
> > > > --
> > > > Mathieu Desnoyers
> > > > Operating System Efficiency R&D Consultant
> > > > EfficiOS Inc.
> > > > http://www.efficios.com
> > >
> > > --
> > > Mathieu Desnoyers
> > > Operating System Efficiency R&D Consultant
> > > EfficiOS Inc.
> > > http://www.efficios.com
> > >
>
> > _______________________________________________
> > ltt-dev mailing list
> > ltt-dev at lists.casi.polymtl.ca
> > http://lists.casi.polymtl.ca/cgi-bin/mailman/listinfo/ltt-dev
>
>
> --
> Mathieu Desnoyers
> Operating System Efficiency R&D Consultant
> EfficiOS Inc.
> http://www.efficios.com
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.casi.polymtl.ca/pipermail/lttng-dev/attachments/20100402/918bc5e7/attachment-0003.htm>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: mnt_debugfs_ltt_content
Type: application/octet-stream
Size: 24208 bytes
Desc: not available
URL: <http://lists.casi.polymtl.ca/pipermail/lttng-dev/attachments/20100402/918bc5e7/attachment-0003.obj>


More information about the lttng-dev mailing list