[ltt-dev] lttv crashed in tracefile.c:1582

Mathieu Desnoyers compudj at krystal.dyndns.org
Fri Apr 2 16:49:29 EDT 2010


* Dany Madden (dany4madden at gmail.com) wrote:
> On Fri, Apr 2, 2010 at 7:34 AM, Mathieu Desnoyers <
> compudj at krystal.dyndns.org> wrote:
> 
> > * Dany Madden (dany4madden at gmail.com) wrote:
> > > On Wed, Mar 31, 2010 at 1:56 PM, Mathieu Desnoyers <
> > > compudj at krystal.dyndns.org> wrote:
> > >
> > > > * Mathieu Desnoyers (compudj at krystal.dyndns.org) wrote:
> > > > > * Dany Madden (dany4madden at gmail.com) wrote:
> > > > > > On Wed, Mar 31, 2010 at 10:24 AM, Mathieu Desnoyers <
> > > > > > compudj at krystal.dyndns.org> wrote:
> > > > > >
> > > > > > > * Dany Madden (dany4madden at gmail.com) wrote:
> > > > > > > > Hello,
> > > > > > > >
> > > > > > > > Please take a look at this lttv error with the mainline kernel
> > > > > > > > 2.6.32.9 on ppc64.
> > > > > > > >
> > > > > > > > lttng: patch-2.6.32.9-lttng-0.198
> > > > > > > > lttv-0.12.30-02102010
> > > > > > > > ltt-control-0.79-01022010
> > > > > > > >
> > > > > > > > LTTV crashed with this error:
> > > > > > > >
> > > > > > > > ERROR:tracefile.c:1582:ltt_update_event_size: assertion failed:
> > > > (info !=
> > > > > > > NULL)
> > > > > > > > /usr/bin/lttv: line 15:  2696 Aborted                 $0.real
> > $*
> > > > > > > >
> > > > > > > >
> > > > > > > > CONFIG_LTT=y
> > > > > > > > CONFIG_LTT_FILTER=m
> > > > > > > > # CONFIG_HAVE_LTT_DUMP_TABLES is not set
> > > > > > > > CONFIG_LTT_RELAY=y
> > > > > > > > CONFIG_LTT_RELAY_LOCKLESS=y
> > > > > > > > # CONFIG_LTT_RELAY_IRQOFF is not set
> > > > > > > > # CONFIG_LTT_RELAY_LOCKED is not set
> > > > > > > > CONFIG_LTT_SERIALIZE=y
> > > > > > > > CONFIG_LTT_FAST_SERIALIZE=y
> > > > > > > > CONFIG_LTT_TRACEPROBES=m
> > > > > > > > CONFIG_LTT_TRACE_CONTROL=m
> > > > > > > > CONFIG_LTT_TRACER=y
> > > > > > > > # CONFIG_LTT_ALIGNMENT is not set
> > > > > > > > CONFIG_LTT_CHECK_ARCH_EFFICIENT_UNALIGNED_ACCESS=y
> > > > > > > > # CONFIG_LTT_DEBUG_EVENT_SIZE is not set
> > > > > > > > CONFIG_LTT_USERSPACE_EVENT=m
> > > > > > > > CONFIG_LTT_VMCORE=y
> > > > > > > > CONFIG_LTT_STATEDUMP=m
> > > > > > > > CONFIG_LTT_KPROBES=m
> > > > > > > > # CONFIG_LTT_ASCII is not set
> > > > > > > >
> > > > > > > >
> > > > > > > > > modprobe ltt-trace-control
> > > > > > > > > modprobe ipc-trace
> > > > > > > > > modprobe kernel-trace
> > > > > > > > > modprobe mm-trace
> > > > > > > > > modprobe fs-trace
> > > > > > > > > modprobe syscall-trace
> > > > > > > > > modprobe trap-trace
> > > > > > > >
> > > > > > > > > ltt-armall
> > > > > > > > > lttctl -C -w /tmp/trace1 trace1
> > > > > > > > > lttctl -D trace1
> > > > > > > > > lttv -m textDump -t /tmp/trace1
> > > > > > > > **
> > > > > > > > ERROR:tracefile.c:1582:ltt_update_event_size: assertion failed:
> > > > (info !=
> > > > > > > NULL)
> > > > > > > > /usr/bin/lttv: line 15:  2696 Aborted                 $0.real
> > $*
> > > > > > > >
> > > > > > > > This post,
> > > > > > >
> > > >
> > http://lists.casi.polymtl.ca/pipermail/ltt-dev/2010-February/002189.html,
> > > > > > > > describes the same error on 32 bit system. By not having the
> > > > > > > > LTT_ALIGNMENT set, shouldn't it just work in my case?
> > > > > > > >
> > > > > > > > Thanks in advance for your help!
> > > > > > >
> > > > > > > A few questions:
> > > > > > >
> > > > > > > - the trace is taken on a ppc64 machine. Is is viewed on a ppc64
> > too,
> > > > or
> > > > > > >  on a different arch size/endianness ?
> > > > > > >
> > > > > >
> > > > > > Trace collected and viewed on the same ppc64 system.
> > > > > >
> > > > > >
> > > > > > >
> > > > > > > - Can you provide some information about the content of the trace
> > > > > > >  directory ? (files, file size)
> > > > > > >
> > > > > >
> > > > > > The files in the trace directory are all the same size??
> > > > > >
> > > > > > > /tmp/trace3 # ls -l
> > > > > > total 2448
> > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 fs_0
> > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 fs_1
> > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 fs_2
> > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 fs_3
> > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 input_0
> > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 input_1
> > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 input_2
> > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 input_3
> > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 ipc_0
> > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 ipc_1
> > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 ipc_2
> > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 ipc_3
> > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 kernel_0
> > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 kernel_1
> > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 kernel_2
> > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 kernel_3
> > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 metadata_0
> > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 metadata_1
> > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 metadata_2
> > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 metadata_3
> > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 mm_0
> > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 mm_1
> > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 mm_2
> > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 mm_3
> > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 module_state_0
> > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 module_state_1
> > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 module_state_2
> > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 module_state_3
> > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 softirq_state_0
> > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 softirq_state_1
> > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 softirq_state_2
> > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 softirq_state_3
> > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 swap_state_0
> > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 swap_state_1
> > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 swap_state_2
> > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 swap_state_3
> > > > > >
> > > > >
> > > > > Oh, that's a pretty good hint. Although I tried to make everything as
> > > > > generic as possible, I think LTTV assumes that the page size is 4k.
> > > > > However, recent ppc64 kernels have a 64k page size. So I'm fairly
> > sure
> > > > > that this is where the problem comes from.
> > > > >
> > > > > Can you send me (privately) a tarball of your trace so I can see what
> > I
> > > > > can do with LTTV to support it ?
> > > >
> > > > OK, got it. Well, your metadata channels are all empty (looked with
> > > > hexedit). So clearly LTTV cannot make sense of the data in the other
> > > > channels.
> > > >
> > > > So let's try the simple solutions first: try to load all the
> > > > LTTng-related modules you have (I see that a few were missing from your
> > > > list), and only then run ltt-armall. If that does not work, try to
> > > > compile LTTng as "built in".
> > > >
> > >
> > > Tried building ltt into the kernel and still seeing the assertion
> > failed...
> > > > lttv -m textDump -t /tmp/trace7
> > > **
> > > ERROR:tracefile.c:1582:ltt_update_event_size: assertion failed: (info !=
> > > NULL)
> > > /usr/bin/lttv: line 15:  2677 Aborted                 $0.real $*
> >
> >
> > I just looked at your trace5, and it looks like this one has empty
> > "metadata" channels too. So I am pretty confident that there is a
> > problem with 4k vs 64k alignment, but I cannot test the fix until we
> > find out why the metadata channel is empty.
> >
> > Can you give me the output of:
> >
> > # find /mnt/debugfs/ltt
> >
>  I attached the output mnt_debugfs_ltt_content because it is too long to
> list here
> 
> 
> >  ? (or with the appropriate location of debugfs on your system)
> >
> > Normally, we should have the metadata generated by events in:
> >
> > kernel/marker.c: ltt_dump_marker_state()
> >
> > So double-checking that this symbol is present in your kernel might
> > help. Also making sure that CONFIG_MARKERS is set would be good. To
> > confirm that ltt_dump_marker_state() is called, you could add some
> > printk() in that function to see if LTTng is calling it at it should.
> >
> 
> CONFIG_MARKERS is set. I added a printk to ltt_dump_marker_state and saw
> that the function is being called. This is the log from /var/log/messages:
> 
> Apr  2 20:06:37 elm17d191 kernel: LTT dump marker  state is called ### My
> printk statement #####
> Apr  2 20:06:37 elm17d191 kernel:
> Apr  2 20:06:37 elm17d191 kernel: LTT state dump begin
> Apr  2 20:06:37 elm17d191 kernel: LTT state dump thread start
> Apr  2 20:06:37 elm17d191 kernel: LTT state dump end
> Apr  2 20:07:34 elm17d191 kernel: LTT: 32 events written in channel
> softirq_state (cpu 0, index 0)
> Apr  2 20:07:34 elm17d191 kernel: LTT: 1 events written in channel
> global_state (cpu 0, index 0)
> Apr  2 20:07:34 elm17d191 kernel: LTT: 1890 events written in channel kernel
> (cpu 0, index 0)
> Apr  2 20:07:34 elm17d191 kernel: LTT: 525 events written in channel kernel
> (cpu 1, index 0)
> Apr  2 20:07:34 elm17d191 kernel: LTT: 32 events written in channel kernel
> (cpu 2, index 0)
> Apr  2 20:07:34 elm17d191 kernel: LTT: 48 events written in channel kernel
> (cpu 3, index 0)
> Apr  2 20:07:34 elm17d191 kernel: LTT: 7 events written in channel fs (cpu
> 1, index 0)
> Apr  2 20:07:34 elm17d191 kernel: LTT: 32 events written in channel block
> (cpu 0, index 0)
> Apr  2 20:07:34 elm17d191 kernel: LTT: 14 events written in channel block
> (cpu 2, index 0)
> 

Hrm, no event written in the metadata channel. Can you check if
CONFIG_IMMEDIATE is active ? If yes, try setting it to 'N'.

Thanks,

Mathieu

> 
> 
> >
> > Thanks,
> >
> > Mathieu
> >
> > >
> > > dany
> > >
> > >
> > > >
> > > > Thanks,
> > > >
> > > > Mathieu
> > > >
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Mathieu
> > > > >
> > > > > >
> > > > > > >
> > > > > > > Also, trying to enable "CONFIG_LTT_DEBUG_EVENT_SIZE" might help
> > by
> > > > > > > providing extra self-check on trace metadata consistency.
> > > > > > >
> > > > > > > Rebuilt the kernel with CONFIG_LTT_DEBUG_EVENT_SIZE=y ,
> > recollected
> > > > the
> > > > > > trace and viewed it. I still see the same error and did notice
> > anything
> > > > else
> > > > > > different than it was before.
> > > > > >
> > > > > >
> > > > > > > Thanks,
> > > > > > >
> > > > > > > Mathieu
> > > > > > >
> > > > > > > >
> > > > > > > > Dany Madden
> > > > > > >
> > > > > > > > _______________________________________________
> > > > > > > > ltt-dev mailing list
> > > > > > > > ltt-dev at lists.casi.polymtl.ca
> > > > > > > > http://lists.casi.polymtl.ca/cgi-bin/mailman/listinfo/ltt-dev
> > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > Mathieu Desnoyers
> > > > > > > Operating System Efficiency R&D Consultant
> > > > > > > EfficiOS Inc.
> > > > > > > http://www.efficios.com
> > > > > > >
> > > > >
> > > > > --
> > > > > Mathieu Desnoyers
> > > > > Operating System Efficiency R&D Consultant
> > > > > EfficiOS Inc.
> > > > > http://www.efficios.com
> > > >
> > > > --
> > > > Mathieu Desnoyers
> > > > Operating System Efficiency R&D Consultant
> > > > EfficiOS Inc.
> > > > http://www.efficios.com
> > > >
> >
> > > _______________________________________________
> > > ltt-dev mailing list
> > > ltt-dev at lists.casi.polymtl.ca
> > > http://lists.casi.polymtl.ca/cgi-bin/mailman/listinfo/ltt-dev
> >
> >
> > --
> > Mathieu Desnoyers
> > Operating System Efficiency R&D Consultant
> > EfficiOS Inc.
> > http://www.efficios.com
> >



-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com




More information about the lttng-dev mailing list