[ltt-dev] lttv crashed in tracefile.c:1582

Mathieu Desnoyers compudj at krystal.dyndns.org
Fri Apr 2 20:33:12 EDT 2010


* Dany Madden (dany4madden at gmail.com) wrote:
> On Fri, Apr 2, 2010 at 1:49 PM, Mathieu Desnoyers <
> compudj at krystal.dyndns.org> wrote:
> 
> > * Dany Madden (dany4madden at gmail.com) wrote:
> > > On Fri, Apr 2, 2010 at 7:34 AM, Mathieu Desnoyers <
> > > compudj at krystal.dyndns.org> wrote:
> > >
> > > > * Dany Madden (dany4madden at gmail.com) wrote:
> > > > > On Wed, Mar 31, 2010 at 1:56 PM, Mathieu Desnoyers <
> > > > > compudj at krystal.dyndns.org> wrote:
> > > > >
> > > > > > * Mathieu Desnoyers (compudj at krystal.dyndns.org) wrote:
> > > > > > > * Dany Madden (dany4madden at gmail.com) wrote:
> > > > > > > > On Wed, Mar 31, 2010 at 10:24 AM, Mathieu Desnoyers <
> > > > > > > > compudj at krystal.dyndns.org> wrote:
> > > > > > > >
> > > > > > > > > * Dany Madden (dany4madden at gmail.com) wrote:
> > > > > > > > > > Hello,
> > > > > > > > > >
> > > > > > > > > > Please take a look at this lttv error with the mainline
> > kernel
> > > > > > > > > > 2.6.32.9 on ppc64.
> > > > > > > > > >
> > > > > > > > > > lttng: patch-2.6.32.9-lttng-0.198
> > > > > > > > > > lttv-0.12.30-02102010
> > > > > > > > > > ltt-control-0.79-01022010
> > > > > > > > > >
> > > > > > > > > > LTTV crashed with this error:
> > > > > > > > > >
> > > > > > > > > > ERROR:tracefile.c:1582:ltt_update_event_size: assertion
> > failed:
> > > > > > (info !=
> > > > > > > > > NULL)
> > > > > > > > > > /usr/bin/lttv: line 15:  2696 Aborted
> > $0.real
> > > > $*
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > CONFIG_LTT=y
> > > > > > > > > > CONFIG_LTT_FILTER=m
> > > > > > > > > > # CONFIG_HAVE_LTT_DUMP_TABLES is not set
> > > > > > > > > > CONFIG_LTT_RELAY=y
> > > > > > > > > > CONFIG_LTT_RELAY_LOCKLESS=y
> > > > > > > > > > # CONFIG_LTT_RELAY_IRQOFF is not set
> > > > > > > > > > # CONFIG_LTT_RELAY_LOCKED is not set
> > > > > > > > > > CONFIG_LTT_SERIALIZE=y
> > > > > > > > > > CONFIG_LTT_FAST_SERIALIZE=y
> > > > > > > > > > CONFIG_LTT_TRACEPROBES=m
> > > > > > > > > > CONFIG_LTT_TRACE_CONTROL=m
> > > > > > > > > > CONFIG_LTT_TRACER=y
> > > > > > > > > > # CONFIG_LTT_ALIGNMENT is not set
> > > > > > > > > > CONFIG_LTT_CHECK_ARCH_EFFICIENT_UNALIGNED_ACCESS=y
> > > > > > > > > > # CONFIG_LTT_DEBUG_EVENT_SIZE is not set
> > > > > > > > > > CONFIG_LTT_USERSPACE_EVENT=m
> > > > > > > > > > CONFIG_LTT_VMCORE=y
> > > > > > > > > > CONFIG_LTT_STATEDUMP=m
> > > > > > > > > > CONFIG_LTT_KPROBES=m
> > > > > > > > > > # CONFIG_LTT_ASCII is not set
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > > modprobe ltt-trace-control
> > > > > > > > > > > modprobe ipc-trace
> > > > > > > > > > > modprobe kernel-trace
> > > > > > > > > > > modprobe mm-trace
> > > > > > > > > > > modprobe fs-trace
> > > > > > > > > > > modprobe syscall-trace
> > > > > > > > > > > modprobe trap-trace
> > > > > > > > > >
> > > > > > > > > > > ltt-armall
> > > > > > > > > > > lttctl -C -w /tmp/trace1 trace1
> > > > > > > > > > > lttctl -D trace1
> > > > > > > > > > > lttv -m textDump -t /tmp/trace1
> > > > > > > > > > **
> > > > > > > > > > ERROR:tracefile.c:1582:ltt_update_event_size: assertion
> > failed:
> > > > > > (info !=
> > > > > > > > > NULL)
> > > > > > > > > > /usr/bin/lttv: line 15:  2696 Aborted
> > $0.real
> > > > $*
> > > > > > > > > >
> > > > > > > > > > This post,
> > > > > > > > >
> > > > > >
> > > >
> > http://lists.casi.polymtl.ca/pipermail/ltt-dev/2010-February/002189.html,
> > > > > > > > > > describes the same error on 32 bit system. By not having
> > the
> > > > > > > > > > LTT_ALIGNMENT set, shouldn't it just work in my case?
> > > > > > > > > >
> > > > > > > > > > Thanks in advance for your help!
> > > > > > > > >
> > > > > > > > > A few questions:
> > > > > > > > >
> > > > > > > > > - the trace is taken on a ppc64 machine. Is is viewed on a
> > ppc64
> > > > too,
> > > > > > or
> > > > > > > > >  on a different arch size/endianness ?
> > > > > > > > >
> > > > > > > >
> > > > > > > > Trace collected and viewed on the same ppc64 system.
> > > > > > > >
> > > > > > > >
> > > > > > > > >
> > > > > > > > > - Can you provide some information about the content of the
> > trace
> > > > > > > > >  directory ? (files, file size)
> > > > > > > > >
> > > > > > > >
> > > > > > > > The files in the trace directory are all the same size??
> > > > > > > >
> > > > > > > > > /tmp/trace3 # ls -l
> > > > > > > > total 2448
> > > > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 fs_0
> > > > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 fs_1
> > > > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 fs_2
> > > > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 fs_3
> > > > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 input_0
> > > > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 input_1
> > > > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 input_2
> > > > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 input_3
> > > > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 ipc_0
> > > > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 ipc_1
> > > > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 ipc_2
> > > > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 ipc_3
> > > > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 kernel_0
> > > > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 kernel_1
> > > > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 kernel_2
> > > > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 kernel_3
> > > > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 metadata_0
> > > > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 metadata_1
> > > > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 metadata_2
> > > > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 metadata_3
> > > > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 mm_0
> > > > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 mm_1
> > > > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 mm_2
> > > > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 mm_3
> > > > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 module_state_0
> > > > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 module_state_1
> > > > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 module_state_2
> > > > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 module_state_3
> > > > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 softirq_state_0
> > > > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 softirq_state_1
> > > > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 softirq_state_2
> > > > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 softirq_state_3
> > > > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 swap_state_0
> > > > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 swap_state_1
> > > > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 swap_state_2
> > > > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 swap_state_3
> > > > > > > >
> > > > > > >
> > > > > > > Oh, that's a pretty good hint. Although I tried to make
> > everything as
> > > > > > > generic as possible, I think LTTV assumes that the page size is
> > 4k.
> > > > > > > However, recent ppc64 kernels have a 64k page size. So I'm fairly
> > > > sure
> > > > > > > that this is where the problem comes from.
> > > > > > >
> > > > > > > Can you send me (privately) a tarball of your trace so I can see
> > what
> > > > I
> > > > > > > can do with LTTV to support it ?
> > > > > >
> > > > > > OK, got it. Well, your metadata channels are all empty (looked with
> > > > > > hexedit). So clearly LTTV cannot make sense of the data in the
> > other
> > > > > > channels.
> > > > > >
> > > > > > So let's try the simple solutions first: try to load all the
> > > > > > LTTng-related modules you have (I see that a few were missing from
> > your
> > > > > > list), and only then run ltt-armall. If that does not work, try to
> > > > > > compile LTTng as "built in".
> > > > > >
> > > > >
> > > > > Tried building ltt into the kernel and still seeing the assertion
> > > > failed...
> > > > > > lttv -m textDump -t /tmp/trace7
> > > > > **
> > > > > ERROR:tracefile.c:1582:ltt_update_event_size: assertion failed: (info
> > !=
> > > > > NULL)
> > > > > /usr/bin/lttv: line 15:  2677 Aborted                 $0.real $*
> > > >
> > > >
> > > > I just looked at your trace5, and it looks like this one has empty
> > > > "metadata" channels too. So I am pretty confident that there is a
> > > > problem with 4k vs 64k alignment, but I cannot test the fix until we
> > > > find out why the metadata channel is empty.
> > > >
> > > > Can you give me the output of:
> > > >
> > > > # find /mnt/debugfs/ltt
> > > >
> > >  I attached the output mnt_debugfs_ltt_content because it is too long to
> > > list here
> > >
> > >
> > > >  ? (or with the appropriate location of debugfs on your system)
> > > >
> > > > Normally, we should have the metadata generated by events in:
> > > >
> > > > kernel/marker.c: ltt_dump_marker_state()
> > > >
> > > > So double-checking that this symbol is present in your kernel might
> > > > help. Also making sure that CONFIG_MARKERS is set would be good. To
> > > > confirm that ltt_dump_marker_state() is called, you could add some
> > > > printk() in that function to see if LTTng is calling it at it should.
> > > >
> > >
> > > CONFIG_MARKERS is set. I added a printk to ltt_dump_marker_state and saw
> > > that the function is being called. This is the log from
> > /var/log/messages:
> > >
> > > Apr  2 20:06:37 elm17d191 kernel: LTT dump marker  state is called ### My
> > > printk statement #####
> > > Apr  2 20:06:37 elm17d191 kernel:
> > > Apr  2 20:06:37 elm17d191 kernel: LTT state dump begin
> > > Apr  2 20:06:37 elm17d191 kernel: LTT state dump thread start
> > > Apr  2 20:06:37 elm17d191 kernel: LTT state dump end
> > > Apr  2 20:07:34 elm17d191 kernel: LTT: 32 events written in channel
> > > softirq_state (cpu 0, index 0)
> > > Apr  2 20:07:34 elm17d191 kernel: LTT: 1 events written in channel
> > > global_state (cpu 0, index 0)
> > > Apr  2 20:07:34 elm17d191 kernel: LTT: 1890 events written in channel
> > kernel
> > > (cpu 0, index 0)
> > > Apr  2 20:07:34 elm17d191 kernel: LTT: 525 events written in channel
> > kernel
> > > (cpu 1, index 0)
> > > Apr  2 20:07:34 elm17d191 kernel: LTT: 32 events written in channel
> > kernel
> > > (cpu 2, index 0)
> > > Apr  2 20:07:34 elm17d191 kernel: LTT: 48 events written in channel
> > kernel
> > > (cpu 3, index 0)
> > > Apr  2 20:07:34 elm17d191 kernel: LTT: 7 events written in channel fs
> > (cpu
> > > 1, index 0)
> > > Apr  2 20:07:34 elm17d191 kernel: LTT: 32 events written in channel block
> > > (cpu 0, index 0)
> > > Apr  2 20:07:34 elm17d191 kernel: LTT: 14 events written in channel block
> > > (cpu 2, index 0)
> > >
> >
> > Hrm, no event written in the metadata channel. Can you check if
> > CONFIG_IMMEDIATE is active ? If yes, try setting it to 'N'.
> >
> >
> Mathieu,
> 
> CONFIG_IMMEDIATE is set to y by default. I changed it to 'N' as you
> suggested and I do get the textDump now. I will need to check to see if
> disabling this CONFIG is acceptable.

Immediate values are a slight optimization over the "read, test and
branch" done by tracepoints by default. Newer gcc 4.5 will support asm
gotos, and so will the mainline kernel. It replaces the immediate
values. So, eventually, even though immediate values have been really
useful to show the merit of static jump patching (they served as
prototype for the asm goto requirements), they will be eventually
phased-out.

So I would tend to simply mark them as "BROKEN" in the next release for
powerpc rather than spend time trying to find out what subtle compiler
breakage we have here.

Thanks,

Mathieu

> 
> Thanks for the help!
> Dany
> 
> 
> > Thanks,
> >
> > Mathieu
> >
> > >
> > >
> > > >
> > > > Thanks,
> > > >
> > > > Mathieu
> > > >
> > > > >
> > > > > dany
> > > > >
> > > > >
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Mathieu
> > > > > >
> > > > > > >
> > > > > > > Thanks,
> > > > > > >
> > > > > > > Mathieu
> > > > > > >
> > > > > > > >
> > > > > > > > >
> > > > > > > > > Also, trying to enable "CONFIG_LTT_DEBUG_EVENT_SIZE" might
> > help
> > > > by
> > > > > > > > > providing extra self-check on trace metadata consistency.
> > > > > > > > >
> > > > > > > > > Rebuilt the kernel with CONFIG_LTT_DEBUG_EVENT_SIZE=y ,
> > > > recollected
> > > > > > the
> > > > > > > > trace and viewed it. I still see the same error and did notice
> > > > anything
> > > > > > else
> > > > > > > > different than it was before.
> > > > > > > >
> > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > >
> > > > > > > > > Mathieu
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Dany Madden
> > > > > > > > >
> > > > > > > > > > _______________________________________________
> > > > > > > > > > ltt-dev mailing list
> > > > > > > > > > ltt-dev at lists.casi.polymtl.ca
> > > > > > > > > >
> > http://lists.casi.polymtl.ca/cgi-bin/mailman/listinfo/ltt-dev
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > Mathieu Desnoyers
> > > > > > > > > Operating System Efficiency R&D Consultant
> > > > > > > > > EfficiOS Inc.
> > > > > > > > > http://www.efficios.com
> > > > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > Mathieu Desnoyers
> > > > > > > Operating System Efficiency R&D Consultant
> > > > > > > EfficiOS Inc.
> > > > > > > http://www.efficios.com
> > > > > >
> > > > > > --
> > > > > > Mathieu Desnoyers
> > > > > > Operating System Efficiency R&D Consultant
> > > > > > EfficiOS Inc.
> > > > > > http://www.efficios.com
> > > > > >
> > > >
> > > > > _______________________________________________
> > > > > ltt-dev mailing list
> > > > > ltt-dev at lists.casi.polymtl.ca
> > > > > http://lists.casi.polymtl.ca/cgi-bin/mailman/listinfo/ltt-dev
> > > >
> > > >
> > > > --
> > > > Mathieu Desnoyers
> > > > Operating System Efficiency R&D Consultant
> > > > EfficiOS Inc.
> > > > http://www.efficios.com
> > > >
> >
> >
> >
> > --
> > Mathieu Desnoyers
> > Operating System Efficiency R&D Consultant
> > EfficiOS Inc.
> > http://www.efficios.com
> >

> _______________________________________________
> ltt-dev mailing list
> ltt-dev at lists.casi.polymtl.ca
> http://lists.casi.polymtl.ca/cgi-bin/mailman/listinfo/ltt-dev


-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com




More information about the lttng-dev mailing list