[ltt-dev] Kernel Oops during lttctl_start()
Mathieu Desnoyers
compudj at krystal.dyndns.org
Mon Apr 18 21:51:18 EDT 2011
* Bernd Hufmann (Bernd.Hufmann at ericsson.com) wrote:
> Hello
>
> I was using the LTTng agent to control LTTng Kernel tracing. Everything
> worked fine until I tried to disable certain channels (command
> setChannelEnable()). Interestingly, this command doesn't fail; however,
> the later command startTrace() fails. I did some debugging of
> the LTTng agent to investigate the problem. I'm able to pinpoint the
> last line of execution, which is in the library liblttctrl. There is a
> crash and the tcf-agent becomes a zombie process (defunct). When I
> display kernel messages with the shell command dmesg I see a kernel
> oops. After the error occurs, the remote machine has to be rebooted.
> I'm not able to debug further and need some support.
>
> Could someone please look into the problem? Please let me know if you
> have any questions.
>
> Steps to reproduce:
> - Enable all markers (use ltt-armall)
> - Start tcf-client
> - connect <remote>
> - tcf ltt_control setupTrace "kernel" "0" "h2"
> - tcf ltt_control setChannelEnable "kernel" "0" "h2" false
> - tcf ltt_control allocTrace "kernel" "0" "h2"
> - lttctl_client ltt_control writeTraceLocal "kernel" "0" "h2"
> "/tmp/h2" 2 false false false
> - tcf ltt_control startTrace "kernel" "0" "h2"
>
> Now, the command startTrace won't return. It hangs on line "if
> (write(fd, op, strlen(op)) == -1)" of method "lttctl_sendop()" which was
> called by method "lttctl_start()" (see file liblttctrl.c). Please see
> attached file "dmesg_start.log" for the dmesg output.
>
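For readers without liblttctrl.c at hand, the hanging call has roughly the following shape. This is a minimal sketch, not the library's actual code: the function name send_op(), the control-file path and the op string supplied by the caller are assumptions made for illustration. The only parts taken from the report are the write(fd, op, strlen(op)) call itself and, from the call trace in the dmesg output, the fact that the write ends up in enabled_write() in ltt_trace_control.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical stand-in for lttctl_sendop(): write the operation string
 * `op` to the control file at `path`.
 * Returns 0 on success, -1 on error.
 */
static int send_op(const char *path, const char *op)
{
    int fd;

    assert(path != NULL && op != NULL);

    fd = open(path, O_WRONLY);
    if (fd == -1) {
        perror("open");
        return -1;
    }

    /* Same form as the call the report says never returns. */
    if (write(fd, op, strlen(op)) == -1) {
        perror("write");
        close(fd);
        return -1;
    }

    return close(fd);
}
```

If the kernel oopses while servicing the write() system call, the caller never receives a return value or an errno, which would be consistent with startTrace not returning and the tcf-agent process ending up defunct.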
[...]
> [ 416.740678] LTT : Tracing not active for trace h5
> [ 416.740908] LTT state dump begin
> [ 416.740931] LTT state dump thread start
> [ 416.743693] LTT state dump end
> [ 447.956279] LTT: 36 events written in channel metadata (cpu 0, index 0)
> [ 447.956424] LTT: 54278 events written in channel fs (cpu 0, index 0)
> [ 447.956468] LTT: 40989 events written in channel fs (cpu 0, index 1)
> [ 618.382904] LTT : Tracing not active for trace h6
> [ 618.383350] BUG: unable to handle kernel NULL pointer dereference at 00000014
> [ 618.383983] IP: [<e080828f>] ltt_trace_start+0x7f/0x190 [ltt_tracer]

This looks like a bug in lttng-modules (the LTTng kernel modules),
probably in the way the current LTTng-stable versions handle the
trace session "templates". The good news is that we are getting rid of
all that code in the upcoming LTTng (with UST/ltt-sessiond/CTF
integration), so this bug is very likely to vanish then.

I'll try to have a look.

Thanks for reporting this.

Mathieu

> [ 618.384608] *pde = 1ea70067 *pte = 00000000
> [ 618.384806] Oops: 0000 [#1] SMP
> [ 618.384869] last sysfs file: /sys/devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A03:00/PNP0C0A:00/power_supply/BAT0/voltage_now
> [ 618.385050] Modules linked in: lp ltt_tracer serio_raw snd_seq_device joydev snd_page_alloc ltt_core net_trace hid soundcore vboxvideo ltt_marker_control i2c_piix4 vboxsf snd_rawmidi agpgart fs_trace e1000 libahci snd_timer ltt_trace_control binfmt_misc ipc_trace ahci psmouse snd_seq_midi usbhid ltt_filter jbd2_trace syscall_trace snd parport snd_seq kernel_trace trap_trace mm_trace ac97_bus snd_seq_midi_event ltt_statedump vboxguest ltt_relay parport_pc block_trace ppdev snd_pcm snd_ac97_codec snd_intel8x0 ltt_kprobes ltt_userspace_event rcu_trace drm
> [ 618.386077]
> [ 618.386077] Pid: 1629, comm: tcf-agent Not tainted 2.6.35-24-lttng #37~lttng1-Ubuntu /VirtualBox
> [ 618.386077] EIP: 0060:[<e080828f>] EFLAGS: 00010202 CPU: 0
> [ 618.386077] EIP is at ltt_trace_start+0x7f/0x190 [ltt_tracer]
> [ 618.386077] EAX: de1ec938 EBX: 00000002 ECX: 000000fc EDX: 00000000
> [ 618.386077] ESI: 00000002 EDI: de1ec800 EBP: dea85f28 ESP: dea85f10
> [ 618.386077] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
> [ 618.386077] Process tcf-agent (pid: 1629, ti=dea84000 task=dfbdcc20 task.ti=dea84000)
> [ 618.386077] Stack:
> [ 618.386077] 00000fff 00000001 df2a2400 00000001 00000000 dde23000 dea85f54 e088e87b
> [ 618.386077] <0> dde23000 e088f0b9 de1ea000 def18f00 de1ea000 00000001 def18f00 00000001
> [ 618.386077] <0> 006759a4 dea85f7c c021ce22 dea85f94 00000002 dea85f7c e088e7a0 00000008
> [ 618.386077] Call Trace:
> [ 618.386077] [<e088e87b>] ? enabled_write+0xdb/0x164 [ltt_trace_control]
> [ 618.386077] [<c021ce22>] ? vfs_write+0xa2/0x190
> [ 618.386077] [<e088e7a0>] ? enabled_write+0x0/0x164 [ltt_trace_control]
> [ 618.386077] [<c021d7db>] ? sys_write+0x4b/0xc0
> [ 618.386077] [<c05cdf7c>] ? syscall_call+0x7/0xb
> [ 618.386077] Code: 00 84 c0 0f 85 cd 00 00 00 8b 45 f0 8b 70 0c 8b 78 08 85 f6 74 21 31 c0 31 db 66 90 69 c0 38 01 00 00 83 c3 01 8d 04 07 8b 50 28 <8b> 52 14 ff 52 24 39 f3 89 d8 72 e5 8b 45 f0 c7 40 10 01 00 00
> [ 618.386077] EIP: [<e080828f>] ltt_trace_start+0x7f/0x190 [ltt_tracer] SS:ESP 0068:dea85f10
> [ 618.386077] CR2: 0000000000000014
> [ 618.400464] ---[ end trace 3bdb19f1a660263a ]---
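
The register dump and the Code: bytes can be cross-checked with a little arithmetic. Read as 32-bit x86, the bytes around the faulting instruction appear to be a loop that scales an index by 0x138 (imul), adds it to the base in EDI (lea), loads a pointer from offset 0x28 of that element into EDX, and then reads offset 0x14 through EDX, which is where the fault occurs. The sketch below only redoes that arithmetic from the logged values. The macro and function names are made up for illustration, and the idea that the array elements are channels of the trace is an assumption that has not been checked against the ltt_tracer source.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Values copied from the register dump in the oops. */
#define OOPS_EAX 0xde1ec938u /* address of the element being visited */
#define OOPS_EDI 0xde1ec800u /* base address of the array */
#define OOPS_EDX 0x00000000u /* pointer loaded from element + 0x28 */
#define OOPS_ESI 2u          /* loop bound (number of elements) */
#define OOPS_CR2 0x14u       /* faulting address reported by the CPU */

/* Constants read from the Code: bytes. */
#define ELEM_SIZE 0x138u     /* 69 c0 38 01 00 00: imul $0x138,%eax,%eax */
#define FIELD_OFF 0x14u      /* 8b 52 14: mov 0x14(%edx),%edx (faulting) */

/* Index of the array element that was being processed at the fault. */
static uint32_t faulting_index(void)
{
    return (OOPS_EAX - OOPS_EDI) / ELEM_SIZE;
}

/* Address the faulting instruction tried to read. */
static uint32_t fault_address(void)
{
    return OOPS_EDX + FIELD_OFF;
}

/* Print a short summary of the cross-check. */
static void print_summary(void)
{
    printf("element %u of %u, fault at 0x%x (CR2 = 0x%x)\n",
           (unsigned)faulting_index(), (unsigned)OOPS_ESI,
           (unsigned)fault_address(), (unsigned)OOPS_CR2);
}
```

Under this reading, the fault happens on the element at index 1 of 2, and a NULL pointer plus offset 0x14 gives exactly the address in CR2. If the elements are indeed channels, that would point at a per-channel pointer left NULL, possibly for the channel that was disabled with setChannelEnable, but that last step is a guess.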
--
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com