[ltt-dev] Kernel Oops during lttctl_start()

Mathieu Desnoyers compudj at krystal.dyndns.org
Mon Apr 18 21:51:18 EDT 2011


* Bernd Hufmann (Bernd.Hufmann at ericsson.com) wrote:
> Hello
>
> I was using the LTTng agent to control LTTng kernel tracing. Everything
> worked fine until I tried to disable certain channels (command
> setChannelEnable()). Interestingly, this command doesn't fail, but the
> subsequent command startTrace() does. I debugged the LTTng agent to
> investigate and was able to pinpoint the last line executed, which is
> in the library liblttctrl. There is a crash and the tcf-agent becomes
> a zombie process (defunct). When I display the kernel messages with
> dmesg, I see a kernel oops. After the error occurs, the remote machine
> has to be rebooted. I'm not able to debug further and need some
> support.
>
> Could someone please look into the problem? Please let me know if you  
> have any questions.
>
> Steps to reproduce:
>     - Enable all markers (use ltt-armall)
>     - Start tcf-client
>     - connect <remote>
>     - tcf ltt_control setupTrace "kernel" "0" "h2"
>     - tcf ltt_control setChannelEnable "kernel" "0" "h2" false
>     - tcf ltt_control allocTrace "kernel" "0" "h2"
>     - lttctl_client ltt_control writeTraceLocal "kernel" "0" "h2"  
> "/tmp/h2" 2 false false false
>     - tcf ltt_control startTrace "kernel" "0" "h2"
>
> Now the startTrace command doesn't return. It hangs on the line "if
> (write(fd, op, strlen(op)) == -1)" in function "lttctl_sendop()", which
> is called by "lttctl_start()" (see file liblttctrl.c). Please see the
> attached file "dmesg_start.log" for the dmesg output.
>
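
For readers following along, here is a minimal sketch of the kind of
call the report describes: writing a short operation string to a
control file and checking the result. This is not the liblttctrl code;
the name send_op_sketch(), the retry loop and the return convention are
assumptions made for illustration.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper (NOT the liblttctrl implementation): write the
 * string `op` to the already existing control file at `path`.
 * Returns 0 on success, or a negative errno value on failure.
 */
static int send_op_sketch(const char *path, const char *op)
{
    size_t len = strlen(op);
    size_t done = 0;
    int fd = open(path, O_WRONLY);

    if (fd == -1)
        return -errno;

    while (done < len) {
        ssize_t n = write(fd, op + done, len - done);

        if (n == -1) {
            int err = errno;

            if (err == EINTR)
                continue;       /* interrupted by a signal: retry */
            close(fd);
            return -err;
        }
        if (n == 0) {           /* no progress: give up instead of spinning */
            close(fd);
            return -EIO;
        }
        done += (size_t)n;      /* handle short writes */
    }

    if (close(fd) == -1)
        return -errno;
    return 0;
}
```

Note that no amount of user-space error checking helps in the reported
case: the call trace further down shows the oops occurring underneath
sys_write(), so write() never gets the chance to return an error to the
caller.
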
[...]
> [  416.740678] LTT : Tracing not active for trace h5
> [  416.740908] LTT state dump begin
> [  416.740931] LTT state dump thread start
> [  416.743693] LTT state dump end
> [  447.956279] LTT: 36 events written in channel metadata (cpu 0, index 0)
> [  447.956424] LTT: 54278 events written in channel fs (cpu 0, index 0)
> [  447.956468] LTT: 40989 events written in channel fs (cpu 0, index 1)
> [  618.382904] LTT : Tracing not active for trace h6
> [  618.383350] BUG: unable to handle kernel NULL pointer dereference at 00000014
> [  618.383983] IP: [<e080828f>] ltt_trace_start+0x7f/0x190 [ltt_tracer]

This looks like a bug in lttng-modules (the LTTng kernel modules),
probably in the way the current LTTng-stable versions handle the trace
session "templates". The good news is that we are getting rid of all
that code in the upcoming LTTng (with UST/ltt-sessiond/CTF
integration), so this bug is very likely to vanish then.

I'll try to have a look.

Thanks for reporting this.

Mathieu

> [  618.384608] *pde = 1ea70067 *pte = 00000000 
> [  618.384806] Oops: 0000 [#1] SMP 
> [  618.384869] last sysfs file: /sys/devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A03:00/PNP0C0A:00/power_supply/BAT0/voltage_now
> [  618.385050] Modules linked in: lp ltt_tracer serio_raw snd_seq_device joydev snd_page_alloc ltt_core net_trace hid soundcore vboxvideo ltt_marker_control i2c_piix4 vboxsf snd_rawmidi agpgart fs_trace e1000 libahci snd_timer ltt_trace_control binfmt_misc ipc_trace ahci psmouse snd_seq_midi usbhid ltt_filter jbd2_trace syscall_trace snd parport snd_seq kernel_trace trap_trace mm_trace ac97_bus snd_seq_midi_event ltt_statedump vboxguest ltt_relay parport_pc block_trace ppdev snd_pcm snd_ac97_codec snd_intel8x0 ltt_kprobes ltt_userspace_event rcu_trace drm
> [  618.386077] 
> [  618.386077] Pid: 1629, comm: tcf-agent Not tainted 2.6.35-24-lttng #37~lttng1-Ubuntu /VirtualBox
> [  618.386077] EIP: 0060:[<e080828f>] EFLAGS: 00010202 CPU: 0
> [  618.386077] EIP is at ltt_trace_start+0x7f/0x190 [ltt_tracer]
> [  618.386077] EAX: de1ec938 EBX: 00000002 ECX: 000000fc EDX: 00000000
> [  618.386077] ESI: 00000002 EDI: de1ec800 EBP: dea85f28 ESP: dea85f10
> [  618.386077]  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
> [  618.386077] Process tcf-agent (pid: 1629, ti=dea84000 task=dfbdcc20 task.ti=dea84000)
> [  618.386077] Stack:
> [  618.386077]  00000fff 00000001 df2a2400 00000001 00000000 dde23000 dea85f54 e088e87b
> [  618.386077] <0> dde23000 e088f0b9 de1ea000 def18f00 de1ea000 00000001 def18f00 00000001
> [  618.386077] <0> 006759a4 dea85f7c c021ce22 dea85f94 00000002 dea85f7c e088e7a0 00000008
> [  618.386077] Call Trace:
> [  618.386077]  [<e088e87b>] ? enabled_write+0xdb/0x164 [ltt_trace_control]
> [  618.386077]  [<c021ce22>] ? vfs_write+0xa2/0x190
> [  618.386077]  [<e088e7a0>] ? enabled_write+0x0/0x164 [ltt_trace_control]
> [  618.386077]  [<c021d7db>] ? sys_write+0x4b/0xc0
> [  618.386077]  [<c05cdf7c>] ? syscall_call+0x7/0xb
> [  618.386077] Code: 00 84 c0 0f 85 cd 00 00 00 8b 45 f0 8b 70 0c 8b 78 08 85 f6 74 21 31 c0 31 db 66 90 69 c0 38 01 00 00 83 c3 01 8d 04 07 8b 50 28 <8b> 52 14 ff 52 24 39 f3 89 d8 72 e5 8b 45 f0 c7 40 10 01 00 00 
> [  618.386077] EIP: [<e080828f>] ltt_trace_start+0x7f/0x190 [ltt_tracer] SS:ESP 0068:dea85f10
> [  618.386077] CR2: 0000000000000014
> [  618.400464] ---[ end trace 3bdb19f1a660263a ]---
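
As a reading aid, the register dump and the "Code:" line above can be
cross-checked with a little arithmetic. Hand-decoding the bytes around
the marked <8b> gives, as far as I can tell, "imul $0x138,%eax",
"lea (%edi,%eax),%eax", "mov 0x28(%eax),%edx", then the faulting
"mov 0x14(%edx),%edx". The sketch below only replays that arithmetic;
the helper names are made up for illustration, and the interpretation
(an array of 0x138-byte entries walked by index) is an assumption drawn
from the disassembly, not from the ltt_tracer source.

```c
#include <stdint.h>

/* Values copied from the oops register dump above. */
static const uint32_t OOPS_EAX = 0xde1ec938u;
static const uint32_t OOPS_EDI = 0xde1ec800u;
static const uint32_t OOPS_EDX = 0x00000000u;
static const uint32_t OOPS_CR2 = 0x00000014u;

/* Effective address of a [base + disp] memory operand on 32-bit x86. */
static uint32_t effective_address(uint32_t base, uint32_t disp)
{
    return base + disp;
}

/* Index of the array entry that `entry` points at, given the entry size. */
static uint32_t entry_index(uint32_t entry, uint32_t array_base,
                            uint32_t stride)
{
    return (entry - array_base) / stride;
}
```

With these values, effective_address(OOPS_EDX, 0x14) equals OOPS_CR2, so
the fault is a read through the NULL pointer that was just loaded from
offset 0x28 of the current entry. entry_index(OOPS_EAX, OOPS_EDI, 0x138)
is 1, which suggests the loop was on its second entry (EBX and ESI are
both 2) when it hit the NULL pointer.
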

-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com