[lttng-dev] Babeltrace Buffer Warning
Salman Rafiq
salman.rafiq at esk.fraunhofer.de
Wed Apr 18 11:25:05 EDT 2012
Hello Mathieu,
By looking into dmesg, I am not sure if it was page allocator bug or out-of-memory at that particular time. Does dmesg holds kernel messages from previous boot?
Because the system was rebooted after the error occurred. Anyway I have attached the output of dmesg.
Below is the rest of the information you asked for:
Architecture: x86_64
Kernel Version: 3.0.0
Kernel .Config output also attached.
Regards,
S.
--
Salman Rafiq
Industrial Communication
Fraunhofer-Einrichtung für Systeme der Kommunikationstechnik ESK
Hansastraße 32 | 80686 München
Telefon, Fax: +49 89 547088-356 | +49 89 547088-66-356
E-Mail: salman.rafiq at esk.fraunhofer.de
Internet:
http://www.esk.fraunhofer.de
http://www.facebook.com/FraunhoferESK
http://www.twitter.com/FraunhoferESK
-----Ursprüngliche Nachricht-----
Von: Mathieu Desnoyers [mailto:mathieu.desnoyers at efficios.com]
Gesendet: Wednesday, April 18, 2012 3:55 PM
An: Salman Rafiq
Cc: Oestman, Fredrik; lttng-dev at lists.lttng.org
Betreff: Re: AW: [lttng-dev] Babeltrace Buffer Warning
* Salman Rafiq (salman.rafiq at esk.fraunhofer.de) wrote:
> Hello All,
>
> Increasing the number of buffers and buffer size did helped indeed. Thanks to both for the advice.
>
> Another question is, what is the maximum --subbuf-size & --num-subbuf one can go? I guess it should depend upon your memory size?
>
> I am asking this because I had trouble last time, where my system crashed with some "general protection fault" while I had --subbuf-size of 2^29 . Attached are the syslog messages.
Trying it on my x86_64 machine:
lttng create
lttng enable-channel test --subbuf-size 536870912 -k
PERROR: ioctl kernel create channel: Invalid argument [in kernel_create_channel() at kernel.c:156]
Error: Channel test: Kernel create channel failed (session auto-20120418-095035)
Warning: Some command(s) went wrong
dmesg tells me:
[5408863.022967] lttng-sessiond: page allocation failure: order:10, mode:0xc0d0
[5408863.044885] Pid: 24301, comm: lttng-sessiond Tainted: G O 3.2.6-trace-test #129
[5408863.070423] Call Trace:
[5408863.078337] [<ffffffff810d1be6>] warn_alloc_failed+0xf6/0x150 [5408863.096397] [<ffffffff810d48a0>] ? page_alloc_cpu_notify+0x60/0x60 [5408863.115795] [<ffffffff810d4061>] __alloc_pages_nodemask+0x4d1/0x6d0
[5408863.135420] [<ffffffff810d4277>] __get_free_pages+0x17/0x50 [5408863.152971] [<ffffffff8110919f>] kmalloc_order_trace+0x3f/0x100 [5408863.171564] [<ffffffff81109d8e>] ? __kmalloc+0x1de/0x210 [5408863.188322] [<ffffffff81109d8e>] __kmalloc+0x1de/0x210 [5408863.204577] [<ffffffffa3991cc6>] lib_ring_buffer_backend_create+0x1b6/0x500 [lttng_lib_ring_buffer] [5408863.232493] [<ffffffff810e9dbb>] ? pcpu_next_pop+0x4b/0x70 [5408863.249776] [<ffffffffa3994f6d>] lib_ring_buffer_create+0xad/0x2e0 [lttng_lib_ring_buffer] [5408863.275386] [<ffffffffa3992442>] channel_backend_init+0x232/0x430 [lttng_lib_ring_buffer] [5408863.300738] [<ffffffffa3993184>] channel_create+0x94/0x220 [lttng_lib_ring_buffer] [5408863.324265] [<ffffffff81109693>] ? kmem_cache_alloc_trace+0x133/0x150
[5408863.344374] [<ffffffffa002d723>] _channel_create+0x33/0x40 [lttng_ring_buffer_client_discard]
[5408863.370723] [<ffffffffa3a077aa>] lttng_channel_create+0x14a/0x2c0 [lttng_tracer] [5408863.393678] [<ffffffffa3a0912f>] lttng_abi_create_channel+0xcf/0x220 [lttng_tracer] [5408863.417413] [<ffffffffa3a0985a>] lttng_session_ioctl+0x6a/0xa0 [lttng_tracer] [5408863.439589] [<ffffffff8111eddf>] do_vfs_ioctl+0x8f/0x500 [5408863.456299] [<ffffffff8110e4b2>] ? fget_light+0x92/0x100 [5408863.473019] [<ffffffff8111f2e1>] sys_ioctl+0x91/0xa0 [5408863.488699] [<ffffffff815890eb>] system_call_fastpath+0x16/0x1b
This is on a 3.2.6 kernel.
Which architecture and kernel version are you using ? You might be hitting a kernel page allocator bug. Also, what is the content of your kernel .config ?
Thanks,
Mathieu
>
>
> Regards,
> S.
>
> --
> Salman Rafiq
> Industrial Communication
> Fraunhofer-Einrichtung für Systeme der Kommunikationstechnik ESK
>
> Hansastraße 32 | 80686 München
> Telefon, Fax: +49 89 547088-356 | +49 89 547088-66-356
> E-Mail: salman.rafiq at esk.fraunhofer.de
>
> Internet:
> http://www.esk.fraunhofer.de
> http://www.facebook.com/FraunhoferESK
> http://www.twitter.com/FraunhoferESK
>
>
> -----Ursprüngliche Nachricht-----
> Von: Mathieu Desnoyers [mailto:mathieu.desnoyers at efficios.com]
> Gesendet: Monday, April 16, 2012 4:33 PM
> An: Oestman, Fredrik
> Cc: lttng-dev at lists.lttng.org; Salman Rafiq
> Betreff: Re: [lttng-dev] Babeltrace Buffer Warning
>
> * Oestman, Fredrik (Fredrik_Oestman at mentor.com) wrote:
> > Mathieu Desnoyers wrote:
> > > The warning message is:
> > >
> > > "[warning] Tracer discarded 280625 events between
> > > [1334315833.578351009] and [1334315833.578351380]. You should
> > > consider increasing the buffer size."
> > >
> > > which states that the tracer did discard the events (not the
> > > converter).
> > > But maybe I'm missing your point. What would you recommend to make
> > > this message clearer ?
> >
> > I actually had no recommendation or suggestion in mind. To me the message is clear. I just got the impression that the OP may have been confused in this respect for some reason.
> >
> > But since you ask, what about something along the lines of:
> >
> > "[warning] The tracer discarded 280625 events between
> > [1334315833.578351009] and [1334315833.578351380]. You should
> > consider making a new trace with larger buffers or with fewer event types enabled."
>
> Good idea. Fixed in commit:
>
> commit a0b34fbba498a8a447357adf5920ac2c8a22f026
> Author: Mathieu Desnoyers <mathieu.desnoyers at efficios.com>
> Date: Mon Apr 16 10:33:55 2012 -0400
>
> Clarify warning message for events discarded
>
> Reported-by: Fredrik Oestman <Fredrik_Oestman at mentor.com>
> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers at efficios.com>
>
> Thanks!
>
> Mathieu
>
>
> >
> >
> > Cheers,
> >
> > Fredrik Östman
> >
>
> --
> Mathieu Desnoyers
> Operating System Efficiency R&D Consultant EfficiOS Inc.
> http://www.efficios.com
Content-Description: command_line.txt
> Message from syslogd at oje at Apr 16 17:29:27 ...
> kernel:[2777781.860828] general protection fault: 0000 [#1] SMP
>
> Message from syslogd at oje at Apr 16 17:29:27 ...
> kernel:[2777781.864740] Stack:
>
> Message from syslogd at oje at Apr 16 17:29:27 ...
> kernel:[2777781.864740] Call Trace:
>
> Message from syslogd at oje at Apr 16 17:29:27 ...
> kernel:[2777781.864740] Code: 48 8b 7b 08 48 89 c5 48 8d 43 08 48 39
> c7 75 05 ff 43 04 eb 05 e8 10 e1 27 00 48 89 df 48 89 ee 5b 5b 5d e9
> b5 e6 27 00 90 eb 0c <8b> 50 10 39 56 10 7f 0c 48 8d 78 08 48 8b 07 48
> 85 c0 75 ec 48
Content-Description: syslogd.txt
> Apr 16 17:29:27 oje kernel: [2777781.864435] CPU 4 Apr 16 17:29:27 oje
> kernel: [2777781.864466] Modules linked in: lttng_probe_timer(O)
> lttng_probe_statedump(O) lttng_probe_signal(O) lttng_probe_sched(O)
> lttng_probe_kvm(O) lttng_probe_irq(O) lttng_probe_block(O) lttng_types(O) lttng_probe_lttng(O) lttng_ring_buffer_metadata_mmap_client(O) lttng_ring_buffer_client_mmap_overwrite(O) lttng_ring_buffer_client_mmap_discard(O) lttng_ring_buffer_metadata_client(O) lttng_ring_buffer_client_overwrite(O) lttng_ring_buffer_client_discard(O) lttng_tracer(O) lttng_kretprobes(O) lttng_kprobes(O) lttng_statedump(O) lttng_lib_ring_buffer(O) bnep rfcomm bluetooth rfkill crc16 parport_pc ppdev lp parport binfmt_misc fuse nfsd nfs lockd fscache auth_rpcgss nfs_acl sunrpc loop radeon joydev ttm drm_kms_helper drm edac_core shpchp nv_tco psmouse i2c_nforce2 i2c_algo_bit power_supply pci_hotplug pcspkr edac_mce_amd i2c_core evdev k10temp serio_raw ext3 jbd mbcache sg sr_mod sd_mod cdrom crc_t10dif usbhid hid ata_generic pata_amd ohci_hcd sata_nv libata ehci_hcd igb e1000 scsi_mod Apr 16 17:29:27 oje kernel: usbcore dca [last unloaded: lttng_kretprobes] Apr 16 17:29:27 oje kernel: [2777781.864740]
> Apr 16 17:29:27 oje kernel: [2777781.864740] Pid: 5746, comm: lttng-sessiond Tainted: G W O 3.0.0opteron #4 Supermicro H8QM3/H8QM3
> Apr 16 17:29:27 oje kernel: [2777781.864740] RIP:
> 0010:[<ffffffff8105ac32>] [<ffffffff8105ac32>]
> notifier_chain_register+0x2/0x20 Apr 16 17:29:27 oje kernel:
> [2777781.864740] RSP: 0018:ffff880077083bc0 EFLAGS: 00010282 Apr 16
> 17:29:27 oje kernel: [2777781.864740] RAX: a8c00e1ea8c0f2e6 RBX:
> ffff8800bb113c50 RCX: 0000000000000010 Apr 16 17:29:27 oje kernel:
> [2777781.864740] RDX: 0000000001082e40 RSI: ffff8800bb113c50 RDI:
> ffff880030ffb858 Apr 16 17:29:27 oje kernel: [2777781.864740] RBP:
> ffffffffa03b5270 R08: 0000000000000200 R09: ffffffff8167eff0 Apr 16
> 17:29:27 oje kernel: [2777781.864740] R10: 00000000000000c2 R11:
> 00000000000000c2 R12: 00000000ffffffff Apr 16 17:29:27 oje kernel:
> [2777781.864740] R13: ffff8801273fbc80 R14: 0000000010000000 R15:
> 0000000000000010 Apr 16 17:29:27 oje kernel: [2777781.864740] FS:
> 00007f7bd96a9700(0000) GS:ffff88012fc00000(0000)
> knlGS:0000000000000000 Apr 16 17:29:27 oje kernel: [2777781.864740]
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 Apr 16 17:29:27 oje
> kernel: [2777781.864740] CR2: 00000000023332b0 CR3: 000000012d04b000
> CR4: 00000000000006e0 Apr 16 17:29:27 oje kernel: [2777781.864740]
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 Apr
> 16 17:29:27 oje kernel: [2777781.864740] DR3: 0000000000000000 DR6:
> 00000000ffff0ff0 DR7: 0000000000000400 Apr 16 17:29:27 oje kernel:
> [2777781.864740] Process lttng-sessiond (pid: 5746, threadinfo
> ffff880077082000, task ffff8800772dc770) Apr 16 17:29:27 oje kernel:
> [2777781.864740] ffffffff812c0ee0 ffff8801273fbc80 0000000010000000
> ffff8800bb113c10 Apr 16 17:29:27 oje kernel: [2777781.864740]
> ffffffffa03a00e3 0000000000000246 0000001010000000 ffff8801273fbc80
> Apr 16 17:29:27 oje kernel: [2777781.864740] ffff8800bb113c00
> ffffffffa03b5270 ffffffffa0411188 ffff8800bb113c10 Apr 16 17:29:27 oje
> kernel: [2777781.864740] [<ffffffff812c0ee0>] ?
> register_cpu_notifier+0x1c/0x2f Apr 16 17:29:27 oje kernel:
> [2777781.864740] [<ffffffffa03a00e3>] ?
> channel_backend_init+0x189/0x2a0 [lttng_lib_ring_buffer] Apr 16
> 17:29:27 oje kernel: [2777781.864740] [<ffffffffa03a06ac>] ?
> channel_create+0x76/0x1ce [lttng_lib_ring_buffer] Apr 16 17:29:27 oje
> kernel: [2777781.864740] [<ffffffffa03b44ac>] ?
> _channel_create+0x2b/0x30 [lttng_ring_buffer_client_discard]
> Apr 16 17:29:27 oje kernel: [2777781.864740] [<ffffffffa03ce622>] ?
> lttng_channel_create+0xeb/0x165 [lttng_tracer] Apr 16 17:29:27 oje
> kernel: [2777781.864740] [<ffffffffa03cf734>] ?
> lttng_abi_create_channel+0x122/0x1c5 [lttng_tracer] Apr 16 17:29:27
> oje kernel: [2777781.864740] [<ffffffff810c7172>] ?
> page_move_anon_rmap+0x1c/0x28 Apr 16 17:29:27 oje kernel:
> [2777781.864740] [<ffffffffa03cfbdd>] ? lttng_session_ioctl+0x64/0x71
> [lttng_tracer] Apr 16 17:29:27 oje kernel: [2777781.864740]
> [<ffffffff810f4455>] ? do_vfs_ioctl+0x45c/0x49d Apr 16 17:29:27 oje
> kernel: [2777781.864740] [<ffffffff810f71d8>] ? dput+0xd8/0xe4 Apr 16
> 17:29:27 oje kernel: [2777781.864740] [<ffffffff810fbd02>] ?
> cpumask_next+0x16/0x18 Apr 16 17:29:27 oje kernel: [2777781.864740]
> [<ffffffff810fcc74>] ? mnt_get_count+0x32/0x41 Apr 16 17:29:27 oje
> kernel: [2777781.864740] [<ffffffff810f44e1>] ? sys_ioctl+0x4b/0x72
> Apr 16 17:29:27 oje kernel: [2777781.864740] [<ffffffff812de292>] ?
> system_call_fastpath+0x16/0x1b Apr 16 17:29:27 oje kernel:
> [2777781.864740] RSP <ffff880077083bc0> Apr 16 17:29:27 oje kernel:
> [2777782.091016] ---[ end trace fcc54233a1dc458e ]---
--
Mathieu Desnoyers
Operating System Efficiency R&D Consultant EfficiOS Inc.
http://www.efficios.com
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: dmessage.txt
URL: <http://lists.lttng.org/pipermail/lttng-dev/attachments/20120418/2d692563/attachment-0002.txt>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: config_output.txt
URL: <http://lists.lttng.org/pipermail/lttng-dev/attachments/20120418/2d692563/attachment-0003.txt>
More information about the lttng-dev
mailing list