[lttng-dev] Babeltrace Buffer Warning

Mathieu Desnoyers mathieu.desnoyers at efficios.com
Wed Apr 18 09:55:00 EDT 2012


* Salman Rafiq (salman.rafiq at esk.fraunhofer.de) wrote:
> Hello All,
> 
> Increasing the number of buffers and buffer size did helped indeed. Thanks to both for the advice.
> 
> Another question is, what is the maximum --subbuf-size & --num-subbuf one can go? I guess it should depend upon your memory size?
> 
> I am asking this because I had trouble last time, where my system crashed with some "general protection fault" while I had --subbuf-size of 2^29 . Attached are the syslog messages.

Trying it on my x86_64 machine:

lttng create
lttng enable-channel test --subbuf-size 536870912 -k
PERROR: ioctl kernel create channel: Invalid argument [in kernel_create_channel() at kernel.c:156]
Error: Channel test: Kernel create channel failed (session auto-20120418-095035)
Warning: Some command(s) went wrong

dmesg tells me:

[5408863.022967] lttng-sessiond: page allocation failure: order:10, mode:0xc0d0
[5408863.044885] Pid: 24301, comm: lttng-sessiond Tainted: G           O 3.2.6-trace-test #129
[5408863.070423] Call Trace:
[5408863.078337]  [<ffffffff810d1be6>] warn_alloc_failed+0xf6/0x150
[5408863.096397]  [<ffffffff810d48a0>] ? page_alloc_cpu_notify+0x60/0x60
[5408863.115795]  [<ffffffff810d4061>] __alloc_pages_nodemask+0x4d1/0x6d0
[5408863.135420]  [<ffffffff810d4277>] __get_free_pages+0x17/0x50
[5408863.152971]  [<ffffffff8110919f>] kmalloc_order_trace+0x3f/0x100
[5408863.171564]  [<ffffffff81109d8e>] ? __kmalloc+0x1de/0x210
[5408863.188322]  [<ffffffff81109d8e>] __kmalloc+0x1de/0x210
[5408863.204577]  [<ffffffffa3991cc6>] lib_ring_buffer_backend_create+0x1b6/0x500 [lttng_lib_ring_buffer]
[5408863.232493]  [<ffffffff810e9dbb>] ? pcpu_next_pop+0x4b/0x70
[5408863.249776]  [<ffffffffa3994f6d>] lib_ring_buffer_create+0xad/0x2e0 [lttng_lib_ring_buffer]
[5408863.275386]  [<ffffffffa3992442>] channel_backend_init+0x232/0x430 [lttng_lib_ring_buffer]
[5408863.300738]  [<ffffffffa3993184>] channel_create+0x94/0x220 [lttng_lib_ring_buffer]
[5408863.324265]  [<ffffffff81109693>] ? kmem_cache_alloc_trace+0x133/0x150
[5408863.344374]  [<ffffffffa002d723>] _channel_create+0x33/0x40 [lttng_ring_buffer_client_discard]
[5408863.370723]  [<ffffffffa3a077aa>] lttng_channel_create+0x14a/0x2c0 [lttng_tracer]
[5408863.393678]  [<ffffffffa3a0912f>] lttng_abi_create_channel+0xcf/0x220 [lttng_tracer]
[5408863.417413]  [<ffffffffa3a0985a>] lttng_session_ioctl+0x6a/0xa0 [lttng_tracer]
[5408863.439589]  [<ffffffff8111eddf>] do_vfs_ioctl+0x8f/0x500
[5408863.456299]  [<ffffffff8110e4b2>] ? fget_light+0x92/0x100
[5408863.473019]  [<ffffffff8111f2e1>] sys_ioctl+0x91/0xa0
[5408863.488699]  [<ffffffff815890eb>] system_call_fastpath+0x16/0x1b

This is on a 3.2.6 kernel.

Which architecture and kernel version are you using ? You might be
hitting a kernel page allocator bug. Also, what is the content of your
kernel .config ?

Thanks,

Mathieu

> 
> 
> Regards,
> S.
> 
> --
> Salman Rafiq
> Industrial Communication
> Fraunhofer-Einrichtung für Systeme der Kommunikationstechnik ESK
> 
> Hansastraße 32 | 80686 München
> Telefon, Fax:  +49 89 547088-356 | +49 89 547088-66-356
> E-Mail:   salman.rafiq at esk.fraunhofer.de
> 
> Internet:
> http://www.esk.fraunhofer.de
> http://www.facebook.com/FraunhoferESK
> http://www.twitter.com/FraunhoferESK
> 
> 
> -----Ursprüngliche Nachricht-----
> Von: Mathieu Desnoyers [mailto:mathieu.desnoyers at efficios.com]
> Gesendet: Monday, April 16, 2012 4:33 PM
> An: Oestman, Fredrik
> Cc: lttng-dev at lists.lttng.org; Salman Rafiq
> Betreff: Re: [lttng-dev] Babeltrace Buffer Warning
> 
> * Oestman, Fredrik (Fredrik_Oestman at mentor.com) wrote:
> > Mathieu Desnoyers wrote:
> > > The warning message is:
> > >
> > > "[warning] Tracer discarded 280625 events between
> > > [1334315833.578351009] and [1334315833.578351380]. You should
> > > consider increasing the buffer size."
> > >
> > > which states that the tracer did discard the events (not the
> > > converter).
> > > But maybe I'm missing your point. What would you recommend to make
> > > this message clearer ?
> >
> > I actually had no recommendation or suggestion in mind. To me the message is clear. I just got the impression that the OP may have been confused in this respect for some reason.
> >
> > But since you ask, what about something along the lines of:
> >
> > "[warning] The tracer discarded 280625 events between
> > [1334315833.578351009] and [1334315833.578351380]. You should consider
> > making a new trace with larger buffers or with fewer event types enabled."
> 
> Good idea. Fixed in commit:
> 
> commit a0b34fbba498a8a447357adf5920ac2c8a22f026
> Author: Mathieu Desnoyers <mathieu.desnoyers at efficios.com>
> Date:   Mon Apr 16 10:33:55 2012 -0400
> 
>     Clarify warning message for events discarded
> 
>     Reported-by: Fredrik Oestman <Fredrik_Oestman at mentor.com>
>     Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers at efficios.com>
> 
> Thanks!
> 
> Mathieu
> 
> 
> >
> >
> > Cheers,
> >
> > Fredrik Östman
> >
> 
> --
> Mathieu Desnoyers
> Operating System Efficiency R&D Consultant EfficiOS Inc.
> http://www.efficios.com

Content-Description: command_line.txt
> Message from syslogd at oje at Apr 16 17:29:27 ...
>  kernel:[2777781.860828] general protection fault: 0000 [#1] SMP
> 
> Message from syslogd at oje at Apr 16 17:29:27 ...
>  kernel:[2777781.864740] Stack:
> 
> Message from syslogd at oje at Apr 16 17:29:27 ...
>  kernel:[2777781.864740] Call Trace:
> 
> Message from syslogd at oje at Apr 16 17:29:27 ...
>  kernel:[2777781.864740] Code: 48 8b 7b 08 48 89 c5 48 8d 43 08 48 39 c7 75 05 ff 43 04 eb 05 e8 10 e1 27 00 48 89 df 48 89 ee 5b 5b 5d e9 b5 e6 27 00 90 eb 0c <8b> 50 10 39 56 10 7f 0c 48 8d 78 08 48 8b 07 48 85 c0 75 ec 48
Content-Description: syslogd.txt
> Apr 16 17:29:27 oje kernel: [2777781.864435] CPU 4
> Apr 16 17:29:27 oje kernel: [2777781.864466] Modules linked in: lttng_probe_timer(O) lttng_probe_statedump(O) lttng_probe_signal(O) lttng_probe_sched(O) lttng_probe_kvm(O) lttng_probe_irq(O) lttng_probe_block(O) lttng_types(O) lttng_probe_lttng(O) lttng_ring_buffer_metadata_mmap_client(O) lttng_ring_buffer_client_mmap_overwrite(O) lttng_ring_buffer_client_mmap_discard(O) lttng_ring_buffer_metadata_client(O) lttng_ring_buffer_client_overwrite(O) lttng_ring_buffer_client_discard(O) lttng_tracer(O) lttng_kretprobes(O) lttng_kprobes(O) lttng_statedump(O) lttng_lib_ring_buffer(O) bnep rfcomm bluetooth rfkill crc16 parport_pc ppdev lp parport binfmt_misc fuse nfsd nfs lockd fscache auth_rpcgss nfs_acl sunrpc loop radeon joydev ttm drm_kms_helper drm edac_core shpchp nv_tco psmouse i2c_nforce2 i2c_algo_bit power_supply pci_hotplug pcspkr edac_mce_amd i2c_core evdev k10temp serio_raw ext3 jbd mbcache sg sr_mod sd_mod cdrom crc_t10dif usbhid hid ata_generic pata_amd ohci_hcd sata_nv libata ehci_hcd igb e1000 scsi_mod
> Apr 16 17:29:27 oje kernel: usbcore dca [last unloaded: lttng_kretprobes]
> Apr 16 17:29:27 oje kernel: [2777781.864740]
> Apr 16 17:29:27 oje kernel: [2777781.864740] Pid: 5746, comm: lttng-sessiond Tainted: G        W  O 3.0.0opteron #4 Supermicro H8QM3/H8QM3
> Apr 16 17:29:27 oje kernel: [2777781.864740] RIP: 0010:[<ffffffff8105ac32>]  [<ffffffff8105ac32>] notifier_chain_register+0x2/0x20
> Apr 16 17:29:27 oje kernel: [2777781.864740] RSP: 0018:ffff880077083bc0  EFLAGS: 00010282
> Apr 16 17:29:27 oje kernel: [2777781.864740] RAX: a8c00e1ea8c0f2e6 RBX: ffff8800bb113c50 RCX: 0000000000000010
> Apr 16 17:29:27 oje kernel: [2777781.864740] RDX: 0000000001082e40 RSI: ffff8800bb113c50 RDI: ffff880030ffb858
> Apr 16 17:29:27 oje kernel: [2777781.864740] RBP: ffffffffa03b5270 R08: 0000000000000200 R09: ffffffff8167eff0
> Apr 16 17:29:27 oje kernel: [2777781.864740] R10: 00000000000000c2 R11: 00000000000000c2 R12: 00000000ffffffff
> Apr 16 17:29:27 oje kernel: [2777781.864740] R13: ffff8801273fbc80 R14: 0000000010000000 R15: 0000000000000010
> Apr 16 17:29:27 oje kernel: [2777781.864740] FS:  00007f7bd96a9700(0000) GS:ffff88012fc00000(0000) knlGS:0000000000000000
> Apr 16 17:29:27 oje kernel: [2777781.864740] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> Apr 16 17:29:27 oje kernel: [2777781.864740] CR2: 00000000023332b0 CR3: 000000012d04b000 CR4: 00000000000006e0
> Apr 16 17:29:27 oje kernel: [2777781.864740] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> Apr 16 17:29:27 oje kernel: [2777781.864740] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Apr 16 17:29:27 oje kernel: [2777781.864740] Process lttng-sessiond (pid: 5746, threadinfo ffff880077082000, task ffff8800772dc770)
> Apr 16 17:29:27 oje kernel: [2777781.864740]  ffffffff812c0ee0 ffff8801273fbc80 0000000010000000 ffff8800bb113c10
> Apr 16 17:29:27 oje kernel: [2777781.864740]  ffffffffa03a00e3 0000000000000246 0000001010000000 ffff8801273fbc80
> Apr 16 17:29:27 oje kernel: [2777781.864740]  ffff8800bb113c00 ffffffffa03b5270 ffffffffa0411188 ffff8800bb113c10
> Apr 16 17:29:27 oje kernel: [2777781.864740]  [<ffffffff812c0ee0>] ? register_cpu_notifier+0x1c/0x2f
> Apr 16 17:29:27 oje kernel: [2777781.864740]  [<ffffffffa03a00e3>] ? channel_backend_init+0x189/0x2a0 [lttng_lib_ring_buffer]
> Apr 16 17:29:27 oje kernel: [2777781.864740]  [<ffffffffa03a06ac>] ? channel_create+0x76/0x1ce [lttng_lib_ring_buffer]
> Apr 16 17:29:27 oje kernel: [2777781.864740]  [<ffffffffa03b44ac>] ? _channel_create+0x2b/0x30 [lttng_ring_buffer_client_discard]
> Apr 16 17:29:27 oje kernel: [2777781.864740]  [<ffffffffa03ce622>] ? lttng_channel_create+0xeb/0x165 [lttng_tracer]
> Apr 16 17:29:27 oje kernel: [2777781.864740]  [<ffffffffa03cf734>] ? lttng_abi_create_channel+0x122/0x1c5 [lttng_tracer]
> Apr 16 17:29:27 oje kernel: [2777781.864740]  [<ffffffff810c7172>] ? page_move_anon_rmap+0x1c/0x28
> Apr 16 17:29:27 oje kernel: [2777781.864740]  [<ffffffffa03cfbdd>] ? lttng_session_ioctl+0x64/0x71 [lttng_tracer]
> Apr 16 17:29:27 oje kernel: [2777781.864740]  [<ffffffff810f4455>] ? do_vfs_ioctl+0x45c/0x49d
> Apr 16 17:29:27 oje kernel: [2777781.864740]  [<ffffffff810f71d8>] ? dput+0xd8/0xe4
> Apr 16 17:29:27 oje kernel: [2777781.864740]  [<ffffffff810fbd02>] ? cpumask_next+0x16/0x18
> Apr 16 17:29:27 oje kernel: [2777781.864740]  [<ffffffff810fcc74>] ? mnt_get_count+0x32/0x41
> Apr 16 17:29:27 oje kernel: [2777781.864740]  [<ffffffff810f44e1>] ? sys_ioctl+0x4b/0x72
> Apr 16 17:29:27 oje kernel: [2777781.864740]  [<ffffffff812de292>] ? system_call_fastpath+0x16/0x1b
> Apr 16 17:29:27 oje kernel: [2777781.864740]  RSP <ffff880077083bc0>
> Apr 16 17:29:27 oje kernel: [2777782.091016] ---[ end trace fcc54233a1dc458e ]---

-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com



More information about the lttng-dev mailing list