[lttng-dev] core seen in relayd_cleanup()

Jérémie Galarneau jeremie.galarneau at efficios.com
Tue Nov 29 19:40:40 UTC 2016


On 29 November 2016 at 14:19, Aravind HT <aravind.ht at gmail.com> wrote:
> Hi,
>
>
> I was in the process of upgrading to 2.8.1 and see the relayd core below.
> I am trying to get logs for it, but the scenario is proving hard to
> reproduce with full logs enabled. Since this happens in a complex
> environment, I am not sure why relayd is exiting.
>
> In the meantime, from https://lwn.net/Articles/573432/, I see that
> cds_lfht_destroy() may fail if the ht is not empty. Should the assert()
> be there in relayd_cleanup(), or do you know what can cause or simulate
> this crash?

Hi Aravind,

The assert() is there because, at that point, the streams should have
been "torn down" cleanly before exiting. It is a check to ensure we
don't leak the object and, more importantly, properly close the trace
on disk.
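
To make the contract behind that assert() concrete, here is a minimal,
self-contained sketch. It does not use liburcu or lttng-tools code: the
toy_ht type and the toy_* functions are hypothetical stand-ins that only
count linked entries. The sketch assumes that the real destroy call refuses
a non-empty table and returns a negative errno value (-EPERM is used here as
an assumption; it would be consistent with the ret = -1 seen in frame #4,
since EPERM is 1 on Linux).

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical toy table: it only tracks how many entries are still linked. */
struct toy_ht {
	size_t nr_entries;
	int destroyed;
};

/* Link one entry, e.g. a stream that has just been created. */
static void toy_ht_add(struct toy_ht *ht)
{
	ht->nr_entries++;
}

/* Unlink one entry, e.g. a stream that has been torn down cleanly. */
static int toy_ht_del(struct toy_ht *ht)
{
	if (ht->nr_entries == 0)
		return -ENOENT;
	ht->nr_entries--;
	return 0;
}

/*
 * Destroy the table. Assumption: like the real call, this refuses to
 * destroy a table that still holds entries and reports a negative errno.
 */
static int toy_ht_destroy(struct toy_ht *ht)
{
	if (ht->nr_entries != 0)
		return -EPERM;
	ht->destroyed = 1;
	return 0;
}

/* Cleanup in the order the assert() expects: drain first, then destroy. */
static int toy_cleanup(struct toy_ht *ht)
{
	while (ht->nr_entries != 0)
		(void) toy_ht_del(ht);
	return toy_ht_destroy(ht);
}
```

Calling toy_ht_destroy() while one entry is still linked returns a negative
value, which is the situation an assert(!ret) would abort on. Calling
toy_cleanup() instead drains the table first, so the destroy succeeds.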

I am not sure what can cause the stream to be "dangling" at this
point. Are you killing the relay daemon or is it closing on its own
(presumably after encountering an error)?

Thanks,
Jérémie

> I've tried killing relayd with SIGINT and SIGTERM while sessions are active,
> and that doesn't reproduce this.
>
> 55      ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
> [Current thread is 1 (Thread 0x7fd7e7c0e980 (LWP 29366))]
> (gdb) info threads
>   Id   Target Id         Frame
>   2    Thread 0x7fd7e1b1c700 (LWP 29375) 0x00007fd7e6701996 in _int_free
> (av=<optimized out>, p=<optimized out>, have_lock=0) at malloc.c:4059
> * 1    Thread 0x7fd7e7c0e980 (LWP 29366) 0x00007fd7e66bd367 in __GI_raise
> (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:55
> (gdb) bt full
> #0  0x00007fd7e66bd367 in __GI_raise (sig=sig@entry=6) at
> ../sysdeps/unix/sysv/linux/raise.c:55
>         resultvar = 0
>         pid = 29366
>         selftid = 29366
> #1  0x00007fd7e66c033a in __GI_abort () at abort.c:89
>         save_stage = 2
>         act = {__sigaction_handler = {sa_handler = 0x4, sa_sigaction = 0x4},
> sa_mask = {__val = {5, 48, 140732823353744, 3924933264, 0, 0, 0,
> 21474836480, 140732823353896, 140565261773123,
>               140732823353696, 140565261807824, 140565261789000,
> 140732823354144, 140565282869248, 140565261789000}}, sa_flags = 4462201,
> sa_restorer = 0x4416c0 <__PRETTY_FUNCTION__.5555>}
>         sigs = {__val = {32, 0 <repeats 15 times>}}
> #2  0x00007fd7e66b644d in __assert_fail_base (fmt=0x7fd7e67f2748 "%s%s%s:%u:
> %s%sAssertion `%s' failed.\n%n", assertion=assertion@entry=0x441679 "!ret",
> file=file@entry=0x4415da "hashtable.c",
>     line=line@entry=162, function=function@entry=0x4416c0
> <__PRETTY_FUNCTION__.5555> "lttng_ht_destroy") at assert.c:92
>         str = 0x65b310 ""
>         total = 4096
> #3  0x00007fd7e66b6502 in __GI___assert_fail (assertion=0x441679 "!ret",
> file=0x4415da "hashtable.c", line=162, function=0x4416c0
> <__PRETTY_FUNCTION__.5555> "lttng_ht_destroy") at assert.c:101
> No locals.
> #4  0x0000000000422f17 in lttng_ht_destroy (ht=0x65a0e0) at hashtable.c:162
>         ret = -1
>         __PRETTY_FUNCTION__ = "lttng_ht_destroy"
> #5  0x000000000040803b in relayd_cleanup () at main.c:497
>         __func__ = "relayd_cleanup"
> #6  0x000000000040ec4c in main (argc=5, argv=0x7ffee9f1bd68) at main.c:2939
>         ret = 0
>         retval = 0
>         status = 0x0
>         __func__ = "main"
>
> (gdb) f 4
> #4  0x0000000000422f17 in lttng_ht_destroy (ht=0x65a0e0) at hashtable.c:162
> 162     hashtable.c: No such file or directory.
> (gdb) p *ht->ht
> $1 = {max_nr_buckets = 9223372036854775808, mm = 0x7fd7e6e50920
> <cds_lfht_mm_order>,
> flavor = 0x7fd7e77b6340 <rcu_flavor_memb>, count = -1024, resize_mutex =
> {__data = {
> __lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = 0, __spins = 0,
>       __elision = 0, __list = {__prev = 0x0, __next = 0x0}},
>     __size = '\000' <repeats 39 times>, __align = 0}, resize_attr = 0x0,
>   in_progress_resize = 0, in_progress_destroy = 1, resize_target = 2048,
>   resize_initiated = 0, flags = 3, min_alloc_buckets_order = 0,
> min_nr_alloc_buckets = 1,
>   split_count = 0x65b170, .., bucket_at = 0x7fd7e6c4e080 <bucket_at>, {
>     tbl_order = {0x65b280, 0x65b2a0, 0x65b2c0, 0x7fd7c40008c0,
> 0x7fd7c4000910,
>       0x7fd7c40009a0, 0x7fd7c4000ab0, 0x7fd7c4001070, 0x7fd7c4001480,
> 0x7fd7c4001c90,
>       0x7fd7c4003350, 0x7fd7c4005360, 0x0 <repeats 52 times>}, tbl_chunk =
> 0x65af60,
>     tbl_mmap = 0x65b280}}
>
>
> Regards,
> Aravind.
>



-- 
Jérémie Galarneau
EfficiOS Inc.
http://www.efficios.com

