[lttng-dev] Problem running lttng enable-channel shortly after lttng create

Jérémie Galarneau jeremie.galarneau at efficios.com
Tue Apr 8 15:12:57 EDT 2014


On Wed, Apr 2, 2014 at 6:56 AM, Amit Margalit <AMITM at il.ibm.com> wrote:
> Hello,
>
> I keep getting the following error from lttng enable-channel:
>
> libust[2984/2986]: Error: Timed out waiting for lttng-sessiond (in
> lttng_ust_init() at lttng-ust-comm.c:1444)
>
> The scenario is quite simple - during boot of the system, we create a
> session, enable a channel, etc., like this:
>
> # /usr/bin/lttng-sessiond --no-kernel -b
> Warning: No tracing group detected
> # /usr/bin/lttng create nextra -o /local/traces/lttng/20140402123725
> Session nextra created.
> Traces will be written in /local/traces/lttng/20140402123725
> # /usr/bin/lttng enable-channel channel0 -u --tracefile-size 67108864
> --tracefile-count 40 --buffers-uid --switch-timer 1000000 --num-subbuf 16
> --subbuf-size 262144
> libust[2984/2986]: Error: Timed out waiting for lttng-sessiond (in
> lttng_ust_init() at lttng-ust-comm.c:1444)
> Warning: No tracing group detected
>
> The PID 2984 belongs to the lttng-consumerd. At that point in time I see the
> following in syslog:
>
> 2014/04/02 12:38:03.370576 kernel   error    nextra-6663235-module-1  [
> 56.376155] proc lttng-sessiond with pid 632 goes to delayed retry in
> epoll_wait(): start jiffies 4294904844, stop jiffies=4294906390 (+1546),
> expected final stop jiffies=9223372041149680651 (+9223372036854775807), next
> stop jiffies=9223372041149682197 (+9223372036854775807)
> 2014/04/02 12:38:03.399245 kernel   error    nextra-6663235-module-1  [
> 56.405166] proc lttng-sessiond with pid 632 goes to delayed retry in
> epoll_wait(): start jiffies 4294904844, stop jiffies=4294906397 (+1553),
> expected final stop jiffies=9223372041149680651 (+9223372036854775807), next
> stop jiffies=9223372041149682204 (+9223372036854775807)
> 2014/04/02 12:38:03.405700 kernel   error    nextra-6663235-module-1  [
> 56.411791] proc lttng-sessiond with pid 632 goes to delayed retry in
> epoll_wait(): start jiffies 4294904844, stop jiffies=4294906398 (+1554),
> expected final stop jiffies=9223372041149680651 (+9223372036854775807), next
> stop jiffies=9223372041149682205 (+9223372036854775807)
>
> And finally, the command succeeds:
>
> UST channel channel0 enabled for session nextra
>
> My guess is that the boot sequence is loading some traced application before
> the channel is enabled, and the consumerd is started, which under heavy
> system load (exists during initial boot time) times out waiting for the
> sessiond. If that was the case, I could try to increase the timeout time,
> but that also delays my booting, which I cannot afford to do.
>
> Suggestions?
>

How many apps are trying to register to the session daemon at that
moment? Also, can you give us an idea of the system on which this is
running?

By the way, bumping your own thread after ~24 hours is generally
considered rude and won't get you more answers...

Regards,
Jérémie

>
> Amit Margalit
> IBM XIV - Storage Reinvented
> XIV-NAS Development Team
> Tel. 03-689-7774
> Fax. 03-689-7230
> _______________________________________________
> lttng-dev mailing list
> lttng-dev at lists.lttng.org
> http://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev
>



-- 
Jérémie Galarneau
EfficiOS Inc.
http://www.efficios.com



More information about the lttng-dev mailing list