[lttng-dev] snapshot issue with 2.5

Anders Wallin wallinux at gmail.com
Thu Nov 13 02:33:07 EST 2014


On Wed, Nov 12, 2014 at 11:30 PM, Jérémie Galarneau <
jeremie.galarneau at efficios.com> wrote:

> On Wed, Nov 12, 2014 at 7:39 AM, Anders Wallin <wallinux at gmail.com> wrote:
> > Hi,
> >
> > Using userspace snapshot mode in 2.5 seems to be broken.
> >
> > I have a small, trivial application (tracetest) generating 1000 events.
> >
> > Running the script (a simplified version is shown below):
> >
> ----------------------------------------------------------------------------------
> > #!/bin/bash
> > LTTNG='lttng -n'
> > lttng-sessiond --no-kernel &
>
> You should use the "-d" flag to launch the sessiond as a daemon. Using
> "&" might cause the "create" command to launch a second session daemon,
> since the first one would not have had time to create its command
> sockets.
>
No particular reason, but I have changed it to "-d" now. On my laptop I
already have a sessiond running for kernel-space tracing, and I wanted
my test case to be as simple as possible to demonstrate the faulty
behavior. I also have ARM and PowerPC targets on which I run the same
test without a kernel sessiond and without tracing the kernel at all.
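For reference, the race described above can be sketched with a stand-in
daemon (the names wait_ready and the temp-file "socket" are hypothetical
illustrations, not LTTng APIs): backgrounding with "&" returns before the
daemon has created its sockets, whereas lttng-sessiond's "-d" flag only
returns once the daemon is ready.

```shell
# Poll until the given path exists (up to ~2 s); a client would need
# something like this before talking to a daemon started with "&".
wait_ready() {
    for _ in $(seq 1 20); do
        [ -e "$1" ] && return 0
        sleep 0.1
    done
    return 1
}

# Stand-in "daemon" that needs a moment to create its command socket.
sock=$(mktemp -u)
( sleep 0.3; touch "$sock" ) &

wait_ready "$sock" && echo "daemon ready"
# Only at this point is it safe to issue client commands.
```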


>
> >
> > $LTTNG create test --snapshot
> > $LTTNG enable-channel -u --subbuf-size 64k --num-subbuf 16 --overwrite ch1
> > $LTTNG enable-event -u -c ch1 "*"
> > $LTTNG start
> > $LTTNG list ch1
> >
> > mkdir -p snapshot/
> >
> > for i in $(seq 1 $LOOPS); do
> >     ./tracetest
> >     $LTTNG snapshot record -m 1G
> >
> >     # print no of events and first and last event
> >     last=$(ls -1drt $HOME/lttng-traces/$SESSION*/* | tail -1)
> >     babeltrace $last > snapshot/$i.bt
> >     cat snapshot/$i.bt | wc -l
> >     cat snapshot/$i.bt | head -1
> >     cat snapshot/$i.bt | tail -1
> > done
> >
> > $LTTNG stop
> > $LTTNG destroy $SESSION
> >
> ----------------------------------------------------------------------------------
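(As an aside, the per-iteration reporting in the loop above -- event
count plus first and last event -- can be factored into a small helper.
This is a sketch assuming babeltrace's default text output of one event
per line; "summarize" is a hypothetical name, not an LTTng command.)

```shell
# Print the number of events in a babeltrace text dump, then the
# first and the last event line.
summarize() {
    wc -l < "$1"
    head -n 1 "$1"
    tail -n 1 "$1"
}
```

Usage in the loop would then be e.g. `summarize snapshot/$i.bt`.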
> >
> > The result of the run looks like this;
> > ./test.run
> > Error: consumer err socket second poll error
> > Error: Health error occurred in thread_manage_consumer
> > PERROR - 13:36:59.685167 [23772/23781]: bind inet: Address already in use
> > (in lttcomm_bind_inet_sock() at inet.c:109)
> > Warning: Another session daemon is using this JUL port. JUL support will
> > be deactivated to prevent interfering with the tracing.
> >
>
> Is there a reason why you run two session daemons? I'm guessing this
> is related to my "& vs -d" comment above.


> > lttng (LTTng Trace Control) 2.5.2 - Fumisterie
> > BabelTrace Trace Viewer and Converter 1.2.4
> >         linux-vdso.so.1 =>  (0x00007fffba966000)
> >         liblttng-ctl.so.0 => /usr/lib/x86_64-linux-gnu/liblttng-ctl.so.0 (0x00007fbfe4b1d000)
> >         librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007fbfe4915000)
> >         libxml2.so.2 => /usr/lib/x86_64-linux-gnu/libxml2.so.2 (0x00007fbfe45ae000)
> >         libpopt.so.0 => /lib/x86_64-linux-gnu/libpopt.so.0 (0x00007fbfe43a2000)
> >         libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fbfe4184000)
> >         libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fbfe3dbd000)
> >         /lib64/ld-linux-x86-64.so.2 (0x00007fbfe4fac000)
> >         liburcu.so.2 => /usr/lib/x86_64-linux-gnu/liburcu.so.2 (0x00007fbfe3bb6000)
> >         libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007fbfe39b2000)
> >         libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007fbfe3798000)
> >         liblzma.so.5 => /lib/x86_64-linux-gnu/liblzma.so.5 (0x00007fbfe3576000)
> >         libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fbfe3270000)
> >
> > Session test created.
> > Default snapshot output set to:
> > /home/awallin/lttng-traces/test-20141112-133700
> > Snapshot mode set. Every channel enabled for that session will be set in
> > overwrite mode and mmap output.
> > UST channel ch1 enabled for session test
> > UST event * created in channel ch1
> > Tracing started for session test
> > Tracing session test: [active snapshot]
> >     Trace path:
> >     Live timer interval (usec): 4294967295
> >
> > === Domain: UST global ===
> >
> > Channels:
> > -------------
> > - ch1: [enabled]
> >
> >     Attributes:
> >       overwrite mode: 1
> >       subbufers size: 4096
> >       number of subbufers: 16
> >       switch timer interval: 0
> >       read timer interval: 0
> >       trace file count: 0
> >       trace file size (bytes): 0
> >       output: mmap()
> >
> >     Events:
> >       * (type: tracepoint) [enabled]
> >
> > Snapshot recorded successfully for session test
> > 84
> > [13:37:00.852615950] (+?.?????????) arn-awallin-mint-l3 ust_baddr_statedump:soinfo: { cpu_id = 4 }, { baddr = 0x7FFFC6BC7000, sopath = "[vdso]", size = 0, mtime = -1 }
> > [13:37:00.853798949] (+0.000000805) arn-awallin-mint-l3 tracetest:first_tp: { cpu_id = 6 }, { my_string_field = "test", my_integer_field = 999 }
> > Snapshot recorded successfully for session test
> > 157
> > [13:37:00.853735195] (+?.?????????) arn-awallin-mint-l3 tracetest:first_tp: { cpu_id = 6 }, { my_string_field = "test", my_integer_field = 927 }
> > [13:37:00.894839116] (+0.000000299) arn-awallin-mint-l3 tracetest:first_tp: { cpu_id = 5 }, { my_string_field = "test", my_integer_field = 999 }
> > Snapshot recorded successfully for session test
> > 168
> > [13:37:00.853735195] (+?.?????????) arn-awallin-mint-l3 tracetest:first_tp: { cpu_id = 6 }, { my_string_field = "test", my_integer_field = 927 }
> > [13:37:00.929447330] (+0.000000305) arn-awallin-mint-l3 tracetest:first_tp: { cpu_id = 5 }, { my_string_field = "test", my_integer_field = 999 }
> > Snapshot recorded successfully for session test
> > 168
> > [13:37:00.894411117] (+?.?????????) arn-awallin-mint-l3 ust_baddr_statedump:soinfo: { cpu_id = 4 }, { baddr = 0x7FFF6BF81000, sopath = "[vdso]", size = 0, mtime = -1 }
> > [13:37:00.972035546] (+0.000000597) arn-awallin-mint-l3 tracetest:first_tp: { cpu_id = 6 }, { my_string_field = "test", my_integer_field = 999 }
> > Snapshot recorded successfully for session test
> > 168
> > [13:37:00.894411117] (+?.?????????) arn-awallin-mint-l3 ust_baddr_statedump:soinfo: { cpu_id = 4 }, { baddr = 0x7FFF6BF81000, sopath = "[vdso]", size = 0, mtime = -1 }
> > [13:37:01.014060940] (+0.000000341) arn-awallin-mint-l3 tracetest:first_tp: { cpu_id = 7 }, { my_string_field = "test", my_integer_field = 999 }
> > Snapshot recorded successfully for session test
> > 168
> > [13:37:00.894411117] (+?.?????????) arn-awallin-mint-l3 ust_baddr_statedump:soinfo: { cpu_id = 4 }, { baddr = 0x7FFF6BF81000, sopath = "[vdso]", size = 0, mtime = -1 }
> > [13:37:01.052555877] (+0.000000872) arn-awallin-mint-l3 tracetest:first_tp: { cpu_id = 6 }, { my_string_field = "test", my_integer_field = 999 }
> > Snapshot recorded successfully for session test
> > 168
> > [13:37:01.014035993] (+?.?????????) arn-awallin-mint-l3 tracetest:first_tp: { cpu_id = 7 }, { my_string_field = "test", my_integer_field = 927 }
> > [13:37:01.098620808] (+0.000001404) arn-awallin-mint-l3 tracetest:first_tp: { cpu_id = 4 }, { my_string_field = "test", my_integer_field = 999 }
> > Snapshot recorded successfully for session test
> > 168
> > [13:37:01.097087388] (+?.?????????) arn-awallin-mint-l3 ust_baddr_statedump:soinfo: { cpu_id = 6 }, { baddr = 0x7FFF4BFFE000, sopath = "[vdso]", size = 0, mtime = -1 }
> > [13:37:01.129839830] (+0.000000533) arn-awallin-mint-l3 tracetest:first_tp: { cpu_id = 7 }, { my_string_field = "test", my_integer_field = 999 }
> > Snapshot recorded successfully for session test
> > 168
> > [13:37:01.097087388] (+?.?????????) arn-awallin-mint-l3 ust_baddr_statedump:soinfo: { cpu_id = 6 }, { baddr = 0x7FFF4BFFE000, sopath = "[vdso]", size = 0, mtime = -1 }
> > [13:37:01.129839830] (+0.000000533) arn-awallin-mint-l3 tracetest:first_tp: { cpu_id = 7 }, { my_string_field = "test", my_integer_field = 999 }
> > Snapshot recorded successfully for session test
> > 168
> > [13:37:01.097087388] (+?.?????????) arn-awallin-mint-l3 ust_baddr_statedump:soinfo: { cpu_id = 6 }, { baddr = 0x7FFF4BFFE000, sopath = "[vdso]", size = 0, mtime = -1 }
> > [13:37:01.172829105] (+0.000000575) arn-awallin-mint-l3 tracetest:first_tp: { cpu_id = 5 }, { my_string_field = "test", my_integer_field = 999 }
> > Snapshot recorded successfully for session test
> > 168
> > [13:37:01.172021557] (+?.?????????) arn-awallin-mint-l3 ust_baddr_statedump:soinfo: { cpu_id = 7 }, { baddr = 0x7FFF653FE000, sopath = "[vdso]", size = 0, mtime = -1 }
> > [13:37:01.220074165] (+0.000000553) arn-awallin-mint-l3 tracetest:first_tp: { cpu_id = 6 }, { my_string_field = "test", my_integer_field = 999 }
> > Waiting for data availability
> > Tracing stopped for session test
> > Session test destroyed
> >
> > The same run on older versions of lttng produces bigger and bigger
> > event traces until the maximum size is reached.
> >
>
> Unless I'm mistaken, the behavior of the snapshot mode has changed
> between 2.4 and 2.5.
>
> In the 2.4 series, consecutive snapshots could overlap; that is, an
> event could appear in multiple snapshots. As of 2.5, an event is
> guaranteed to appear in at most one snapshot, as events are considered
> "consumed" once they have been captured.
>
> Julien (cc'ed) might be able to confirm this.
>
> Regards,
> Jérémie
>
> > I have tested this on different architectures, with different buffer
> > sizes and numbers of sub-buffers.
> >
> > Regards
> > Anders Wallin
> >
> > _______________________________________________
> > lttng-dev mailing list
> > lttng-dev at lists.lttng.org
> > http://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev
> >
>
>
>
> --
> Jérémie Galarneau
> EfficiOS Inc.
> http://www.efficios.com
>
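The 2.5 behaviour described above (each event appears in at most one
snapshot because recording a snapshot "consumes" the buffered events)
can be illustrated with a toy shell model. This sketches the semantics
only, not LTTng's actual ring-buffer implementation; all names here are
made up for illustration.

```shell
# Toy event buffer with consume-on-snapshot semantics: each snapshot
# prints and then empties the buffer, so consecutive snapshots never
# share an event.
events=""
emit()     { events="$events $1"; }
snapshot() { echo "snapshot:$events"; events=""; }

emit e1; emit e2
snapshot        # -> "snapshot: e1 e2"
emit e3
snapshot        # -> "snapshot: e3" (e1 and e2 are not repeated)
```

Under the 2.4 semantics, by contrast, snapshot would print the buffer
without clearing it, so later snapshots could repeat earlier events.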