[lttng-dev] lttng-consumerd crash on aarch64 due to x86 arch specific optimization

Anders Wallin wallinux at gmail.com
Thu Dec 28 14:13:22 UTC 2017


Hi Mathieu,

I finally got some time to dig into this issue. The crash only happens when
metadata is written AND the size of the metadata will end up in a write
that is 8,4,2 or 1 bytes long AND
that the source or destination is not aligned correctly according to HW
limitation. I have not found any simple way to keep the performance
enhancement code that is run most of the time.
Maybe the metadata writes should have it's own write function instead.

Here is an example of a crash (code is from lttng-ust 2.9.1 and lttng-tools
2.9.6) where the size is 8 bytes and the src address is unaligned at
0xf3b7eeb2;

#0  lttng_inline_memcpy (len=8, src=0xf3b7eeb2, dest=<optimized out>) at
/usr/src/debug/lttng-ust/2.9.1/git/libringbuffer/backend_internal.h:610
No locals.
#1  lib_ring_buffer_write (len=8, src=0xf3b7eeb2, ctx=0xf57c47d0,
config=0xf737c560 <client_config>) at
/usr/src/debug/lttng-ust/2.9.1/git/libringbuffer/backend.h:100
        __len = 8
        handle = 0xf3b2e0c0
        backend_pages = <optimized out>
        chanb = 0xf3b2e2e0
        offset = <optimized out>

#2  lttng_event_write (ctx=0xf57c47d0, src=0xf3b7eeb2, len=8) at
/usr/src/debug/lttng-ust/2.9.1/git/liblttng-ust/lttng-ring-buffer-metadata-client.h:267
No locals.

#3  0xf7337ef8 in ustctl_write_one_packet_to_channel (channel=<optimized
out>, metadata_str=0xf3b7eeb2 "", len=<optimized out>) at
/usr/src/debug/lttng-ust/2.9.1/git/liblttng-ust-ctl/ustctl.c:1183
        ctx = {chan = 0xf3b2e290, priv = 0x0, handle = 0xf3b2e0c0,
data_size = 8, largest_align = 1, cpu = -1, buf = 0xf6909000, slot_size =
8, buf_offset = 163877, pre_offset = 163877, tsc = 0, rflags = 0, ctx_len =
80, ip = 0x0, priv2 = 0x0, padding2 = '\000' <repeats 11 times>,
backend_pages = 0xf690c000}
        chan = 0xf3b2e4d8
        str = 0xf3b7eeb2 ""
        reserve_len = 8
        ret = <optimized out>
        __func__ = '\000' <repeats 34 times>
        __PRETTY_FUNCTION__ = '\000' <repeats 34 times>
---Type <return> to continue, or q <return> to quit---

#4  0x000344cc in commit_one_metadata_packet (stream=stream at entry=0xf3b2e560)
at ust-consumer.c:2206
        write_len = <optimized out>
        ret = <optimized out>
        __PRETTY_FUNCTION__ = "commit_one_metadata_packet"

#5  0x00036538 in lttng_ustconsumer_read_subbuffer
(stream=stream at entry=0xf3b2e560,
ctx=ctx at entry=0x25e6e8) at ust-consumer.c:2452
        len = 4096
        subbuf_size = 4093
        padding = <optimized out>
        err = -11
        write_index = 1
        ret = <optimized out>
        ustream = <optimized out>
        index = {offset = 0, packet_size = 575697416355872, content_size =
17564043391468256584, timestamp_begin = 17564043425827782792, timestamp_end
= 34359738496,

Regards
Anders

fre 24 nov. 2017 kl 20:18 skrev Mathieu Desnoyers <
mathieu.desnoyers at efficios.com>:

> ----- On Nov 24, 2017, at 3:23 AM, Anders Wallin <wallinux at gmail.com>
> wrote:
>
> Hi,
> architectures that has memory alignment restrictions may/will fail with
> the
> optimization done in 51b8f2fa2b972e62117caa946dd3e3565b6ca4a3.
> Please revert the patch or make it X86 specific.
>
>
> Hi Anders,
>
> This was added in the development cycle of lttng-ust 2.9. We could perhaps
> add a test on the pointer alignment for architectures that care about it,
> and
> fallback to memcpy in those cases.
>
> The revert approach would have been justified if this commit had been
> backported
> as a "fix" to a stable branch, which is not the case here. We should work
> on
> finding an acceptable solution that takes care of dealing with unaligned
> pointers
> on architectures that care about the difference.
>
> Thanks,
>
> Mathieu
>
>
>
> Regards
>
> Anders Wallin
>
> ------------------------------------------------------------------------------------------------------------
> commit 51b8f2fa2b972e62117caa946dd3e3565b6ca4a3
> Author: Mathieu Desnoyers <mathieu.desnoyers at efficios.com>
> Date:   Sun Sep 25 12:31:11 2016 -0400
>
>     Performance: implement lttng_inline_memcpy
>     Because all length parameters received for serializing data coming from
>     applications go through a callback, they are never constant, and it
>     hurts performance to perform a call to memcpy each time.
>     Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers at efficios.com>
>
> diff --git a/libringbuffer/backend_internal.h
> b/libringbuffer/backend_internal.h
> index 90088b89..e597cf4d 100644
> --- a/libringbuffer/backend_internal.h
> +++ b/libringbuffer/backend_internal.h
> @@ -592,6 +592,28 @@ int update_read_sb_index(const struct
> lttng_ust_lib_ring_buffer_config *config,
>  #define inline_memcpy(dest, src, n)    memcpy(dest, src, n)
>  #endif
> +static inline __attribute__((always_inline))
> +void lttng_inline_memcpy(void *dest, const void *src,
> +               unsigned long len)
> +{
> +       switch (len) {
> +       case 1:
> +               *(uint8_t *) dest = *(const uint8_t *) src;
> +               break;
> +       case 2:
> +               *(uint16_t *) dest = *(const uint16_t *) src;
> +               break;
> +       case 4:
> +               *(uint32_t *) dest = *(const uint32_t *) src;
> +               break;
> +       case 8:
> +               *(uint64_t *) dest = *(const uint64_t *) src;
> +               break;
> +       default:
> +               inline_memcpy(dest, src, len);
> +       }
> +}
> +
>  /*
>   * Use the architecture-specific memcpy implementation for constant-sized
>   * inputs, but rely on an inline memcpy for length statically unknown.
> @@ -603,7 +625,7 @@ do {
>               \
>         if (__builtin_constant_p(len))                          \
>                 memcpy(dest, src, __len);                       \
>         else                                                    \
> -               inline_memcpy(dest, src, __len);                \
> +               lttng_inline_memcpy(dest, src, __len);          \
>  } while (0)
>  /*
>
>
> ----------------------------------------------------------------------------------------------------
> Here is one example of a crash, where 0xf3b7eeb2 is not 8 byte aligned;
>
>   (gdb) bt full
> #0  lttng_inline_memcpy (len=8, src=0xf3b7eeb2, dest=<optimized out>) at
> /usr/src/debug/lttng-ust/2.9.1/git/libringbuffer/backend_internal.h:610
> No locals.
> #1  lib_ring_buffer_write (len=8, src=0xf3b7eeb2, ctx=0xf57c47d0,
> config=0xf737c560 <client_config>) at
> /usr/src/debug/lttng-ust/2.9.1/git/libringbuffer/backend.h:100
>         __len = 8
>         handle = 0xf3b2e0c0
>         backend_pages = <optimized out>
>         chanb = 0xf3b2e2e0
>         offset = <optimized out>
> #2  lttng_event_write (ctx=0xf57c47d0, src=0xf3b7eeb2, len=8) at
> /usr/src/debug/lttng-ust/2.9.1/git/liblttng-ust/lttng-ring-buffer-metadata-client.h:267
> No locals.
> #3  0xf7337ef8 in ustctl_write_one_packet_to_channel (channel=<optimized
> out>, metadata_str=0xf3b7eeb2 "", len=<optimized out>) at
> /usr/src/debug/lttng-ust/2.9.1/git/liblttng-ust-ctl/ustctl.c:1183
>         ctx = {chan = 0xf3b2e290, priv = 0x0, handle = 0xf3b2e0c0,
> data_size = 8, largest_align = 1, cpu = -1, buf = 0xf6909000, slot_size =
> 8, buf_offset = 163877, pre_offset = 163877, tsc = 0, rflags = 0, ctx_len =
> 80, ip = 0x0, priv2 = 0x0, padding2 = '\000' <repeats 11 times>,
> backend_pages = 0xf690c000}
>         chan = 0xf3b2e4d8
>         str = 0xf3b7eeb2 ""
>         reserve_len = 8
>         ret = <optimized out>
>         __func__ = '\000' <repeats 34 times>
>         __PRETTY_FUNCTION__ = '\000' <repeats 34 times>
> ---Type <return> to continue, or q <return> to quit---
> #4  0x000344cc in commit_one_metadata_packet (stream=stream at entry=0xf3b2e560)
> at ust-consumer.c:2206
>         write_len = <optimized out>
>         ret = <optimized out>
>         __PRETTY_FUNCTION__ = "commit_one_metadata_packet"
> #5  0x00036538 in lttng_ustconsumer_read_subbuffer (stream=stream at entry=0xf3b2e560,
> ctx=ctx at entry=0x25e6e8) at ust-consumer.c:2452
>         len = 4096
>         subbuf_size = 4093
>         padding = <optimized out>
>         err = -11
>         write_index = 1
>         ret = <optimized out>
>         ustream = <optimized out>
>         index = {offset = 0, packet_size = 575697416355872, content_size =
> 17564043391468256584, timestamp_begin = 17564043425827782792, timestamp_end
> = 34359738496, events_discarded = 313327092840, stream_id = 4089446416,
> stream_instance_id = 17689097291346376608, packet_seq_num = 8589934593}
>         __PRETTY_FUNCTION__ = "lttng_ustconsumer_read_subbuffer"
>         __func__ = "lttng_ustconsumer_read_subbuffer"
> .....
>
> _______________________________________________
> lttng-dev mailing list
> lttng-dev at lists.lttng.org
> https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev
>
>
> --
> Mathieu Desnoyers
> EfficiOS Inc.
> http://www.efficios.com
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.lttng.org/pipermail/lttng-dev/attachments/20171228/fa5e6655/attachment.html>


More information about the lttng-dev mailing list