<div dir="auto"><div>Hi Matthieu,<div dir="auto"><br></div><div dir="auto">I've retired and no longer have access to any arch64  target to test it on. </div><div dir="auto"><br></div><div dir="auto">Regards</div><div dir="auto">Anders</div><br><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">Den ons 25 jan. 2023 13:25Mathieu Desnoyers <<a href="mailto:mathieu.desnoyers@efficios.com">mathieu.desnoyers@efficios.com</a>> skrev:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi Anders,<br>
<br>
Sorry for the long delay on this one, can you have a look at the following fix ?<br>
<br>
<a href="https://review.lttng.org/c/lttng-ust/+/9319" rel="noreferrer noreferrer" target="_blank">https://review.lttng.org/c/lttng-ust/+/9319</a> Fix: aarch64: do not perform unaligned stores<br>
<br>
If it passes your testing, I'll merge this into lttng-ust.<br>
<br>
Thanks,<br>
<br>
Mathieu<br>
<br>
On 2017-12-28 09:13, Anders Wallin wrote:<br>
> Hi Mathieu,<br>
> <br>
> I finally got some time to dig into this issue. The crash only happens <br>
> when metadata is written AND the size of the metadata will end up in a <br>
> write that is 8,4,2 or 1 bytes long AND<br>
> that the source or destination is not aligned correctly according to HW <br>
> limitation. I have not found any simple way to keep the performance <br>
> enhancement code that is run most of the time.<br>
> Maybe the metadata writes should have it's own write function instead.<br>
> <br>
> Here is an example of a crash (code is from lttng-ust 2.9.1 and <br>
> lttng-tools 2.9.6) where the size is 8 bytes and the src address is <br>
> unaligned at 0xf3b7eeb2;<br>
> <br>
> #0  lttng_inline_memcpy (len=8, src=0xf3b7eeb2, dest=<optimized out>) at <br>
> /usr/src/debug/lttng-ust/2.9.1/git/libringbuffer/backend_internal.h:610<br>
> No locals.<br>
> #1  lib_ring_buffer_write (len=8, src=0xf3b7eeb2, ctx=0xf57c47d0, <br>
> config=0xf737c560 <client_config>) at <br>
> /usr/src/debug/lttng-ust/2.9.1/git/libringbuffer/backend.h:100<br>
>          __len = 8<br>
>          handle = 0xf3b2e0c0<br>
>          backend_pages = <optimized out><br>
>          chanb = 0xf3b2e2e0<br>
>          offset = <optimized out><br>
> <br>
> #2  lttng_event_write (ctx=0xf57c47d0, src=0xf3b7eeb2, len=8) at <br>
> /usr/src/debug/lttng-ust/2.9.1/git/liblttng-ust/lttng-ring-buffer-metadata-client.h:267<br>
> No locals.<br>
> <br>
> #3  0xf7337ef8 in ustctl_write_one_packet_to_channel (channel=<optimized <br>
> out>, metadata_str=0xf3b7eeb2 "", len=<optimized out>) at <br>
> /usr/src/debug/lttng-ust/2.9.1/git/liblttng-ust-ctl/ustctl.c:1183<br>
>          ctx = {chan = 0xf3b2e290, priv = 0x0, handle = 0xf3b2e0c0, <br>
> data_size = 8, largest_align = 1, cpu = -1, buf = 0xf6909000, slot_size <br>
> = 8, buf_offset = 163877, pre_offset = 163877, tsc = 0, rflags = 0, <br>
> ctx_len = 80, ip = 0x0, priv2 = 0x0, padding2 = '\000' <repeats 11 <br>
> times>, backend_pages = 0xf690c000}<br>
>          chan = 0xf3b2e4d8<br>
>          str = 0xf3b7eeb2 ""<br>
>          reserve_len = 8<br>
>          ret = <optimized out><br>
>          __func__ = '\000' <repeats 34 times><br>
>          __PRETTY_FUNCTION__ = '\000' <repeats 34 times><br>
> ---Type <return> to continue, or q <return> to quit---<br>
> <br>
> #4  0x000344cc in commit_one_metadata_packet <br>
> (stream=stream@entry=0xf3b2e560) at ust-consumer.c:2206<br>
>          write_len = <optimized out><br>
>          ret = <optimized out><br>
>          __PRETTY_FUNCTION__ = "commit_one_metadata_packet"<br>
> <br>
> #5  0x00036538 in lttng_ustconsumer_read_subbuffer <br>
> (stream=stream@entry=0xf3b2e560, ctx=ctx@entry=0x25e6e8) at <br>
> ust-consumer.c:2452<br>
>          len = 4096<br>
>          subbuf_size = 4093<br>
>          padding = <optimized out><br>
>          err = -11<br>
>          write_index = 1<br>
>          ret = <optimized out><br>
>          ustream = <optimized out><br>
>          index = {offset = 0, packet_size = 575697416355872, <br>
> content_size = 17564043391468256584, timestamp_begin = <br>
> 17564043425827782792, timestamp_end = 34359738496,<br>
> Regards<br>
> Anders<br>
> <br>
> fre 24 nov. 2017 kl 20:18 skrev Mathieu Desnoyers <br>
> <<a href="mailto:mathieu.desnoyers@efficios.com" target="_blank" rel="noreferrer">mathieu.desnoyers@efficios.com</a> <mailto:<a href="mailto:mathieu.desnoyers@efficios.com" target="_blank" rel="noreferrer">mathieu.desnoyers@efficios.com</a>>>:<br>
> <br>
>     ----- On Nov 24, 2017, at 3:23 AM, Anders Wallin <<a href="mailto:wallinux@gmail.com" target="_blank" rel="noreferrer">wallinux@gmail.com</a><br>
>     <mailto:<a href="mailto:wallinux@gmail.com" target="_blank" rel="noreferrer">wallinux@gmail.com</a>>> wrote:<br>
> <br>
>         Hi,<br>
>         architectures that has memory alignment restrictions may/will<br>
>         fail with the<br>
>         optimization done in 51b8f2fa2b972e62117caa946dd3e3565b6ca4a3.<br>
>         Please revert the patch or make it X86 specific.<br>
> <br>
> <br>
>     Hi Anders,<br>
> <br>
>     This was added in the development cycle of lttng-ust 2.9. We could<br>
>     perhaps<br>
>     add a test on the pointer alignment for architectures that care<br>
>     about it, and<br>
>     fallback to memcpy in those cases.<br>
> <br>
>     The revert approach would have been justified if this commit had<br>
>     been backported<br>
>     as a "fix" to a stable branch, which is not the case here. We should<br>
>     work on<br>
>     finding an acceptable solution that takes care of dealing with<br>
>     unaligned pointers<br>
>     on architectures that care about the difference.<br>
> <br>
>     Thanks,<br>
> <br>
>     Mathieu<br>
> <br>
> <br>
> <br>
>         Regards<br>
> <br>
>         Anders Wallin<br>
>         ------------------------------------------------------------------------------------------------------------<br>
>         commit 51b8f2fa2b972e62117caa946dd3e3565b6ca4a3<br>
>         Author: Mathieu Desnoyers <<a href="mailto:mathieu.desnoyers@efficios.com" target="_blank" rel="noreferrer">mathieu.desnoyers@efficios.com</a><br>
>         <mailto:<a href="mailto:mathieu.desnoyers@efficios.com" target="_blank" rel="noreferrer">mathieu.desnoyers@efficios.com</a>>><br>
>         Date:   Sun Sep 25 12:31:11 2016 -0400<br>
> <br>
>              Performance: implement lttng_inline_memcpy<br>
>              Because all length parameters received for serializing data<br>
>         coming from<br>
>              applications go through a callback, they are never<br>
>         constant, and it<br>
>              hurts performance to perform a call to memcpy each time.<br>
>              Signed-off-by: Mathieu Desnoyers<br>
>         <<a href="mailto:mathieu.desnoyers@efficios.com" target="_blank" rel="noreferrer">mathieu.desnoyers@efficios.com</a><br>
>         <mailto:<a href="mailto:mathieu.desnoyers@efficios.com" target="_blank" rel="noreferrer">mathieu.desnoyers@efficios.com</a>>><br>
> <br>
>         diff --git a/libringbuffer/backend_internal.h<br>
>         b/libringbuffer/backend_internal.h<br>
>         index 90088b89..e597cf4d 100644<br>
>         --- a/libringbuffer/backend_internal.h<br>
>         +++ b/libringbuffer/backend_internal.h<br>
>         @@ -592,6 +592,28 @@ int update_read_sb_index(const struct<br>
>         lttng_ust_lib_ring_buffer_config *config,<br>
>           #define inline_memcpy(dest, src, n)    memcpy(dest, src, n)<br>
>           #endif<br>
>         +static inline __attribute__((always_inline))<br>
>         +void lttng_inline_memcpy(void *dest, const void *src,<br>
>         +               unsigned long len)<br>
>         +{<br>
>         +       switch (len) {<br>
>         +       case 1:<br>
>         +               *(uint8_t *) dest = *(const uint8_t *) src;<br>
>         +               break;<br>
>         +       case 2:<br>
>         +               *(uint16_t *) dest = *(const uint16_t *) src;<br>
>         +               break;<br>
>         +       case 4:<br>
>         +               *(uint32_t *) dest = *(const uint32_t *) src;<br>
>         +               break;<br>
>         +       case 8:<br>
>         +               *(uint64_t *) dest = *(const uint64_t *) src;<br>
>         +               break;<br>
>         +       default:<br>
>         +               inline_memcpy(dest, src, len);<br>
>         +       }<br>
>         +}<br>
>         +<br>
>           /*<br>
>            * Use the architecture-specific memcpy implementation for<br>
>         constant-sized<br>
>            * inputs, but rely on an inline memcpy for length statically<br>
>         unknown.<br>
>         @@ -603,7 +625,7 @@ do {                                       <br>
>                                  \<br>
>                  if (__builtin_constant_p(len))                          \<br>
>                          memcpy(dest, src, __len);                       \<br>
>                  else                                                    \<br>
>         -               inline_memcpy(dest, src, __len);                \<br>
>         +               lttng_inline_memcpy(dest, src, __len);          \<br>
>           } while (0)<br>
>           /*<br>
> <br>
>         ----------------------------------------------------------------------------------------------------<br>
>         Here is one example of a crash, where 0xf3b7eeb2 is not 8 byte<br>
>         aligned;<br>
> <br>
>            (gdb) bt full<br>
>         #0  lttng_inline_memcpy (len=8, src=0xf3b7eeb2, dest=<optimized<br>
>         out>) at<br>
>         /usr/src/debug/lttng-ust/2.9.1/git/libringbuffer/backend_internal.h:610<br>
>         No locals.<br>
>         #1  lib_ring_buffer_write (len=8, src=0xf3b7eeb2,<br>
>         ctx=0xf57c47d0, config=0xf737c560 <client_config>) at<br>
>         /usr/src/debug/lttng-ust/2.9.1/git/libringbuffer/backend.h:100<br>
>                  __len = 8<br>
>                  handle = 0xf3b2e0c0<br>
>                  backend_pages = <optimized out><br>
>                  chanb = 0xf3b2e2e0<br>
>                  offset = <optimized out><br>
>         #2  lttng_event_write (ctx=0xf57c47d0, src=0xf3b7eeb2, len=8) at<br>
>         /usr/src/debug/lttng-ust/2.9.1/git/liblttng-ust/lttng-ring-buffer-metadata-client.h:267<br>
>         No locals.<br>
>         #3  0xf7337ef8 in ustctl_write_one_packet_to_channel<br>
>         (channel=<optimized out>, metadata_str=0xf3b7eeb2 "",<br>
>         len=<optimized out>) at<br>
>         /usr/src/debug/lttng-ust/2.9.1/git/liblttng-ust-ctl/ustctl.c:1183<br>
>                  ctx = {chan = 0xf3b2e290, priv = 0x0, handle =<br>
>         0xf3b2e0c0, data_size = 8, largest_align = 1, cpu = -1, buf =<br>
>         0xf6909000, slot_size = 8, buf_offset = 163877, pre_offset =<br>
>         163877, tsc = 0, rflags = 0, ctx_len = 80, ip = 0x0, priv2 =<br>
>         0x0, padding2 = '\000' <repeats 11 times>, backend_pages =<br>
>         0xf690c000}<br>
>                  chan = 0xf3b2e4d8<br>
>                  str = 0xf3b7eeb2 ""<br>
>                  reserve_len = 8<br>
>                  ret = <optimized out><br>
>                  __func__ = '\000' <repeats 34 times><br>
>                  __PRETTY_FUNCTION__ = '\000' <repeats 34 times><br>
>         ---Type <return> to continue, or q <return> to quit---<br>
>         #4  0x000344cc in commit_one_metadata_packet<br>
>         (stream=stream@entry=0xf3b2e560) at ust-consumer.c:2206<br>
>                  write_len = <optimized out><br>
>                  ret = <optimized out><br>
>                  __PRETTY_FUNCTION__ = "commit_one_metadata_packet"<br>
>         #5  0x00036538 in lttng_ustconsumer_read_subbuffer<br>
>         (stream=stream@entry=0xf3b2e560, ctx=ctx@entry=0x25e6e8) at<br>
>         ust-consumer.c:2452<br>
>                  len = 4096<br>
>                  subbuf_size = 4093<br>
>                  padding = <optimized out><br>
>                  err = -11<br>
>                  write_index = 1<br>
>                  ret = <optimized out><br>
>                  ustream = <optimized out><br>
>                  index = {offset = 0, packet_size = 575697416355872,<br>
>         content_size = 17564043391468256584, timestamp_begin =<br>
>         17564043425827782792, timestamp_end = 34359738496,<br>
>         events_discarded = 313327092840, stream_id = 4089446416,<br>
>         stream_instance_id = 17689097291346376608, packet_seq_num =<br>
>         8589934593}<br>
>                  __PRETTY_FUNCTION__ = "lttng_ustconsumer_read_subbuffer"<br>
>                  __func__ = "lttng_ustconsumer_read_subbuffer"<br>
>         .....<br>
> <br>
>         _______________________________________________<br>
>         lttng-dev mailing list<br>
>         <a href="mailto:lttng-dev@lists.lttng.org" target="_blank" rel="noreferrer">lttng-dev@lists.lttng.org</a> <mailto:<a href="mailto:lttng-dev@lists.lttng.org" target="_blank" rel="noreferrer">lttng-dev@lists.lttng.org</a>><br>
>         <a href="https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev" rel="noreferrer noreferrer" target="_blank">https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev</a><br>
>         <<a href="https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev" rel="noreferrer noreferrer" target="_blank">https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev</a>><br>
> <br>
> <br>
>     -- <br>
>     Mathieu Desnoyers<br>
>     EfficiOS Inc.<br>
>     <a href="http://www.efficios.com" rel="noreferrer noreferrer" target="_blank">http://www.efficios.com</a> <<a href="http://www.efficios.com" rel="noreferrer noreferrer" target="_blank">http://www.efficios.com</a>><br>
> <br>
<br>
-- <br>
Mathieu Desnoyers<br>
EfficiOS Inc.<br>
<a href="https://www.efficios.com" rel="noreferrer noreferrer" target="_blank">https://www.efficios.com</a><br>
<br>
</blockquote></div></div></div>