<div dir="auto"><div>Hi Matthieu,<div dir="auto"><br></div><div dir="auto">I've retired and no longer have access to any arch64 target to test it on. </div><div dir="auto"><br></div><div dir="auto">Regards</div><div dir="auto">Anders</div><br><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">Den ons 25 jan. 2023 13:25Mathieu Desnoyers <<a href="mailto:mathieu.desnoyers@efficios.com">mathieu.desnoyers@efficios.com</a>> skrev:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi Anders,<br>
<br>
Sorry for the long delay on this one, can you have a look at the following fix ?<br>
<br>
<a href="https://review.lttng.org/c/lttng-ust/+/9319" rel="noreferrer noreferrer" target="_blank">https://review.lttng.org/c/lttng-ust/+/9319</a> Fix: aarch64: do not perform unaligned stores<br>
<br>
If it passes your testing, I'll merge this into lttng-ust.<br>
<br>
Thanks,<br>
<br>
Mathieu<br>
<br>
On 2017-12-28 09:13, Anders Wallin wrote:<br>
> Hi Mathieu,<br>
> <br>
> I finally got some time to dig into this issue. The crash only happens <br>
> when metadata is written AND the size of the metadata will end up in a <br>
> write that is 8,4,2 or 1 bytes long AND<br>
> that the source or destination is not aligned correctly according to HW <br>
> limitation. I have not found any simple way to keep the performance <br>
> enhancement code that is run most of the time.<br>
> Maybe the metadata writes should have it's own write function instead.<br>
> <br>
> Here is an example of a crash (code is from lttng-ust 2.9.1 and <br>
> lttng-tools 2.9.6) where the size is 8 bytes and the src address is <br>
> unaligned at 0xf3b7eeb2;<br>
> <br>
> #0 lttng_inline_memcpy (len=8, src=0xf3b7eeb2, dest=<optimized out>) at <br>
> /usr/src/debug/lttng-ust/2.9.1/git/libringbuffer/backend_internal.h:610<br>
> No locals.<br>
> #1 lib_ring_buffer_write (len=8, src=0xf3b7eeb2, ctx=0xf57c47d0, <br>
> config=0xf737c560 <client_config>) at <br>
> /usr/src/debug/lttng-ust/2.9.1/git/libringbuffer/backend.h:100<br>
> __len = 8<br>
> handle = 0xf3b2e0c0<br>
> backend_pages = <optimized out><br>
> chanb = 0xf3b2e2e0<br>
> offset = <optimized out><br>
> <br>
> #2 lttng_event_write (ctx=0xf57c47d0, src=0xf3b7eeb2, len=8) at <br>
> /usr/src/debug/lttng-ust/2.9.1/git/liblttng-ust/lttng-ring-buffer-metadata-client.h:267<br>
> No locals.<br>
> <br>
> #3 0xf7337ef8 in ustctl_write_one_packet_to_channel (channel=<optimized <br>
> out>, metadata_str=0xf3b7eeb2 "", len=<optimized out>) at <br>
> /usr/src/debug/lttng-ust/2.9.1/git/liblttng-ust-ctl/ustctl.c:1183<br>
> ctx = {chan = 0xf3b2e290, priv = 0x0, handle = 0xf3b2e0c0, <br>
> data_size = 8, largest_align = 1, cpu = -1, buf = 0xf6909000, slot_size <br>
> = 8, buf_offset = 163877, pre_offset = 163877, tsc = 0, rflags = 0, <br>
> ctx_len = 80, ip = 0x0, priv2 = 0x0, padding2 = '\000' <repeats 11 <br>
> times>, backend_pages = 0xf690c000}<br>
> chan = 0xf3b2e4d8<br>
> str = 0xf3b7eeb2 ""<br>
> reserve_len = 8<br>
> ret = <optimized out><br>
> __func__ = '\000' <repeats 34 times><br>
> __PRETTY_FUNCTION__ = '\000' <repeats 34 times><br>
> ---Type <return> to continue, or q <return> to quit---<br>
> <br>
> #4 0x000344cc in commit_one_metadata_packet <br>
> (stream=stream@entry=0xf3b2e560) at ust-consumer.c:2206<br>
> write_len = <optimized out><br>
> ret = <optimized out><br>
> __PRETTY_FUNCTION__ = "commit_one_metadata_packet"<br>
> <br>
> #5 0x00036538 in lttng_ustconsumer_read_subbuffer <br>
> (stream=stream@entry=0xf3b2e560, ctx=ctx@entry=0x25e6e8) at <br>
> ust-consumer.c:2452<br>
> len = 4096<br>
> subbuf_size = 4093<br>
> padding = <optimized out><br>
> err = -11<br>
> write_index = 1<br>
> ret = <optimized out><br>
> ustream = <optimized out><br>
> index = {offset = 0, packet_size = 575697416355872, <br>
> content_size = 17564043391468256584, timestamp_begin = <br>
> 17564043425827782792, timestamp_end = 34359738496,<br>
> Regards<br>
> Anders<br>
> <br>
> fre 24 nov. 2017 kl 20:18 skrev Mathieu Desnoyers <br>
> <<a href="mailto:mathieu.desnoyers@efficios.com" target="_blank" rel="noreferrer">mathieu.desnoyers@efficios.com</a> <mailto:<a href="mailto:mathieu.desnoyers@efficios.com" target="_blank" rel="noreferrer">mathieu.desnoyers@efficios.com</a>>>:<br>
> <br>
> ----- On Nov 24, 2017, at 3:23 AM, Anders Wallin <<a href="mailto:wallinux@gmail.com" target="_blank" rel="noreferrer">wallinux@gmail.com</a><br>
> <mailto:<a href="mailto:wallinux@gmail.com" target="_blank" rel="noreferrer">wallinux@gmail.com</a>>> wrote:<br>
> <br>
> Hi,<br>
> architectures that has memory alignment restrictions may/will<br>
> fail with the<br>
> optimization done in 51b8f2fa2b972e62117caa946dd3e3565b6ca4a3.<br>
> Please revert the patch or make it X86 specific.<br>
> <br>
> <br>
> Hi Anders,<br>
> <br>
> This was added in the development cycle of lttng-ust 2.9. We could<br>
> perhaps<br>
> add a test on the pointer alignment for architectures that care<br>
> about it, and<br>
> fallback to memcpy in those cases.<br>
> <br>
> The revert approach would have been justified if this commit had<br>
> been backported<br>
> as a "fix" to a stable branch, which is not the case here. We should<br>
> work on<br>
> finding an acceptable solution that takes care of dealing with<br>
> unaligned pointers<br>
> on architectures that care about the difference.<br>
> <br>
> Thanks,<br>
> <br>
> Mathieu<br>
> <br>
> <br>
> <br>
> Regards<br>
> <br>
> Anders Wallin<br>
> ------------------------------------------------------------------------------------------------------------<br>
> commit 51b8f2fa2b972e62117caa946dd3e3565b6ca4a3<br>
> Author: Mathieu Desnoyers <<a href="mailto:mathieu.desnoyers@efficios.com" target="_blank" rel="noreferrer">mathieu.desnoyers@efficios.com</a><br>
> <mailto:<a href="mailto:mathieu.desnoyers@efficios.com" target="_blank" rel="noreferrer">mathieu.desnoyers@efficios.com</a>>><br>
> Date: Sun Sep 25 12:31:11 2016 -0400<br>
> <br>
> Performance: implement lttng_inline_memcpy<br>
> Because all length parameters received for serializing data<br>
> coming from<br>
> applications go through a callback, they are never<br>
> constant, and it<br>
> hurts performance to perform a call to memcpy each time.<br>
> Signed-off-by: Mathieu Desnoyers<br>
> <<a href="mailto:mathieu.desnoyers@efficios.com" target="_blank" rel="noreferrer">mathieu.desnoyers@efficios.com</a><br>
> <mailto:<a href="mailto:mathieu.desnoyers@efficios.com" target="_blank" rel="noreferrer">mathieu.desnoyers@efficios.com</a>>><br>
> <br>
> diff --git a/libringbuffer/backend_internal.h<br>
> b/libringbuffer/backend_internal.h<br>
> index 90088b89..e597cf4d 100644<br>
> --- a/libringbuffer/backend_internal.h<br>
> +++ b/libringbuffer/backend_internal.h<br>
> @@ -592,6 +592,28 @@ int update_read_sb_index(const struct<br>
> lttng_ust_lib_ring_buffer_config *config,<br>
> #define inline_memcpy(dest, src, n) memcpy(dest, src, n)<br>
> #endif<br>
> +static inline __attribute__((always_inline))<br>
> +void lttng_inline_memcpy(void *dest, const void *src,<br>
> + unsigned long len)<br>
> +{<br>
> + switch (len) {<br>
> + case 1:<br>
> + *(uint8_t *) dest = *(const uint8_t *) src;<br>
> + break;<br>
> + case 2:<br>
> + *(uint16_t *) dest = *(const uint16_t *) src;<br>
> + break;<br>
> + case 4:<br>
> + *(uint32_t *) dest = *(const uint32_t *) src;<br>
> + break;<br>
> + case 8:<br>
> + *(uint64_t *) dest = *(const uint64_t *) src;<br>
> + break;<br>
> + default:<br>
> + inline_memcpy(dest, src, len);<br>
> + }<br>
> +}<br>
> +<br>
> /*<br>
> * Use the architecture-specific memcpy implementation for<br>
> constant-sized<br>
> * inputs, but rely on an inline memcpy for length statically<br>
> unknown.<br>
> @@ -603,7 +625,7 @@ do { <br>
> \<br>
> if (__builtin_constant_p(len)) \<br>
> memcpy(dest, src, __len); \<br>
> else \<br>
> - inline_memcpy(dest, src, __len); \<br>
> + lttng_inline_memcpy(dest, src, __len); \<br>
> } while (0)<br>
> /*<br>
> <br>
> ----------------------------------------------------------------------------------------------------<br>
> Here is one example of a crash, where 0xf3b7eeb2 is not 8 byte<br>
> aligned;<br>
> <br>
> (gdb) bt full<br>
> #0 lttng_inline_memcpy (len=8, src=0xf3b7eeb2, dest=<optimized<br>
> out>) at<br>
> /usr/src/debug/lttng-ust/2.9.1/git/libringbuffer/backend_internal.h:610<br>
> No locals.<br>
> #1 lib_ring_buffer_write (len=8, src=0xf3b7eeb2,<br>
> ctx=0xf57c47d0, config=0xf737c560 <client_config>) at<br>
> /usr/src/debug/lttng-ust/2.9.1/git/libringbuffer/backend.h:100<br>
> __len = 8<br>
> handle = 0xf3b2e0c0<br>
> backend_pages = <optimized out><br>
> chanb = 0xf3b2e2e0<br>
> offset = <optimized out><br>
> #2 lttng_event_write (ctx=0xf57c47d0, src=0xf3b7eeb2, len=8) at<br>
> /usr/src/debug/lttng-ust/2.9.1/git/liblttng-ust/lttng-ring-buffer-metadata-client.h:267<br>
> No locals.<br>
> #3 0xf7337ef8 in ustctl_write_one_packet_to_channel<br>
> (channel=<optimized out>, metadata_str=0xf3b7eeb2 "",<br>
> len=<optimized out>) at<br>
> /usr/src/debug/lttng-ust/2.9.1/git/liblttng-ust-ctl/ustctl.c:1183<br>
> ctx = {chan = 0xf3b2e290, priv = 0x0, handle =<br>
> 0xf3b2e0c0, data_size = 8, largest_align = 1, cpu = -1, buf =<br>
> 0xf6909000, slot_size = 8, buf_offset = 163877, pre_offset =<br>
> 163877, tsc = 0, rflags = 0, ctx_len = 80, ip = 0x0, priv2 =<br>
> 0x0, padding2 = '\000' <repeats 11 times>, backend_pages =<br>
> 0xf690c000}<br>
> chan = 0xf3b2e4d8<br>
> str = 0xf3b7eeb2 ""<br>
> reserve_len = 8<br>
> ret = <optimized out><br>
> __func__ = '\000' <repeats 34 times><br>
> __PRETTY_FUNCTION__ = '\000' <repeats 34 times><br>
> ---Type <return> to continue, or q <return> to quit---<br>
> #4 0x000344cc in commit_one_metadata_packet<br>
> (stream=stream@entry=0xf3b2e560) at ust-consumer.c:2206<br>
> write_len = <optimized out><br>
> ret = <optimized out><br>
> __PRETTY_FUNCTION__ = "commit_one_metadata_packet"<br>
> #5 0x00036538 in lttng_ustconsumer_read_subbuffer<br>
> (stream=stream@entry=0xf3b2e560, ctx=ctx@entry=0x25e6e8) at<br>
> ust-consumer.c:2452<br>
> len = 4096<br>
> subbuf_size = 4093<br>
> padding = <optimized out><br>
> err = -11<br>
> write_index = 1<br>
> ret = <optimized out><br>
> ustream = <optimized out><br>
> index = {offset = 0, packet_size = 575697416355872,<br>
> content_size = 17564043391468256584, timestamp_begin =<br>
> 17564043425827782792, timestamp_end = 34359738496,<br>
> events_discarded = 313327092840, stream_id = 4089446416,<br>
> stream_instance_id = 17689097291346376608, packet_seq_num =<br>
> 8589934593}<br>
> __PRETTY_FUNCTION__ = "lttng_ustconsumer_read_subbuffer"<br>
> __func__ = "lttng_ustconsumer_read_subbuffer"<br>
> .....<br>
> <br>
> _______________________________________________<br>
> lttng-dev mailing list<br>
> <a href="mailto:lttng-dev@lists.lttng.org" target="_blank" rel="noreferrer">lttng-dev@lists.lttng.org</a> <mailto:<a href="mailto:lttng-dev@lists.lttng.org" target="_blank" rel="noreferrer">lttng-dev@lists.lttng.org</a>><br>
> <a href="https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev" rel="noreferrer noreferrer" target="_blank">https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev</a><br>
> <<a href="https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev" rel="noreferrer noreferrer" target="_blank">https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev</a>><br>
> <br>
> <br>
> -- <br>
> Mathieu Desnoyers<br>
> EfficiOS Inc.<br>
> <a href="http://www.efficios.com" rel="noreferrer noreferrer" target="_blank">http://www.efficios.com</a> <<a href="http://www.efficios.com" rel="noreferrer noreferrer" target="_blank">http://www.efficios.com</a>><br>
> <br>
<br>
-- <br>
Mathieu Desnoyers<br>
EfficiOS Inc.<br>
<a href="https://www.efficios.com" rel="noreferrer noreferrer" target="_blank">https://www.efficios.com</a><br>
<br>
</blockquote></div></div></div>