[lttng-dev] Using lttng-ust 2.13.6 from Yocto Kirkstone and getting weird segfault saying strlen_asimd.S can't be found.

Kienan Stewart kstewart at efficios.com
Thu Jul 18 10:27:30 EDT 2024


Hi Brian,


On 7/17/24 12:27 PM, Brian Hutchinson wrote:
> 
> 
> On Wed, Jul 17, 2024 at 10:13 AM Kienan Stewart <kstewart at efficios.com>
> wrote:
> 
>     Hi Brian,
> 
>     thanks! I'll take a look at these and see if I can get any ideas
>     from them.
> 
>     To go back to your original e-mail, a detail that I hadn't noticed in
>     your original backtrace was pointed out to me and I'd like to verify
>     with you:
> 
>     ```
>     Core was generated by `./my_app'.
>     Program terminated with signal SIGSEGV, Segmentation fault.
>     #0  __strlen_asimd () at ../sysdeps/aarch64/multiarch/strlen_asimd.S:96
>     96      ../sysdeps/aarch64/multiarch/strlen_asimd.S: No such file or
>     directory.
>     [Current thread is 1 (Thread 0xffffb8e2d040 (LWP 757))]
>     (gdb) bt
>     #0  __strlen_asimd () at ../sysdeps/aarch64/multiarch/strlen_asimd.S:96
>     #1  0x0000ffffb86c5330 in lttng_ust_tracepoint_module_register () from
>     /usr/lib/liblttng-ust-tracepoint.so.1
>     #2  0x0000aaaab4c8da18 in lttng_ust__tracepoints__ptrs_init () at
>     /opt/poky/4.0.18/sysroots/cortexa53-crypto-poky-linux/usr/include/lttng/tracepoint.h:629
>     #3  0x0000ffffb872b30c in call_init (env=<optimized out>,
>     argv=0xffffe2d66098, argc=1) at ../csu/libc-start.c:145
>     #4  __libc_start_main_impl (main=0xaaaab4bd94e0 <main>, argc=1,
>     argv=0xffffe2d66098, init=<optimized out>, fini=<optimized out>,
>     rtld_fini=<optimized out>, stack_end=<optimized out>) at
>     ../csu/libc-start.c:376
>     #5  0x0000aaaab4bd9230 in _start () at ../sysdeps/aarch64/start.S:81
>     ```
> 
>     The library loaded at frame 1 is
>     `/usr/lib/liblttng-ust-tracepoint.so.1`; however, in frame 2 the
>     reference is from
>     `/opt/poky/4.0.18/sysroots/cortexa53-crypto-poky-linux/usr/include/lttng/tracepoint.h`.
> 
> 
>     Are the LTTng libraries in `/usr/lib` the exact same version as the
>     headers in
>     `/opt/poky/4.0.18/sysroots/cortexa53-crypto-poky-linux/usr/include/lttng`?
> 
> 
> Yes.  The sdk was created by yocto using populate_sdk which includes the 
> target image sysroot in the sdk.  Hopefully I explained that right. The 
> target image drives the contents of the sdk and the generated sdk is 
> what is used to cross compile the application specifically for the 
> target we are running on (NXP imx8mm).

I think that makes sense: the debug information stores paths from the 
build environment, which is where the sysroot portion comes from.

> 
> 
>     If some of your app is compiled using one version of headers and a
>     different version of the library is loaded at runtime, there could
>     be an
>     ABI mismatch.
> 
> 
>     You could check what the include paths are during compilation and
>     `LD_LIBRARY_PATH` at runtime. Running with the environment variable
>     `LD_DEBUG=bindings,libs` can also help (see `man ld.so` for more info).
> 
> I've been down that road a few times before with the same line of 
> thinking (ABI issue) and believe I've proven I don't have those issues.  
> That's one of the reasons I hand built lttng-ust in native environment 
> (on the actual target) to verify it wasn't an issue with a yocto recipe, 
> cross compiling etc..
> 
> I've been thru this process with different OS versions (yocto Dunfell 
> before), kernel version, toolchain version etc., and now the same with 
> Kirkstone based components.
> 
>     In any case, I'll take a look over the most recent backtraces and
>     see if
>     anything else jumps out.
> 

I don't see anything more that stands out.

I'm going to step back a bit to make sure I have a correct understanding 
of your situation.

Based on your previous statements, it sounded like you weren't sure if 
your application is statically linked or not. This is different than 
using a statically linked probe provider.

You can verify by running `ldd /path/to/my_app`. lttng-ust doesn't 
support being statically linked; it is always loaded dynamically. The 
tracepoint provider packages (TPPs) may be statically linked, which is 
different.

You mentioned earlier that the build scenario you were using 
resembles the hello world example. To confirm my understanding, do you 
mean the hello-static-lib example in lttng-ust/doc/examples? In the 
docs, this is called "The instrumented application is statically linked 
with the tracepoint provider package archive file". Note that the final 
application still uses dynamic linking (`-ldl -llttng-ust`, and no 
`-static` when creating the final executable).
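
For what it's worth, the `ldd` check can be scripted. This is only a 
rough sketch: the function name `classify_ldd_output` is made up for 
illustration, and it assumes glibc's `ldd` wording ("not a dynamic 
executable" or "statically linked"), which may differ with other C 
libraries.

```shell
# Classify the text printed by `ldd /path/to/my_app`, read from stdin.
# Prints one of: static, dynamic-with-lttng-ust, dynamic-without-lttng-ust.
# (Hypothetical helper; the matched strings assume glibc's ldd.)
classify_ldd_output() {
    output=$(cat)
    case "$output" in
        *"not a dynamic executable"*|*"statically linked"*)
            echo "static" ;;
        *liblttng-ust*)
            echo "dynamic-with-lttng-ust" ;;
        *)
            echo "dynamic-without-lttng-ust" ;;
    esac
}
```

It would be used as `ldd /path/to/my_app 2>&1 | classify_ldd_output`.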

Further along in stepping back:

  - Does make check for lttng-ust pass in your environment?
  - Does make check for lttng-tools pass in your environment?
  - Is this reproducible in non-yocto environments or on other 
architectures with the same project?
  - Does running the traced application with `LTTNG_UST_DEBUG=1` yield 
more information?
  - I'd also run lttng-sessiond with the environment variable 
`LTTNG_UST_DEBUG=1` set and `-vvv --verbose-consumer` in the program 
arguments, and capture stdout & stderr into a file for analysis.
  - Verify if the main program is dynamically linked with `ldd`.
  - Verify which libraries are loaded at runtime and which calls are 
shimmed with `LD_DEBUG=libs,bindings`.
  - Review in detail which gcc commands are executed to produce the 
tracepoint provider and link it to the main executable.
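
To make the `LD_DEBUG` check easier to read, something like the 
following could pull the LTTng-UST libraries out of the loader's output. 
It's a sketch that assumes the glibc `calling init: <path>` line format 
printed by `LD_DEBUG=libs`; `lttng_libs_from_ld_debug` is a hypothetical 
helper name.

```shell
# Read LD_DEBUG=libs output on stdin and print the unique paths of the
# LTTng-UST libraries whose initializers were called.
# (Hypothetical helper; assumes glibc's "calling init: <path>" lines.)
lttng_libs_from_ld_debug() {
    sed -n 's/^.*calling init: \(.*liblttng-ust[^ ]*\)$/\1/p' | LC_ALL=C sort -u
}
```

It would be used as `LD_DEBUG=libs ./my_app 2>&1 | lttng_libs_from_ld_debug`. 
If the same library shows up under two different paths, that would point 
towards a mismatch.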

My current understanding is that the statedump tracepoints are 
registered and those events communicated before the main program's init 
is run (at least for C programs). If you do the following test, I think 
you should be able to see the statedumps. Could you confirm if you have 
them or not?


```
LTTNG_UST_DEBUG=1 lttng-sessiond -vvv --verbose-consumer &> /tmp/sessiond.log &
lttng create
lttng enable-event -u --all
lttng start

unset LTTNG_UST_WITHOUT_BADDR_STATEDUMP
unset LTTNG_UST_WITHOUT_PROCNAME_STATEDUMP
LTTNG_UST_DEBUG=1 LTTNG_UST_REGISTER_TIMEOUT=-1 ./my_app
# Segfault?

lttng stop
lttng view # Here you should have the statedump events
killall lttng-sessiond
```
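
A quick way to answer the statedump question, as a sketch: count the 
events whose name starts with `lttng_ust_statedump:` in the text output. 
The helper name `count_statedump_events` is hypothetical, and this 
assumes the default text output of `lttng view`.

```shell
# Count lines on stdin that mention an lttng_ust_statedump event.
# (Hypothetical helper; assumes one event per line of text output.)
count_statedump_events() {
    # grep -c prints 0 and exits non-zero when nothing matches,
    # so "|| true" keeps the function's exit status at 0.
    grep -c 'lttng_ust_statedump:' || true
}
```

It would be used as `lttng view | count_statedump_events`; a result of 0 
would mean no statedump events were recorded.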

> 
> Thanks!  I'm running out of things to try.

Ultimately, given the bespoke environment, build steps, and application, 
it's tough to diagnose a lot of the things that we would go over with a 
fine-tooth comb: seeing how the TPPs are built and linked, seeing how the 
application is built and linked, analyzing what's happening at runtime, 
having a coredump that allows us to see the variables, etc.

If you can provide a minimal reproducer, it's possible to dig further 
into it.

Otherwise, looking in more detail at your specific project would be 
covered under a service contract (for which we can sign NDAs, etc. as 
needed). Feel free to reach out to sales at efficios.com to organise that.

> 
> I don't fully understand the implications of some of this documentation, 
> but I have learned enough to know LD_PRELOAD means nothing if everything 
> is static built.  Our application is heavily multi threaded, uses fork, 
> clone and who knows if it's doing double closes or not ... so I've tried 
> these LD_PRELOAD "helpers" or whatever they are, before and didn't get a 
> different result, but now know it's only for shared library object.  So 
> now I will experiment with making our tpp a shared object and try 
> LD_PRELOAD of fork, fd and pthread helpers and starting my app that way.

LD_PRELOAD has an effect if the application you are running is 
dynamically linked (note: the TPP can be statically linked inside a 
dynamically linked executable, and the preloads are still useful and/or 
needed in that case).

The function of the helpers is to ensure that various system calls 
don't clobber things in use by lttng-ust. E.g., close() is shimmed by 
liblttng-ust-fd.so so that FDs that lttng-ust uses aren't suddenly shut 
by the program.
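
When more than one helper is needed, `LD_PRELOAD` takes a 
colon-separated list. A small sketch for assembling it (`join_preload` 
is a hypothetical helper, not part of LTTng):

```shell
# Join all arguments with ':' to form an LD_PRELOAD value.
# (Hypothetical helper for illustration only.)
join_preload() {
    result=""
    for lib in "$@"; do
        # Prepend "<result>:" only when result is already non-empty.
        result="${result:+$result:}$lib"
    done
    printf '%s\n' "$result"
}
```

For example, 
`LD_PRELOAD=$(join_preload liblttng-ust-fork.so liblttng-ust-fd.so) ./my_app` 
would preload both the fork and fd helpers.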

If you want to see it in action, you could try an experiment with a 
slightly modified hello-static-lib application, with instructions in the 
top comment: 
https://gist.github.com/kienanstewart/299eb6a511d92d458569b210e4a418a8

The effect of preloads such as liblttng-ust-fd.so is independent of 
how the TPP is built and linked to the application or library.

thanks,
kienan

> 
> This is what I'm referring to above from the lttng docs:
> 
> 
>           Use LTTng-UST with daemons
>           <https://lttng.org/docs/v2.13/#doc-using-lttng-ust-with-daemons>
> 
> If your instrumented application calls fork(2) 
> <https://man7.org/linux/man-pages/man2/fork.2.html>, clone(2) 
> <https://man7.org/linux/man-pages/man2/clone.2.html>, or BSD’s rfork(2) 
> <http://www.freebsd.org/cgi/man.cgi?query=rfork&sektion=2&manpath=FreeBSD+4.10-RELEASE>, without a following exec(3) <https://man7.org/linux/man-pages/man3/exec.3.html>-family system call, you must preload the |liblttng-ust-fork.so| shared object when you start the application.
> 
> LD_PRELOAD=liblttng-ust-fork.so ./my-app
> 
> If your tracepoint provider package is a shared library which you also 
> preload, you must put both shared objects in |LD_PRELOAD|:
> 
> LD_PRELOAD=liblttng-ust-fork.so:/path/to/tp.so ./my-app
> 
> 
>           Use LTTng-UST with applications which close file descriptors
>           that don’t belong to them
>           <https://lttng.org/docs/v2.13/#doc-liblttng-ust-fd>
> 
> Since 2.9
> 
> If your instrumented application closes one or more file descriptors 
> which it did not open itself, you must preload the |liblttng-ust-fd.so| 
> shared object when you start the application:
> 
> LD_PRELOAD=liblttng-ust-fd.so ./my-app
> 
> Typical use cases include closing all the file descriptors after fork(2) 
> <https://man7.org/linux/man-pages/man2/fork.2.html> or rfork(2) 
> <http://www.freebsd.org/cgi/man.cgi?query=rfork&sektion=2&manpath=FreeBSD+4.10-RELEASE> and buggy applications doing “double closes”.
> 
> 
> Thanks,
> 
> Brian
> 
> 

