[lttng-dev] Userspace tracing in docker containers
Jonathan Rajotte-Julien
jonathan.rajotte-julien at efficios.com
Tue Apr 6 10:07:53 EDT 2021
Hi,
On Mon, Apr 05, 2021 at 11:09:39AM -0700, Eqbal via lttng-dev wrote:
> Hi,
>
> I am trying to get user space tracing working for an application running in
> a docker container. I am running lttng session daemon in another container.
> I mounted the unix socket locations (either /var/run/lttng for root or
> $HOME/.lttng for another user). By doing that I can run commands like lttng
> create or lttng list <session-name>, but the tracepoint events from the
> application don't get registered and there is no trace output.
>
> I enabled LTTNG_UST_DEBUG and ran lttng-sessiond in verbose mode (-vvv and
> --verbose-consumer) and got the following error message:
>
> "*Unix socket credential pid=0. Refusing application in distinct,
> non-nested pid namespace.*"
>
> It appears that for some calls to the session daemon there is a getsockopt
> syscall made with *SO_PEERCRED* which returns 0 for pid and the call is
> failed with *LTTNG_UST_ERR_PEERCRED_PID* error (see get_cred call in
> ustctl.c).
>
> If I comment out the getsockopt call, my application tracing starts to work.
>
> From what I found, docker cannot support getsockopt/SO_PEERCRED call to get
> peer pid on the unix socket which would make sense as it's in a separate
> namespace.
>
> I have a few questions on this:
> 1. What is the reason for the get_cred/getsockopt call with SO_PEERCRED? I
> would like to understand why it's required for some and not other calls.
More information can be found in the commit that introduced the check:
commit a834901f2890deadb815d7f9e3ab79c3ba673994
Author: Mathieu Desnoyers <mathieu.desnoyers at efficios.com>
Date: Mon Oct 12 16:52:03 2020 -0400

    Fix: Use unix socket peercred for pid, uid, gid credentials

    Currently, the session daemon trust the pid, ppid, uid, and gid values
    passed by the application, but should really validate the uid using unix
    socket peercred. This fix uses the peercred values rather than the
    values provided by the application on registration for:

    - pid, uid and gid on Linux,
    - uid and gid on FreeBSD.

    This should improve how the session daemon deals with containerized
    applications on Linux as well. Applications are required to be either in
    the same pid namespace, or in a pid namespace nested within the pid
    namespace of the lttng-sessiond, so the session daemon can map the
    application pid to something meaningful within its own pid namespace.
    Applications in a unrelated (disjoint) pid namespace will be refused by
    the session daemon.

    About the uid and gid with user namespaces on Linux, those will provide
    meaningful IDs if the application user namespace is either the same as
    the user namespace of the session daemon, or a nested user namespace.
    Otherwise, the IDs will be that of /proc/sys/kernel/overflowuid and
    /proc/sys/kernel/overflowgid, which typically maps to nobody.nogroup on
    current distributions.

    Given that fetching the parent pid (ppid) of the application would
    require to use /proc/<pid>/status (which is racy wrt pid reuse), expose
    the ppid provided by the application on registration instead, but only
    in situations where the application sits in the same pid namespace as
    the session daemon (on Linux), which is detected by checking if the pid
    provided by the application matches the pid obtained using unix socket
    credentials. The ppid is only used for logging/debugging purposes in the
    session daemon anyway, so it is OK to use the value provided by the
    application for that purpose.

    Fixes: #1286
    Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers at efficios.com>
    Change-Id: I94742e57dad642106908d09e2c7e395993c2c48f
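
To make the mechanism in the commit message concrete, here is a minimal
Python sketch (Linux only) of reading SO_PEERCRED from a unix socket and
applying the pid and ppid rules described above. It is an illustration under
assumptions, not the lttng-sessiond implementation: the function names
peer_credentials and accept_registration, and the returned dictionary, are
hypothetical.

```python
import os
import socket
import struct

# Layout of Linux "struct ucred": pid_t pid, uid_t uid, gid_t gid.
UCRED_FORMAT = "iII"


def peer_credentials(sock):
    """Return (pid, uid, gid) of the peer of a connected AF_UNIX socket."""
    size = struct.calcsize(UCRED_FORMAT)
    raw = sock.getsockopt(socket.SOL_SOCKET, socket.SO_PEERCRED, size)
    return struct.unpack(UCRED_FORMAT, raw)


def accept_registration(peer_pid, claimed_pid, claimed_ppid):
    """Hypothetical policy following the commit message.

    Returns None when the application must be refused, otherwise the
    credentials the daemon would keep.
    """
    if peer_pid == 0:
        # The kernel could not express the peer pid in our pid namespace:
        # the application is in a distinct, non-nested pid namespace.
        return None
    # Only expose the application-provided ppid when the claimed pid matches
    # the kernel-provided pid, i.e. both sides share a pid namespace.
    ppid = claimed_ppid if claimed_pid == peer_pid else None
    return {"pid": peer_pid, "ppid": ppid}


if __name__ == "__main__":
    # A socketpair stands in for the application <-> daemon connection.
    daemon_side, app_side = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    with daemon_side, app_side:
        pid, uid, gid = peer_credentials(daemon_side)
        print("peercred:", pid, uid, gid)
        print("same namespace:", accept_registration(pid, os.getpid(), os.getppid()))
        print("disjoint namespace:", accept_registration(0, 1234, 1))
```

The values come from the kernel rather than from the application, which is
why the application cannot forge them.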
As for "why it's required for some and not other calls": communicating with
lttng-sessiond through the lttng CLI and registering a userspace application
are two distinct communication interfaces, and the peercred check applies to
application registration. To be honest, I'm not certain what the complete
"security" policy is for the lttng-sessiond <-> CLI interface, or whether it
should be stricter.
> 2. Is there any workaround for this problem, so that I can get this to work
> with the container topology I am working with (app in one container and
> lttng daemons in another).
Based on the commit message, lttng-ust explicitly cannot be used across
non-nested pid namespaces.
Could you give us more information on the goal of the topology you plan to use?
This could lead to further discussion and/or alternative solutions based on the
goals and constraints of your deployment.
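
As a diagnostic aid, the following Python sketch compares the pid namespaces
of two processes by reading the /proc/<pid>/ns/pid links. The helper names
are hypothetical. It assumes it is run from a place where both processes are
visible in /proc (for example the host) with enough privileges to read the
links, and it can only tell "same" from "different": it does not detect
whether one namespace is nested within the other.

```python
import os


def pid_namespace_id(pid="self"):
    """Return the pid namespace identifier of a process.

    The link target looks like "pid:[4026531836]"; two processes with the
    same target are in the same pid namespace.
    """
    return os.readlink(f"/proc/{pid}/ns/pid")


def same_pid_namespace(pid_a, pid_b):
    """True when both processes report the same pid namespace identifier."""
    return pid_namespace_id(pid_a) == pid_namespace_id(pid_b)


if __name__ == "__main__":
    print("own pid namespace:", pid_namespace_id())
    # Trivial self-check: a process always shares a namespace with itself.
    print(same_pid_namespace("self", os.getpid()))
```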
> 3. Related to 2, are there any gotchas to bypassing the getsockopt call in
> get_cred?
Based on the content of the mentioned bug (1286) [1], the principal concern is:
"
This means a non-root application could theoretically impersonate a root
application from a tracing perspective, and thus access root tracing buffers in
a per-uid configuration, which is unwanted.
"
[1] https://bugs.lttng.org/issues/1286
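
The concern can be illustrated with a small Python sketch. It is a toy model
under assumptions, not lttng code: buffer_owner_uid and its parameters are
hypothetical, and "per-uid buffers" is reduced to choosing which uid a
registering application is filed under.

```python
def buffer_owner_uid(claimed_uid, peer_uid, use_peercred):
    """Pick the uid whose per-uid buffers the application is attached to.

    claimed_uid: value sent by the application at registration (forgeable).
    peer_uid: value reported by the kernel via SO_PEERCRED (not forgeable).
    """
    return peer_uid if use_peercred else claimed_uid


if __name__ == "__main__":
    # A uid 1000 application that claims to be root (uid 0).
    print("trusting the application:", buffer_owner_uid(0, 1000, use_peercred=False))
    print("using peercred:", buffer_owner_uid(0, 1000, use_peercred=True))
```

With the getsockopt call bypassed, the daemon is back in the first case: the
application decides which uid it is filed under.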
Cheers
--
Jonathan Rajotte-Julien
EfficiOS