[lttng-dev] [RFC] Per-user event ID allocation proposal

Wed Sep 12 08:35:55 EDT 2012

Mathieu Desnoyers
September 11, 2012

Per-user event ID allocation proposal

The intent of this shared event ID registry is to allow sharing tracing
buffers between applications belonging to the same user (UID) for UST
(user-space) tracing.

A.1) Overview of System-wide, per-ABI (32 and 64-bit), per-user,
     per-session, LTTng-UST event ID allocation:

- Modify LTTng-UST and lttng-tools to keep a system-wide, per-ABI (32 and
  64-bit), per-user, per-session registry of enabled events and their
  associated numeric IDs.
- LTTng-UST will have to register its tracepoints with the session
  daemon, sending the field typing of these tracepoints during
  registration,
- Dynamically check that field types match upon registration of an
  event in this global registry, and refuse registration if the field
  types do not match,
- The metadata will be generated by lttng-tools instead of the
  application.

A.2) Per-user Event ID Details:

The event ID registry is shared across all processes for a given
session/ABI/channel/user (UID). The intent is to prevent one user from
accessing tracing data from another user, while keeping the system-wide
number of buffers small.

The event ID registry is attached to a:
  - session,
  - specific ABI (32/64-bit),
  - channel,
  - user (UID).
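
As a rough illustration of this scoping, the sessiond could index one
registry per (session, ABI, channel, UID) tuple. This is only a sketch:
RegistryKey, registries and get_registry are hypothetical names, not
part of lttng-tools, and a registry is reduced here to an event name to
ID mapping.

```python
from typing import Dict, NamedTuple


class RegistryKey(NamedTuple):
    """Hypothetical key: one event ID registry exists per such tuple."""
    session: str
    abi_bits: int   # 32 or 64
    channel: str
    uid: int


# Maps a key to its registry; a registry is simply event name -> ID here.
registries: Dict[RegistryKey, Dict[str, int]] = {}


def get_registry(session: str, abi_bits: int, channel: str,
                 uid: int) -> Dict[str, int]:
    """Return the registry for this scope, creating it on first use."""
    key = RegistryKey(session, abi_bits, channel, uid)
    return registries.setdefault(key, {})


# Two applications run by the same user share a registry...
a = get_registry("mysession", 64, "chan0", 1000)
b = get_registry("mysession", 64, "chan0", 1000)
# ...while another user gets a separate one.
c = get_registry("mysession", 64, "chan0", 1001)
```

Keying on the UID is what keeps one user's event IDs (and buffers)
separate from another's, while processes of the same user share them.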

lttng-sessiond fills this registry by pulling information, as needed,
from traced processes (a.k.a. applications). This information is needed
only when an event is active for a created session. Therefore,
applications need not notify the sessiond if no session is created.

The rationale for using a "pull" scheme, where the sessiond pulls
information from applications, as opposed to a "push" scheme, where
the application would initiate commands to push the information, is that
it minimizes the amount of logic required within liblttng-ust, and it
does not require liblttng-ust to wait for a reply from lttng-sessiond.
This minimizes the impact on application behavior and makes the
application resilient to lttng-sessiond crashes.

Updates to this registry are triggered by two distinct scenarios: either
an "enable-event" command (could also be "start", depending on the
sessiond design) is being executed, or, while tracing, a library is
being loaded within the application.

Before we start describing the algorithms that update the registry, it
is _very_ important to understand that an event enabled with
"enable-event" can contain a wildcard (e.g.: libc*) and a loglevel, and
is therefore possibly associated with _many_ events in the application.
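
To make the one-to-many relationship concrete, here is a small sketch.
Everything in it is an assumption made for illustration: the tracepoint
names are invented, Python's fnmatchcase stands in for whatever wildcard
matching liblttng-ust performs, and the loglevel rule (select events
whose numeric loglevel is lower than or equal to the requested one) is
a simplification.

```python
from fnmatch import fnmatchcase
from typing import List, NamedTuple, Optional


class Tracepoint(NamedTuple):
    name: str
    loglevel: int


def matching_events(pattern: str, max_loglevel: Optional[int],
                    tracepoints: List[Tracepoint]) -> List[str]:
    """Names of the tracepoints selected by one enable-event rule."""
    return [
        tp.name for tp in tracepoints
        if fnmatchcase(tp.name, pattern)
        and (max_loglevel is None or tp.loglevel <= max_loglevel)
    ]


# Hypothetical tracepoints present in one application.
app_tracepoints = [
    Tracepoint("libc:malloc", 6),
    Tracepoint("libc:free", 6),
    Tracepoint("libc:debug_dump", 14),
    Tracepoint("myapp:request", 6),
]

# A single "libc*" rule selects three events; adding a loglevel narrows it.
all_libc = matching_events("libc*", None, app_tracepoints)
filtered = matching_events("libc*", 6, app_tracepoints)
```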

Algo (1)
When an "enable-event"/"start" command is executed, the sessiond sends
an enable-event command to the application (which applies to a channel
within a session) and gets, in return, a variable-sized array of
enabled events (remember, we can enable a wildcard!), along with their
names, loglevels, field names, and field types. The sessiond then
checks that each event does not conflict with another event in the
registry that has the same name but different field names/types or
loglevel. If its field names/types or loglevel differ from those of a
previous event, it prints a warning. If it matches a previous event, it
re-uses the same ID as the previous event. If there is no match, it
allocates a new event ID. It then sends a command to the application to
let it know the mapping between the event name and ID for the channel.
When the application receives that command, it can finally proceed to
attach the tracepoint probe to the tracepoint site. The sessiond keeps a
per-application/per-channel hash table of already enabled events, so it
does not provide the same event name/ID mapping twice for a given
channel.
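
The sessiond side of Algo (1) can be sketched as follows. This is a
minimal model, not the proposed implementation: EventDesc, Registry and
enable_events are hypothetical names, field types are plain strings,
and a conflicting event is assumed to get a warning and no ID (in line
with the "refuse registration" rule of A.1).

```python
import warnings
from typing import Dict, Iterable, List, NamedTuple, Optional, Set, Tuple


class EventDesc(NamedTuple):
    name: str
    loglevel: int
    fields: Tuple[Tuple[str, str], ...]   # (field name, field type) pairs


class Registry:
    """Hypothetical registry for one session/ABI/channel/UID."""

    def __init__(self) -> None:
        self.by_name: Dict[str, Tuple[EventDesc, int]] = {}
        self.next_id = 0

    def lookup_or_allocate(self, ev: EventDesc) -> Optional[int]:
        known = self.by_name.get(ev.name)
        if known is None:
            event_id = self.next_id       # no match: allocate a new ID
            self.next_id += 1
            self.by_name[ev.name] = (ev, event_id)
            return event_id
        known_desc, event_id = known
        if known_desc != ev:              # same name, different description
            warnings.warn(f"event {ev.name!r} conflicts with the registry")
            return None                   # assumption: no ID is handed out
        return event_id                   # match: re-use the existing ID


def enable_events(registry: Registry, already_sent: Set[str],
                  events: Iterable[EventDesc]) -> List[Tuple[str, int]]:
    """Name/ID mappings to send to one application for one channel."""
    mappings = []
    for ev in events:
        if ev.name in already_sent:       # per-app/per-channel table
            continue
        event_id = registry.lookup_or_allocate(ev)
        if event_id is not None:
            already_sent.add(ev.name)
            mappings.append((ev.name, event_id))
    return mappings


reg = Registry()
malloc_ev = EventDesc("libc:malloc", 6, (("size", "uint64"),))
free_ev = EventDesc("libc:free", 6, (("ptr", "uint64"),))
app1_sent: Set[str] = set()
app2_sent: Set[str] = set()

first = enable_events(reg, app1_sent, [malloc_ev, free_ev])
second = enable_events(reg, app2_sent, [free_ev])    # another application
repeat = enable_events(reg, app1_sent, [malloc_ev])  # already mapped
```

Here the second application receives the ID already allocated for
"libc:free", and the repeated request for the first application yields
no new mapping.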

Algo (2)
In the case where a library (.so) is being loaded in the application
while tracing, the update sequence goes as follows: the application
first checks whether any session has been created. If so, it sends a
NOTIFY_NEW_EVENTS message to the sessiond through the communication
socket (normally used to send ACKs to commands). The lttng-sessiond will
therefore need to listen to (read from) each application communication
socket, and will also need to dispatch NOTIFY_NEW_EVENTS messages each
time it expects an ACK reply for a command it has sent to the
application. When a NOTIFY_NEW_EVENTS message is received from an
application, the sessiond iterates over each session and each channel,
redoing Algo (1). The per-app/per-channel hash table that remembers
already enabled events will ensure that we don't end up enabling the
same event twice.
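
The dispatching requirement can be illustrated with a simplified model
in which messages read from the application socket are plain strings.
The function names, and the choice to set notifications aside until the
ACK has arrived, are assumptions made for this sketch; the proposal
only requires that a NOTIFY_NEW_EVENTS arriving while an ACK is expected
is not mistaken for that ACK.

```python
from collections import deque
from typing import Callable, Deque, Dict, Iterator, List

ACK = "ACK"
NOTIFY_NEW_EVENTS = "NOTIFY_NEW_EVENTS"


def wait_for_ack(incoming: Iterator[str],
                 pending: Deque[str]) -> bool:
    """Read messages until the ACK arrives, setting notifications aside."""
    for msg in incoming:
        if msg == ACK:
            return True
        if msg == NOTIFY_NEW_EVENTS:
            pending.append(msg)   # handled once the ACK has been seen
    return False                  # stream ended before the ACK arrived


def handle_notifications(pending: Deque[str],
                         sessions: Dict[str, List[str]],
                         redo_algo_1: Callable[[str, str], None]) -> None:
    """For each notification, redo Algo (1) on every session and channel."""
    while pending:
        pending.popleft()
        for session, channels in sessions.items():
            for channel in channels:
                redo_algo_1(session, channel)


# The application loads a library while the sessiond waits for an ACK.
socket_messages = iter([NOTIFY_NEW_EVENTS, ACK])
pending: Deque[str] = deque()
got_ack = wait_for_ack(socket_messages, pending)

visited = []
handle_notifications(pending,
                     {"s1": ["chan0", "chan1"], "s2": ["chan0"]},
                     lambda s, c: visited.append((s, c)))
```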

At application startup, the "registration done" message will only be
sent once all the commands setting the mapping between event name and ID
have been sent. This ensures tracing is not started until all events are
enabled (delaying the application by a configurable delay).

At library load, a "registration done" message will also be sent by the
sessiond some time after the NOTIFY_NEW_EVENTS has been received -- at
the end of Algo (1). This means that library load, within applications,
can be delayed for the same amount of time that applies to application
start (configurable).

The registry is emptied when the session is destroyed. Event IDs are
never freed, only re-used for events with the same name, after checking
that the loglevel, field names, and field types match.
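
The lifetime rule can be sketched as below. SessionRegistry and its
methods are hypothetical, and the loglevel/field match check is left
out to keep the example short. Because IDs are never freed, the number
of entries in the table is always the next free ID.

```python
from typing import Dict, Set


class SessionRegistry:
    """Hypothetical registry illustrating the ID lifetime only."""

    def __init__(self) -> None:
        self.ids: Dict[str, int] = {}     # event name -> ID, never shrinks
        self.enabled: Set[str] = set()

    def enable(self, name: str) -> int:
        # Re-use the ID for a known name, otherwise take the next one.
        event_id = self.ids.setdefault(name, len(self.ids))
        self.enabled.add(name)
        return event_id

    def disable(self, name: str) -> None:
        self.enabled.discard(name)        # the ID stays reserved

    def destroy(self) -> None:
        # Session destruction is the only point where IDs go away.
        self.ids.clear()
        self.enabled.clear()


registry = SessionRegistry()
id_a = registry.enable("app:a")
id_b = registry.enable("app:b")
registry.disable("app:a")
id_c = registry.enable("app:c")        # does not take over the ID of app:a
id_a_again = registry.enable("app:a")  # same name, same ID as before
```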

This registry is also used to generate metadata from the sessiond. The
sessiond will now be responsible for generation of the metadata stream.

-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com