[lttng-dev] [RFC] Per-user event ID allocation proposal

Thu Sep 13 10:29:52 EDT 2012

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

Hi Mathieu,

This looks good! I have some questions to clarify part of the RFC.

Mathieu Desnoyers:
> Mathieu Desnoyers September 11, 2012
> 
> Per-user event ID allocation proposal
> 
> The intent of this shared event ID registry is to allow sharing
> tracing buffers between applications belonging to the same user
> (UID) for UST (user-space) tracing.
> 
> A.1) Overview of System-wide, per-ABI (32 and 64-bit), per-user, 
> per-session, LTTng-UST event ID allocation:
> 
> - Modify LTTng-UST and lttng-tools to keep a system-wide, per-ABI
>   (32 and 64-bit), per-user, per-session registry of enabled events
>   and their associated numeric IDs.
> - LTTng-UST will have to register its tracepoints to the session
>   daemon, sending the field typing of these tracepoints during
>   registration.
> - Dynamically check that field types match upon registration of an
>   event in this global registry; refuse registration if the field
>   types do not match.
> - The metadata will be generated by lttng-tools instead of the
>   application.
> 
> A.2) Per-user Event ID Details:
> 
> The event ID registry is shared across all processes for a given 
> session/ABI/channel/user (UID). The intent is to forbid one user
> to access tracing data from another user, while keeping the
> system-wide number of buffers small.
> 
> The event ID registry is attached to a:
> - session,
> - specific ABI (32/64-bit),
> - channel,
> - user (UID).
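
As a rough illustration of the keying described above, here is a minimal sketch, assuming the registry is looked up by a (session, ABI, channel, UID) tuple. The names RegistryKey, RegistryTable and lookup are hypothetical and do not come from lttng-tools.

```python
from collections import namedtuple

# Hypothetical key: one registry per (session, ABI, channel, UID) tuple.
RegistryKey = namedtuple("RegistryKey", ["session", "abi", "channel", "uid"])


class RegistryTable:
    """Maps each RegistryKey to its own event name -> event ID dictionary."""

    def __init__(self):
        self._registries = {}

    def lookup(self, session, abi, channel, uid):
        key = RegistryKey(session, abi, channel, uid)
        # Create the registry lazily, the first time the tuple is seen.
        return self._registries.setdefault(key, {})


table = RegistryTable()
app_a = table.lookup("s1", 64, "chan0", 1000)
app_b = table.lookup("s1", 64, "chan0", 1000)       # same user, same ABI
other_user = table.lookup("s1", 64, "chan0", 1001)  # different UID
other_abi = table.lookup("s1", 32, "chan0", 1000)   # different ABI
```

With this keying, two applications of the same user and ABI end up in the same registry, while a different UID or ABI gets its own.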
> 
> lttng-sessiond fills this registry by pulling information as
> needed from traced processes (a.k.a. applications). This
> information is needed only when an event is active for a created
> session. Therefore, applications need not notify the sessiond if
> no session is created.
> 
> The rationale for using a "pull" scheme, where the sessiond pulls
> information from applications, as opposed to a "push" scheme,
> where applications would initiate commands to push the
> information, is that it minimizes the amount of logic required
> within liblttng-ust, and it does not require liblttng-ust to wait
> for a reply from lttng-sessiond. This minimizes the impact on
> application behavior and keeps applications resilient to an
> lttng-sessiond crash.
> 
> Updates to this registry are triggered by two distinct scenarios:
> either an "enable-event" command (could also be "start", depending
> on the sessiond design) is being executed, or, while tracing, a
> library is being loaded within the application.
> 
> Before we start describing the algorithms that update the registry,
> it is _very_ important to understand that an event enabled with 
> "enable-event" can contain a wildcard (e.g.: libc*) and loglevel,
> and therefore is associated to possibly _many_ events in the
> application.
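
To make the one-rule-to-many-events point concrete, here is a minimal sketch. It assumes that a trailing '*' matches any suffix, that lower loglevel numbers are more severe, and that a rule selects every level at or below its threshold. The function names and the sample tracepoints are hypothetical.

```python
def matches(pattern, name):
    """Assumed semantics: a trailing '*' matches any suffix, else exact match."""
    if pattern.endswith("*"):
        return name.startswith(pattern[:-1])
    return name == pattern


def expand(pattern, max_loglevel, app_events):
    """Return the names of application events selected by one enable-event rule.

    app_events maps event name -> loglevel. Lower numbers are assumed to be
    more severe, and the rule is assumed to select levels <= max_loglevel.
    """
    return sorted(
        name
        for name, level in app_events.items()
        if matches(pattern, name) and level <= max_loglevel
    )


# Hypothetical tracepoints exposed by one application.
app_events = {
    "libc:malloc": 13,
    "libc:free": 13,
    "libc:debug_dump": 14,
    "myapp:start": 6,
}
selected = expand("libc*", 13, app_events)
```

Here a single "libc*" rule selects two events and skips the third because of its loglevel, so the sessiond has to be ready to receive several event descriptions for one command.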
> 
> Algo (1) When an "enable-event"/"start" command is executed, the
> sessiond will get, in return for sending an enable-event command to
> the application (which applies to a channel within a session), a
> variable-sized array of enabled events (remember, we can enable a
> wildcard!), along with their name, loglevel, field names, and field
> types. The sessiond proceeds to check that each event does not
> conflict with another event in the registry with the same name, but
> having different field names/types or loglevel. If its field
> names/types or loglevel differ from a previous event, it prints a
> warning. If it matches a previous event, it re-uses the same ID as
> the previous event. If there is no match, it allocates a new event
> ID. It then sends a command to the application to let it know the
> mapping between the event name and ID for the channel. When the
> application receives that command, it can finally proceed to attach
> the tracepoint probe to the tracepoint site.

> The sessiond keeps a per-application/per-channel hash table of 
> already enabled events, so it does not provide the same event
> name/id mapping twice for a given channel.

And per-session?

From what I understand of this proposal, an event is associated with
per-user/per-session/per-app/per-channel values.

(I have a question at the end about how an event ID should be generated)
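
One possible reading of Algo (1) and of the per-application/per-channel table, as a minimal sketch. It assumes sequential IDs per registry and assumes that a conflicting description is refused after the warning (as A.1 suggests); ChannelRegistry and enable_for_app are hypothetical names, not the actual sessiond code.

```python
import sys


class ChannelRegistry:
    """Event name -> (signature, ID) for one session/ABI/channel/UID tuple."""

    def __init__(self):
        self._events = {}
        self._next_id = 0

    def register(self, name, loglevel, fields):
        """Return the event ID, or None when the description conflicts.

        fields is a sequence of (field_name, field_type) pairs.
        """
        signature = (loglevel, tuple(fields))
        known = self._events.get(name)
        if known is None:
            # No match: allocate a new ID. IDs are never freed.
            event_id = self._next_id
            self._next_id += 1
            self._events[name] = (signature, event_id)
            return event_id
        known_signature, event_id = known
        if known_signature != signature:
            print("warning: event %r conflicts with a registered event" % name,
                  file=sys.stderr)
            return None  # assumption: refuse the registration
        return event_id  # match: re-use the same ID


def enable_for_app(registry, sent, events):
    """Return the (name, ID) mappings to send to one app for one channel.

    sent is the per-application/per-channel set of names already mapped,
    so the same mapping is never sent twice.
    """
    to_send = []
    for name, loglevel, fields in events:
        if name in sent:
            continue
        event_id = registry.register(name, loglevel, fields)
        if event_id is not None:
            sent.add(name)
            to_send.append((name, event_id))
    return to_send


registry = ChannelRegistry()
app1_sent, app2_sent = set(), set()
events = [("libc:malloc", 13, [("size", "size_t")])]
first = enable_for_app(registry, app1_sent, events)
again = enable_for_app(registry, app1_sent, events)
second_app = enable_for_app(registry, app2_sent, events)
```

In this sketch the registry is shared between applications, while the "already sent" set is per application and per channel. Whether that set also has to be per session is exactly the question above.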

> 
> Algo (2) In the case where a library (.so) is being loaded in the
> application while tracing, the update sequence goes as follows: the
> application first checks if there is any session created. If so, it
> sends a NOTIFY_NEW_EVENTS message to the sessiond through the
> communication socket (normally used to send ACK to commands). The
> lttng-sessiond will therefore need to listen (read) to each
> application communication socket, and will also need to dispatch
> NOTIFY_NEW_EVENTS messages each time it expects an ACK reply for a
> command it has sent to the application.

Going back to the last sentence, can you explain or clarify the
mechanism of "dispatching a NOTIFY_NEW_EVENTS" each time an ACK reply
is expected? Do you mean that each time we are waiting for an ACK, if
we get a NOTIFY instead (which could happen due to a race between
notification and command handling), you will launch a NOTIFY code path
where the session daemon checks the events hash table and checks for
event(s) to pull from the UST tracer? If so, what about getting the
real ACK after that?
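
To make the question concrete, here is a minimal sketch of one possible interpretation, not a statement of the intended design. It assumes the sessiond keeps reading the socket until the real ACK arrives and only counts any NOTIFY_NEW_EVENTS seen in between, so that Algo (1) is redone after the pending command has completed. The message constants and wait_for_ack are hypothetical, and a deque stands in for the socket.

```python
from collections import deque

ACK = "ACK"
NOTIFY_NEW_EVENTS = "NOTIFY_NEW_EVENTS"


def wait_for_ack(socket_messages):
    """Consume messages from one application socket until an ACK arrives.

    Returns (got_ack, pending_notifications). Notifications that raced
    with the command are only counted here, so the caller can handle them
    once the current command/ACK exchange is finished.
    """
    pending_notifications = 0
    while socket_messages:
        message = socket_messages.popleft()
        if message == ACK:
            return True, pending_notifications
        if message == NOTIFY_NEW_EVENTS:
            pending_notifications += 1
    # The stream ended before the ACK was seen.
    return False, pending_notifications


# The application sent a notification just before acknowledging a command.
inbox = deque([NOTIFY_NEW_EVENTS, ACK])
got_ack, pending = wait_for_ack(inbox)
```

Deferring the notification avoids starting a new command exchange on the same socket while an ACK is still outstanding.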

> When a NOTIFY_NEW_EVENTS is received from an application, the
> sessiond iterates on each session, each channel, redoing Algo (1). 
> The per-app/per-channel hash table that remembers already enabled
> events will ensure that we don't end up enabling the same event
> twice.
> 
> At application startup, the "registration done" message will only
> be sent once all the commands setting the mapping between event
> name and ID are sent. This ensures tracing is not started until all
> events are enabled (delaying the application for a configurable
> delay).
> 
> At library load, a "registration done" will also be sent by the
> sessiond some time after the NOTIFY_NEW_EVENTS has been received --
> at the end of Algo (1). This means that library load, within
> applications, can be delayed for the same amount of time that
> applies to application start (configurable).
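
The bounded delay on the application side could look like the following minimal sketch. It assumes the application simply proceeds once the configurable delay expires, and it uses a threading.Event as a stand-in for the "registration done" message; wait_registration_done is a hypothetical name.

```python
import threading


def wait_registration_done(done, timeout_s):
    """Block until "registration done" is signalled or the delay expires.

    Returns True if the signal arrived in time. Either way the caller
    proceeds (an assumption), so a stalled sessiond cannot hang the
    application or the library load.
    """
    return done.wait(timeout_s)


done = threading.Event()
# Stand-in for the sessiond: signals completion shortly after start.
timer = threading.Timer(0.01, done.set)
timer.start()
arrived = wait_registration_done(done, 1.0)
timer.join()

# No sessiond answer at all: the wait gives up after the delay.
expired = wait_registration_done(threading.Event(), 0.01)
```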
> 
> The registry is emptied when the session is destroyed. Event IDs
> are never freed, only re-used for events with the same name, after
> loglevel, field name and field type match check.

Does this mean that event IDs should be some kind of hash over a
combination of the event's values, to make sure each ID is unique on a
per-event/per-channel/per-session basis? (considering the sessiond
should keep them in a separate registry)

> 
> This registry is also used to generate metadata from the sessiond.
> The sessiond will now be responsible for generation of the metadata
> stream.
> 

Does this imply that the session daemon will need to keep track of the
global memory location of each application in order to consume
metadata streams?

Cheers!
David
-----BEGIN PGP SIGNATURE-----

iQEcBAEBCgAGBQJQUe3dAAoJEELoaioR9I024JUH/14NpMtoMxKR+Y+oNd9AH6TW
Wj23HMJDwfhbRK8T1Mz6oMI/jWkSLVrJxoB3fh3Tbx5dJBwePXrkD+Da5NqV7MMV
PEc3Hqx66YYNh9EcbkYkg/LJfEmc4XwLxvi4x8DA4LIFwG9SDzs0N/i+e6pWs/Dy
blUFq/Kk7t9Ah72DCzjnSqQU+plW8Nr9wuxjpFV7uiXYpMrQsArhtuengtXmv7+7
R+KfokIccnKXZURTdEPg5aZg1NRc4QnOl8CPjX/rcD64N32EllRIIdoLFjHAxens
aGsv2U19/53J8nvMl93qswKWNyvt59yr3a7uobKnvmSZMMSyTgApZpxqaB9Y41I=
=cN91
-----END PGP SIGNATURE-----