[ltt-dev] [RFC UST and LTTNG] Daemon model proposal v0.3

Mathieu Desnoyers mathieu.desnoyers at efficios.com
Mon Jan 31 14:06:34 EST 2011


* David Goulet (david.goulet at polymtl.ca) wrote:
> Everything about a per-user ltt-sessiond will be added to the RFC!
>
> Comments below.
>
> On 11-01-31 12:10 PM, Mathieu Desnoyers wrote:
>> * David Goulet (david.goulet at polymtl.ca) wrote:
>>> This unique identifier SHOULD be a unique string consisting of the username, a
>>> session name and the number of already created sessions plus one. The last
>>> number makes sure that the ID is unique in the ltt-sessiond context. In the
>>> implementation, a simple number could be used for performance purposes instead
>>> of comparing strings at each iteration.
>>>
>>> Ex: Username - "dave", Session name - "mysession", Num. sessions - fifth one
>>>      -->  ID: dave.mysession-5
>>
>> We will run into problems across reboots: session IDs might be reused, thus
>> trying to overwrite older traces. Other ideas, possibly including UUID, would be
>> welcome.
>
> Across reboots... I don't see the problem with reusing IDs... Older traces
> could only be overwritten if the session ID were used in the trace path on
> disk, which is not the case.

Then what do you propose using as directory name to hold the trace on disk ?
Using the session name would seem natural.

>
>>
>> My recommendation would be to allow two schemes:
>>
>> 1 - allow the user to specify the trace session name.
>>      (fails if the trace session name is already taken, or if the target trace is
>>      already there)
>
> Absolutely, session name is user defined unless not specified.
>
>> 2 - automatically assign trace session names with:
>>      username.UUID
>>
>> Thoughts ?
>
> I used the username in order to identify the session on a per-user
> basis. Then, with the session name and the number of sessions making up
> the rest of the ID string, it seems to me that no collision will happen...
> unless I'm missing something.. ?

See remark above for the directory name on disk.
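
To make the two schemes concrete, here is a rough Python sketch. It assumes the
session name is used directly as the directory name under some trace root; the
function names (auto_session_name, claim_trace_dir) are made up for
illustration and are not part of the RFC.

```python
import os
import uuid


def auto_session_name(username):
    """Scheme 2: username.UUID, unique across reboots without a counter."""
    return "%s.%s" % (username, uuid.uuid4())


def claim_trace_dir(trace_root, session_name):
    """Scheme 1: create the trace directory for a user-chosen name.

    os.mkdir() raises FileExistsError when the directory already exists,
    so an older trace is never silently overwritten.
    """
    path = os.path.join(trace_root, session_name)
    os.mkdir(path)
    return path
```

With a per-boot counter, "dave.mysession-5" can come back after a reboot and
collide with an existing directory; with mkdir-or-fail plus UUID fallback it
cannot.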

>
>>
>>> This daemon is also responsible for tracing buffer creation. Two main reasons
>>> motivate this design:
>>>
>>>      * The shared memory segment needs to be in the tracing group. We can only
>>>      assume that ltt-sessiond, being root (UID = 0), is always able to do that.
>>
>> As we discussed, I think this argument does not hold: any user can set the file
>> GID to "tracing", right ?
>
> Yeah you are right... Forgot that one. The shm credentials can be set  
> arbitrarily by any user. I will remove this reason.
>
>>
>>>
>>>      * The ltt-sessiond needs to keep track of all the shared memory segments in
>>>      order to be able to give reference to any other consumer since it is this
>>>      daemon that controls the access to all tracing buffers.
>>
>> I'm not convinced that ltt-sessiond needs to _create_ these buffers with this
>> argument, only that it needs some way to know what they are and to be able to
>> pass a reference to them to the consumers.
>
> Well, I think it's an important reason since this is the only way we  
> have to assign already created buffers to another consumer...

Maybe the most straightforward way, but not "the only" way: we could think of a
scheme where the inprocess library creates the buffers, and passes the file
descriptors to the sessiond through sockets. But it's an extra step that would
be good to avoid -- and my argument below supports the
session-registry-allocation of buffers.
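
For reference, the descriptor-passing alternative would look roughly like the
sketch below. It uses SCM_RIGHTS ancillary data over a Unix socket. Both ends
live in one process here (a socketpair) only to keep the example
self-contained, and a temporary file stands in for the shm buffer; send_fd,
recv_fd and demo are illustrative names, not existing UST or LTTng functions.

```python
import array
import mmap
import os
import socket
import tempfile


def send_fd(sock, fd):
    """Send one file descriptor as SCM_RIGHTS ancillary data."""
    fds = array.array("i", [fd])
    # At least one byte of regular data must accompany the ancillary data.
    sock.sendmsg([b"F"], [(socket.SOL_SOCKET, socket.SCM_RIGHTS, fds)])


def recv_fd(sock):
    """Receive one file descriptor sent with send_fd()."""
    fds = array.array("i")
    msg, ancdata, flags, addr = sock.recvmsg(1, socket.CMSG_LEN(fds.itemsize))
    for level, ctype, data in ancdata:
        if level == socket.SOL_SOCKET and ctype == socket.SCM_RIGHTS:
            fds.frombytes(data[:fds.itemsize])
            return fds[0]
    raise RuntimeError("no file descriptor received")


def demo():
    """The 'application' creates a buffer and hands it to the 'sessiond'."""
    app_sock, sessiond_sock = socket.socketpair(socket.AF_UNIX,
                                                socket.SOCK_STREAM)
    with tempfile.TemporaryFile() as buf:
        buf.truncate(4096)
        app_map = mmap.mmap(buf.fileno(), 4096)
        app_map[:5] = b"event"
        send_fd(app_sock, buf.fileno())
        received = recv_fd(sessiond_sock)
    # The received descriptor stays valid after the sender closes its own.
    consumer_map = mmap.mmap(received, 4096)
    data = bytes(consumer_map[:5])
    consumer_map.close()
    app_map.close()
    os.close(received)
    app_sock.close()
    sessiond_sock.close()
    return data
```

It works, but every buffer costs one more round trip than having the session
registry allocate it in the first place.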

>
>>
>> However, the following point might point us towards the requirement for
>> allocating the shared-mem buffers (shm) from the registry: if we ever want to
>> share the tracing buffers between all userspace applications (on a system where
>> security is less of a concern, but where scalability and performance are utterly
>> important), allocating them from the session registry will allow that very
>> easily, but if we allocate them from within the inprocess lib, then we will need
>> to redesign the whole thing.
>
> I'll add this.
>
>>
>>>
>>> A tracing session is associated to a user of the system (UID). Unless a user is
>>> in the tracing group, no other users can access the tracing data for that
>>> session.
>>
>> This is unclear, and does not seem to match our discussion. The "write" and
>> "read" accesses to the buffers and control pipes are not clearly specified. I
>> would rather say:
>>
>
> Agree, I will clarify all of this in a Security section to make  
> everything clearer on that aspect.
>
>>> in order to trace the kernel, you are either root (UID=0) or in the tracing
>>> group.
>>
>> Tracing applications should also be allowed from root user (not just tracing
>> group).
>
> Yep, but that section is about "ltt-consumerd" only, which is why there
> is no mention of tracing applications :)

I guess I lost track of the context at this point then. Recalling that this
applies specifically to ltt-consumerd might be good.

>
>>
>> It should also check for its own user's ltt-sessiond named pipe (possibly in
>> /tmp/ltt-sessiond-username or /HOME/.ltt-sessiond ?).
>>
>> Cleanup of the named pipe when the sessiond dies is a very important aspect to
>> test, because we have to keep in mind that if the pipe is left there when there
>> is no sessiond active, the applications might get stuck. I would personally try
>> not to rely on the presence of the named pipe alone to indicate that sessiond is
>> there: if the named pipe exists and has the correct user, we should try to talk
>> to the sessiond (and fail if it does not respond), rather than waiting
>> endlessly. I'm afraid this is the only solid way to deal with improper cleanup
>> of ltt-sessiond (which is something that might always happen, and which should
>> not hang all the traced applications).
>>
>> Same thing when we start a ltt-sessiond: if the named pipe already exists, we
>> should check if it is associated with an actual sessiond that responds to
>> commands.
>
> Yes. Very good point. I'll clarify.
>
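
A minimal sketch of the "do not trust the pipe's presence alone" check, in
Python for brevity. It only covers the first step: a non-blocking open of the
FIFO for writing fails with ENXIO when no process has it open for reading, so
a stale pipe is detected immediately instead of blocking. The owner check and
the actual command/response exchange with a timeout are left out, and
sessiond_is_listening is a hypothetical helper name.

```python
import errno
import os


def sessiond_is_listening(fifo_path):
    """Return True only if some process has the FIFO open for reading.

    Opening a FIFO write-only with O_NONBLOCK fails right away with ENXIO
    when there is no reader, instead of blocking forever on a stale pipe.
    """
    try:
        fd = os.open(fifo_path, os.O_WRONLY | os.O_NONBLOCK)
    except OSError as e:
        if e.errno in (errno.ENXIO, errno.ENOENT):
            return False
        raise
    os.close(fd)
    return True
```

A reader being present still does not prove that it is a responsive sessiond,
so the real check would go on to send a command and time out on the answer.
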
>>> ltt-sessiond registry daemon for every trace action needed by the user.
>>
>> Actually, we might want to allow various applications to do tracing, drtrace
>> being the main one. If we modularize the pieces of drtrace into various
>> libraries, it will be easier to re-use for others.
>
> Yep.
>
>>> Scenario 2 - Single user tracing already running app_1
>>>
>>> 1) drtrace asks ltt-sessiond for a new session through a Unix socket. If
>>> allowed, ltt-sessiond returns a session ID to the client.
>>>
>>> +-----------+    ops     +--------------+
>>> | drtrace A |<---------->| ltt-sessiond |
>>> +-----------+            +--------------+
>>>                             ^
>>>          +-------+   read   |
>>>          | app_1 |----------+
>>>          +-------+
>>>
>>> NOTE: At this stage, since app_1 is already running, the registration of app_1
>>> to ltt-sessiond has already been done. However, the shared memory segment is
>>> not allocated until a trace session is initiated. Having no shared memory,
>>> the inprocess library of app_1 will wait on a named pipe connected to
>>> ltt-sessiond for the reference.
>>
>> I would argue that even if the app has a set of buffers being used by an active
>> tracing session, it should still wait on its named pipe for ltt-sessiond
>> connections for two reasons:
>>
>> - Connections may come from the per-user sessiond (local ltt-sessiond)
>> - We might create another concurrent tracing session, which can ask for yet
>>    another set of tracing buffers to be created. It seems more convenient to use
>>    the same communication scheme regardless of whether there are 0 or 1 active
>>    tracing sessions.
>>
>
> If I understand you correctly, you are saying that app_1 should
> ALWAYS wait on the ltt-sessiond named pipe? If so, that is already the
> case. I will add some clarification here.

Yes, I would say that the application should always be waiting for new tracing
sessions to come up (from any of the global or per-user ltt-sessiond), and this
should be the exact same mechanism whether there is already a set of buffers
actively tracing or not (no behavior difference between 0 -> 1 session and N ->
N+1 sessions). It will make testing much easier.
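
As a toy illustration of what "same mechanism" means for the inprocess
library's bookkeeping (the class and method names are invented for this
sketch, and the buffer reference is just an opaque value here):

```python
class TracedApp:
    """Toy model of the in-process library's session bookkeeping."""

    def __init__(self):
        self.buffers = {}  # session name -> buffer reference

    def on_session_request(self, session_name, buffer_ref):
        """Handle a new-session message from any ltt-sessiond.

        The same code path runs whether this is the first session or the
        (N+1)th: there is no special case for an empty table.
        """
        if session_name in self.buffers:
            raise ValueError("session already attached: %s" % session_name)
        self.buffers[session_name] = buffer_ref
        return len(self.buffers)
```

A single test that attaches several sessions in a row then exercises the
0 -> 1 and N -> N+1 transitions through one code path.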

> That was a hell of a reply! Cheers! :)

You're welcome!

Mathieu

> David
>
> -- 
> David Goulet
> LTTng project, DORSAL Lab.
>
> PGP/GPG : 1024D/16BD8563
> BE3C 672B 9331 9796 291A  14C6 4AF7 C14B 16BD 8563

-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com



