[lttng-dev] tracing multithread user program and API support for enabling/disabling events and for adding/removing context fields

Mathieu Desnoyers mathieu.desnoyers at efficios.com
Thu Dec 20 16:23:02 EST 2018


It will impact tracing of _all_ threads of _all_ processes tracked by the targeted tracing session.
"lttng_enable_event()" is by no means a "fast" operation. It is a tracing control operation meant to
be performed outside of fast-paths.
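
For illustration, here is a minimal control-path sketch using this public
liblttng-ctl API. The session name "my-session" and the event name
"my_provider:my_event" are hypothetical; error handling is trimmed. Link
with -llttng-ctl.

#include <string.h>
#include <lttng/lttng.h>

int main(void)
{
	struct lttng_domain domain;
	struct lttng_handle *handle;
	struct lttng_event ev;

	memset(&domain, 0, sizeof(domain));
	domain.type = LTTNG_DOMAIN_UST;	/* user-space tracing */

	/* "my-session" must already exist (e.g. created with lttng create). */
	handle = lttng_create_handle("my-session", &domain);
	if (!handle)
		return 1;

	memset(&ev, 0, sizeof(ev));
	ev.type = LTTNG_EVENT_TRACEPOINT;
	strcpy(ev.name, "my_provider:my_event");	/* hypothetical event */

	/*
	 * Session-wide effect: this enables the event for all threads of
	 * all processes tracked by "my-session", not only the caller.
	 * A NULL channel name targets the domain's default channel.
	 */
	lttng_enable_event(handle, &ev, NULL);

	lttng_destroy_handle(handle);
	return 0;
}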

Changing the design of LTTng from per-cpu to something else would be a significant endeavor. 

Thanks, 

Mathieu 

----- On Dec 20, 2018, at 3:27 PM, Yonghong Yan <yanyh15 at gmail.com> wrote: 

> Apologies for the wrong terms. I will ask in another way: I have multithreaded
> code; if a thread calls lttng_enable_event(...), will it impact only the
> calling thread, the threads spawned after that call, or all the threads of
> the process?

> Got your answer about the vtid context. It is similar to what I am doing. We want to
> analyze the behavior of all user threads. In the current LTTng, we need to have that
> vtid field for each event even though it is a rare situation that a thread migrates,
> and when analyzing the traces we need to check each record and sort the
> traces according to the vtid. This impacts the performance of both tracing and
> analysis. If I want to change the way traces are fed to the buffer in
> LTTng, how complicated will it be? I am guessing I will need to at least
> replace sched_getcpu with the vtid (or the like, so all the user threads are
> numbered from 0), and/or have the ring buffer bound to the user thread, and more.

> Yonghong
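
For reference, the vtid context discussed here is attached per channel
through the same liblttng-ctl control API. A hedged sketch, reusing a
handle created as in the earlier example:

#include <string.h>
#include <lttng/lttng.h>

/* Attach the vtid context to a channel's events; link with -llttng-ctl. */
int add_vtid_context(struct lttng_handle *handle)
{
	struct lttng_event_context ctx;

	memset(&ctx, 0, sizeof(ctx));
	ctx.ctx = LTTNG_EVENT_CONTEXT_VTID;

	/*
	 * NULL event and channel names apply the context to all channels
	 * of the handle's domain. Once added, a context cannot be removed
	 * (see the discussion of question 5 below).
	 */
	return lttng_add_context(handle, &ctx, NULL, NULL);
}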

> On Thu, Dec 20, 2018 at 2:49 PM Mathieu Desnoyers
> <mathieu.desnoyers at efficios.com> wrote:

>> Hi,

>> Can you define what you mean by "per-user-thread tracepoint" and
>> "whole-user-process"? AFAIK
>> those concepts don't appear anywhere in the LTTng documentation.

>> Thanks,

>> Mathieu

>> ----- On Dec 19, 2018, at 6:06 PM, Yonghong Yan <yanyh15 at gmail.com> wrote:

>>> Got another question about lttng_enable_event(): will using this API impact a
>>> per-user-thread tracepoint or the whole user process? I am thinking the
>>> whole process, but want to confirm.

>>> Yonghong

>>> On Wed, Dec 19, 2018 at 4:20 PM Mathieu Desnoyers
>>> <mathieu.desnoyers at efficios.com> wrote:

>>>> Hi Yonghong,

>>>> ----- On Dec 19, 2018, at 1:19 PM, Yonghong Yan <yanyh15 at gmail.com> wrote:

>>>>> We are experimenting with LTTng for tracing a multi-threaded program, and it works
>>>>> very well for us. Thank you for this great tool. But we have some concerns
>>>>> about the overhead and scalability of the tracing. Could you share some insight
>>>>> into the following questions?
>>>>> 1. From the LTTng documentation, the session daemon communicates with the user
>>>>> application via a Unix domain socket. Is the communication frequent, e.g. does
>>>>> each event require communication, or does the communication just happen at the
>>>>> beginning to configure user-space tracing?

>>>> This Unix socket is only for "control" of tracing (infrequent communication).
>>>> The high-throughput tracing data goes through a shared memory map (per-cpu
>>>> buffers).

>>>>> 2. For the consumer daemon, does it have a thread per CPU/channel
>>>>> to write to disk or relay the traces, or is it a single-threaded process
>>>>> handling all the channels and ring buffers, which could become a bottleneck if
>>>>> we have a large number of user threads all feeding traces?

>>>> Each consumer daemon is a single thread at the moment. It could be improved by
>>>> implementing a multithreaded design in the future. It should help especially in
>>>> NUMA setups, where having the consumer daemon on the same NUMA node as the ring
>>>> buffer it reads from would minimize the amount of remote NUMA accesses.

>>>> Another point is cases where I/O is performed to various target locations
>>>> (different network interfaces or disks). When all I/O goes through the same
>>>> interface, the bottleneck becomes the block device or the network interface.
>>>> However, for scenarios involving many network interfaces or block devices,
>>>> multithreading the consumer daemon could become useful.

>>>> This has not been a priority for anyone so far though.

>>>>> 3. In the one channel/ring buffer per CPU setting, if a user thread migrates
>>>>> from one CPU to another, are the traces generated by that user thread fed to
>>>>> the two channels/buffers for the two CPUs?

>>>> The event generated will belong to the per-cpu buffer on which the
>>>> "sched_getcpu()" invocation occurs for the event. It is only saved into a
>>>> single per-cpu buffer, even if the thread is migrated to a different CPU before
>>>> it completes writing the event. This effectively creates infrequent situations
>>>> where threads write into other CPUs' per-cpu buffers. Note that the "reserve"
>>>> and "commit" operations are smp-safe in lttng-ust for that reason.

>>>>> 4. So far, the events for tracing can be enabled and disabled from the command
>>>>> line. Are you considering runtime options (APIs) to enable or disable certain
>>>>> events? Or is this a feature that is already in, or can be implemented in, a
>>>>> different way?

>>>> We already expose a public LGPLv2.1 API for this. See lttng-tools:

>>>> include/lttng/event.h: lttng_enable_event()

>>>> It is implemented by liblttng-ctl.so.
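
The same header also exposes the disable counterpart, so events can be
toggled at runtime. A minimal sketch, reusing a handle created as in the
earlier example (the event name is hypothetical):

#include <lttng/lttng.h>

/* Disable a single event at runtime; re-enable it later with
 * lttng_enable_event(). Link with -llttng-ctl. */
int pause_event(struct lttng_handle *handle)
{
	/* A NULL channel name targets the domain's default channel. */
	return lttng_disable_event(handle, "my_provider:my_event", NULL);
}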

>>>>> 5. For context fields: from the documentation, context fields cannot be removed
>>>>> from a channel once you add them. I would like to request a feature to allow
>>>>> removing context fields from the user program.

>>>> That's unfortunately not that simple. The channel context field belongs to the
>>>> channel, which maps to the "stream" description in the resulting CTF metadata
>>>> in the output trace. That stream description is invariant once it has been
>>>> created.

>>>> So currently the only way to remove a context would be to destroy your tracing
>>>> session and create a new one.
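
A hedged sketch of that workaround using liblttng-ctl (the session name and
output URL are hypothetical):

#include <lttng/lttng.h>

/*
 * "Removing" a context today means tearing the session down and building
 * a fresh one, then re-adding only the contexts still wanted.
 * Link with -llttng-ctl.
 */
int recreate_session(void)
{
	int ret;

	ret = lttng_destroy_session("my-session");
	if (ret < 0)
		return ret;

	/* The URL picks the output location for the new session. */
	return lttng_create_session("my-session", "file:///tmp/my-traces");
}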

>>>> Thanks for your interest in LTTng!

>>>> Mathieu

>>>>> Thank you very much.
>>>>> Yonghong


>>>> --
>>>> Mathieu Desnoyers
>>>> EfficiOS Inc.
>>>> http://www.efficios.com

>> --
>> Mathieu Desnoyers
>> EfficiOS Inc.
>> http://www.efficios.com

> _______________________________________________
> lttng-dev mailing list
> lttng-dev at lists.lttng.org
> https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev

-- 
Mathieu Desnoyers 
EfficiOS Inc. 
http://www.efficios.com 

