[lttng-dev] tracing multithread user program and API support for enabling/disabling events and for adding/removing context fields

Yonghong Yan yanyh15 at gmail.com
Thu Jan 10 18:33:50 EST 2019


Mathieu,

Thank you for your comments. I have another issue with tracing a
multi-threaded runtime: events are lost at the end of a thread's
execution. I have been debugging the code for two days and could not find
the reason. After the tracepoint for the lost events is triggered, the
thread is suspended using pthread_cond_wait. I know the tracepoint was
triggered because I confirmed it with printf. What could cause these events
to be lost? Any details about how a tracepoint is triggered and how the
event record is sent to the buffer would help me debug. For example, is a
signal handler used to send event records asynchronously? I used babeltrace
to list the events.

The code is too complicated, so I will try to reproduce the problem with a
small program.
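
A minimal sketch of what such a small reproducer could look like, assuming
each worker thread emits one final event and then parks itself in
pthread_cond_wait. Everything here is hypothetical: TRACE_LAST_EVENT is a
stand-in macro that only counts and prints (in an instrumented build it
would be replaced by the real lttng-ust tracepoint call), and
run_reproducer, worker, and MAX_THREADS are illustrative names, not taken
from the actual runtime.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

#define MAX_THREADS 64

/* Counts how many "last events" were emitted across all threads. */
static atomic_int emitted_events;

/* Hypothetical stand-in for the real tracepoint: counts and prints only. */
#define TRACE_LAST_EVENT(id)                                  \
    do {                                                      \
        atomic_fetch_add(&emitted_events, 1);                 \
        fprintf(stderr, "thread %d: last event\n", (id));     \
    } while (0)

static pthread_mutex_t park_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t wake_cond = PTHREAD_COND_INITIALIZER;
static pthread_cond_t all_parked = PTHREAD_COND_INITIALIZER;
static int shutting_down;
static int parked;

static void *worker(void *arg)
{
    int id = *(int *)arg;

    /* The event that goes missing in the real runtime. */
    TRACE_LAST_EVENT(id);

    /* Suspend the thread right after the event, as the runtime does. */
    pthread_mutex_lock(&park_lock);
    parked++;
    pthread_cond_signal(&all_parked);
    while (!shutting_down)
        pthread_cond_wait(&wake_cond, &park_lock);
    pthread_mutex_unlock(&park_lock);
    return NULL;
}

/* Runs nthreads workers; returns the number of events emitted, or -1. */
int run_reproducer(int nthreads)
{
    pthread_t tids[MAX_THREADS];
    int ids[MAX_THREADS];
    int created = 0;

    if (nthreads < 1 || nthreads > MAX_THREADS)
        return -1;

    atomic_store(&emitted_events, 0);
    shutting_down = 0;
    parked = 0;

    for (int i = 0; i < nthreads; i++) {
        ids[i] = i;
        if (pthread_create(&tids[i], NULL, worker, &ids[i]) != 0)
            break;
        created++;
    }

    /* Wait until every worker is suspended, then release them all. */
    pthread_mutex_lock(&park_lock);
    while (parked < created)
        pthread_cond_wait(&all_parked, &park_lock);
    shutting_down = 1;
    pthread_cond_broadcast(&wake_cond);
    pthread_mutex_unlock(&park_lock);

    for (int i = 0; i < created; i++)
        pthread_join(tids[i], NULL);

    return atomic_load(&emitted_events);
}
```

Calling run_reproducer(4) from main() (built with -pthread) gives a count
of emitted events that can be compared with the number of records
babeltrace lists for the same run.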

Thank you
Yonghong


On Thu, Dec 20, 2018 at 4:23 PM Mathieu Desnoyers <
mathieu.desnoyers at efficios.com> wrote:

> It will impact tracing of _all_ threads of _all_ processes tracked by the
> targeted tracing session.
> "lttng_enable_event()" is by no means a "fast" operation. It is a tracing
> control operation meant to be performed outside of fast paths.
>
> Changing the design of LTTng from per-cpu to something else would be a
> significant endeavor.
>
> Thanks,
>
> Mathieu
>
>
> ----- On Dec 20, 2018, at 3:27 PM, Yonghong Yan <yanyh15 at gmail.com> wrote:
>
>
> Apologies for the wrong terms. Let me ask in another way: I have
> multithreaded code, and if a thread calls lttng_enable_event(...), will it
> impact only the calling thread, or the threads spawned after that call, or
> all the threads of the process?
>
> Got your answer about the vtid context. It is similar to what I am doing.
> We want to analyze the behavior of all user threads. In the current LTTng,
> we need to have the vtid field in each event even though thread migration
> is rare, and when analyzing the traces we need to check each record and
> sort the traces by vtid. This impacts the performance of both tracing and
> analysis. If I want to change how traces are fed to the buffers in LTTng,
> how complicated would it be? I am guessing I would need to at least
> replace sched_getcpu with vtid (or something similar, so that all the user
> threads are numbered from 0), and/or bind the ring buffer to the user
> thread, and more.
>
> Yonghong
>
>
> On Thu, Dec 20, 2018 at 2:49 PM Mathieu Desnoyers <
> mathieu.desnoyers at efficios.com> wrote:
>
>> Hi,
>>
>> Can you define what you mean by "per-user-thread tracepoint" and
>> "whole-user-process"? AFAIK those concepts don't appear anywhere in the
>> LTTng documentation.
>>
>> Thanks,
>>
>> Mathieu
>>
>> ----- On Dec 19, 2018, at 6:06 PM, Yonghong Yan <yanyh15 at gmail.com>
>> wrote:
>>
>> I have another question about lttng_enable_event(): will using this API
>> impact the per-user-thread tracepoint or the whole-user-process? I am
>> thinking of the whole process, but want to confirm.
>>
>> Yonghong
>>
>>
>> On Wed, Dec 19, 2018 at 4:20 PM Mathieu Desnoyers <
>> mathieu.desnoyers at efficios.com> wrote:
>>
>>> Hi Yonghong,
>>>
>>> ----- On Dec 19, 2018, at 1:19 PM, Yonghong Yan <yanyh15 at gmail.com>
>>> wrote:
>>>
>>> We are experimenting with LTTng for tracing multi-threaded programs, and
>>> it works very well for us. Thank you for this great tool. But we have
>>> some concerns about the overhead and scalability of the tracing. Could
>>> you share some insight into the following questions?
>>> 1. According to the LTTng documentation, the session daemon communicates
>>> with the user application via a Unix domain socket. Is the communication
>>> frequent, e.g. does each event require communication, or does it happen
>>> only at the beginning, to configure user space tracing?
>>>
>>> This Unix socket is only for "control" of tracing (infrequent
>>> communication). The high-throughput tracing data goes through a shared
>>> memory map (per-cpu buffers).
>>>
>>> 2. Does the consumer daemon have a thread per CPU/channel to write to
>>> disk or relay the traces, or is it a single-threaded process handling all
>>> the channels and ring buffers, which could become a bottleneck if a large
>>> number of user threads are all feeding traces?
>>>
>>> Each consumer daemon is a single thread at the moment. It could be
>>> improved by implementing a multithreaded design in the future. It should
>>> help especially in NUMA setups, where having the consumer daemon on the
>>> same NUMA node as the ring buffer it reads from would minimize the amount
>>> of remote NUMA accesses.
>>>
>>> Another point is cases where I/O is performed to various target
>>> locations (different network interfaces or disks). When all I/O goes
>>> through the same interface, the bottleneck becomes the block device or the
>>> network interface. However, for scenarios involving many network interfaces
>>> or block devices, then multithreading the consumer daemon could become
>>> useful.
>>>
>>> This has not been a priority for anyone so far though.
>>>
>>> 3. In the one channel/ring buffer per CPU setting, if a user thread
>>> migrates from one CPU to another, are the traces generated by that user
>>> thread fed to the two channels/buffers for the two CPUs?
>>>
>>> The event generated will belong to the per-cpu buffer on which the
>>> "sched_getcpu()" invocation occurs for the event. It is only saved into a
>>> single per-cpu buffer, even if the thread is migrated to a different CPU
>>> before it completes writing the event. This effectively creates infrequent
>>> situations where threads write into other cpu's per-cpu buffers. Note that
>>> the "reserve" and "commit" operations are smp-safe in lttng-ust for that
>>> reason.
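
As a rough illustration of the per-CPU buffer selection described above,
here is a simplified sketch. It is not the lttng-ust implementation: a
plain per-CPU counter stands in for the ring buffer and its "reserve" and
"commit" steps, and record_event, total_events, per_cpu_events, and
MAX_CPUS are hypothetical names. The point it shows is that the buffer is
chosen once from sched_getcpu(), so a thread that migrates afterwards still
writes to the buffer of the CPU it started on, and the write therefore has
to be atomic.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <stdatomic.h>

#define MAX_CPUS 1024

/* One slot per CPU; a counter stands in for a per-CPU ring buffer. */
static atomic_ulong per_cpu_events[MAX_CPUS];

/* Records one event; returns the CPU index used, or -1 on failure. */
int record_event(void)
{
    /* The target buffer is fixed by the CPU observed at this point. */
    int cpu = sched_getcpu();

    if (cpu < 0 || cpu >= MAX_CPUS)
        return -1;

    /*
     * The thread may be migrated right here. It still writes to slot
     * `cpu`, possibly concurrently with a thread now running on that
     * CPU, which is why the update is an atomic operation.
     */
    atomic_fetch_add(&per_cpu_events[cpu], 1);
    return cpu;
}

/* Sums the events recorded across all per-CPU slots. */
unsigned long total_events(void)
{
    unsigned long total = 0;

    for (int i = 0; i < MAX_CPUS; i++)
        total += atomic_load(&per_cpu_events[i]);
    return total;
}
```

Because events from one thread can land in different per-CPU slots, any
per-thread view has to be rebuilt afterwards from information stored in
the events themselves, such as a thread ID context field.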
>>>
>>> 4. So far, events can be enabled and disabled from the command line.
>>> Are you considering runtime options (APIs) to enable or disable certain
>>> events? Or is this a feature that already exists or can be implemented
>>> in a different way?
>>>
>>> We already expose a public LGPLv2.1 API for this. See lttng-tools:
>>>
>>> include/lttng/event.h: lttng_enable_event()
>>>
>>> It is implemented by liblttng-ctl.so
>>>
>>> 5. According to the documentation, context fields cannot be removed
>>> from a channel once they are added. I would like to request a feature
>>> that allows removing context fields from within the user program.
>>>
>>> That's unfortunately not that simple. The channel context field belongs
>>> to the channel, which maps to the "stream" description in the resulting CTF
>>> metadata in the output trace. That stream description is invariant once it
>>> has been created.
>>>
>>> So currently the only way to remove a context would be to destroy your
>>> tracing session and create a new one.
>>>
>>> Thanks for your interest in LTTng!
>>>
>>> Mathieu
>>>
>>>
>>>
>>> Thank you very much.
>>> Yonghong
>>>
>>>
>>>
>>> _______________________________________________
>>> lttng-dev mailing list
>>> lttng-dev at lists.lttng.org
>>> https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev
>>>
>>>
>>> --
>>> Mathieu Desnoyers
>>> EfficiOS Inc.
>>> http://www.efficios.com
>>>