[lttng-dev] [EXTERNAL] Re: LTTng enable-event performance issue during active session

Kienan Stewart kstewart at efficios.com
Tue Oct 17 11:39:14 EDT 2023


Hi Zach,

the change to address the enable-event performance has been merged into 
both lttng-ust master and stable-2.13 if you would like to build from 
source before the next release.

thanks,
kienan


On 2023-10-09 06:29, Kramer, Zach wrote:
> Hi Kienan,
> 
> Thank you for the thorough explanation and proposed changeset.
> 
> My use case was to synchronize with a user-configurable set of events at runtime, which was easiest done individually.
> In the meantime I worked around the performance hit by ensuring all available events have already been created on the session before starting it.
> 
> Looking forward to the changeset.
> 
> Thanks,
> Zach
> 
> -----Original Message-----
> From: Kienan Stewart <kstewart at efficios.com>
> Sent: Friday, 6 October 2023 18:13
> To: lttng-dev at lists.lttng.org; Kramer, Zach <Zach.Kramer at cognex.com>
> Subject: [EXTERNAL] Re: [lttng-dev] LTTng enable-event performance issue during active session
> 
> Hi Zach,
> 
> there's a proposed change to lttng-ust which addresses the issue you are seeing, but it hasn't finished clearing review & testing yet.
> https://review.lttng.org/c/lttng-ust/+/11006
> 
> thanks,
> kienan
> 
> On 2023-10-05 18:04, Kienan Stewart via lttng-dev wrote:
>> Hi Zach
>>
>> apologies for the delay in replying to your question.
>>
>> On 2023-09-26 08:27, Kramer, Zach via lttng-dev wrote:
>>> Hi,
>>>
>>> I am observing a performance issue with regards to enabling events
>>> while a session is active and was wondering if this is expected.
>>>
>>> LTTng versions:
>>>
>>>    * lttng-tools: 2.13.9
>>>    * lttng-ust: 2.13.6
>>> Steps to reproduce:
>>>
>>>   1. Ensure many userspace tracepoints are available in `lttng list -u`
>>>      e.g. 100
>>>   2. Create a new session
>>>   3. Start session
>>>   4. Enable new events on session
>>>
>>> The time it takes to enable each new event has increasing cost e.g.
>>>
>>>   1. Event 1: 1ms
>>>   2. Event 100: 15ms
>>>   3. Event 1000: 150ms
>>>   4. àin total about 1.5 minutes to enable 1000 events
>>>
>>> If either:
>>>
>>>   1. No userspace tracepoints are available  2. Or session is not
>>> started until /after /the events are enabled
>>>
>>> Then the time it takes to enable new events is constant (e.g. 1ms).
>>>
>>> Below is a bash script demonstrating this behavior:
>>>
>>> # Pre-requisite: have many userspace tracepoints available
>>>
>>> lttng create foo
>>>
>>> lttng enable-channel -u -s foo bar
>>>
>>> lttng start foo
>>>
>>> total_t1=$(date +%s%3N);
>>>
>>> for i in {1..100}
>>>
>>> do
>>>
>>>           t1=$(date +%s%3N);
>>>
>>>           lttng enable-event -u lttng_iter_$i -s foo -c bar >
>>> /dev/null
>>>
>>>           t2=$(date +%s%3N);
>>>
>>>           echo "Event #$i took $((t2-t1)) ms"
>>>
>>> done
>>>
>>> total_t2=$(date +%s%3N);
>>>
>>> echo "----------------------------------------------------"
>>>
>>> echo "Enabling events on active session took $((total_t2-total_t1)) ms"
>>>
>>> echo "----------------------------------------------------"
>>>
>>> lttng destroy foo
>>>
>>> lttng create foo
>>>
>>> lttng enable-channel -u -s foo bar
>>>
>>> total_t1=$(date +%s%3N);
>>>
>>> for i in {1..100}
>>>
>>> do
>>>
>>>           t1=$(date +%s%3N);
>>>
>>>           lttng enable-event -u lttng_iter_$i -s foo -c bar >
>>> /dev/null
>>>
>>>           t2=$(date +%s%3N);
>>>
>>>           echo "Event #$i took $((t2-t1)) ms"
>>>
>>> done
>>>
>>> total_t2=$(date +%s%3N);
>>>
>>> echo "----------------------------------------------------"
>>>
>>> echo "Enabling events on inactive session took $((total_t2-total_t1)) ms"
>>>
>>> echo "----------------------------------------------------"
>>>
>>> lttng destroy foo
>>>
>>> Is this reproducible for you? Any insight is appreciated.
>>>
>>
>> I'm able to reproduce what you're seeing, and it's not completely
>> unexpected based on my understanding of the architecture of lttng.
>>
>> When a session is active and an event rule is enabled/disabled, the
>> sessiond will notify each of the registered applications of the event
>> rule state changes. This communication is going through either unix or
>> tcp sockets, and that portion of the protocol seems to struggle with
>> many small changes as the protocol to communicate the changes doesn't
>> support batching.
>>
>> Enabling multiple events in a single call reduces some of the
>> overhead, but the changes will still be communicated serially. This
>> means a single call activating thousands events can still take a long
>> time to process (eg. `lttng enable-event -s foo -c bar -u
>> tp0,tp1,...,tpN`)
>>
>> Using glob patterns or `--all` will be significantly faster as the UST
>> applications can digest those types of event rule changes with a
>> single set of messages. In cases where you want most but not all
>> events, flipping the logic to enable "*" but add events or patterns to
>> omit with "--exclude" is going to be a better strategy.
>>
>> Do you have cases where you need to activate many of events but which
>> couldn't be covered by using glob patterns and/or exclusions?
>>
>> thanks,
>> kienan
>>
>>> Many thanks,
>>>
>>> Zach
>>>
>>>
>>> _______________________________________________
>>> lttng-dev mailing list
>>> lttng-dev at lists.lttng.org
>   _______________________________________________
>> lttng-dev mailing list
>> lttng-dev at lists.lttng.org


More information about the lttng-dev mailing list