[lttng-dev] Some confusion about cpu usage of the lttng-consumerd process

Jonathan Rajotte-Julien jonathan.rajotte-julien at efficios.com
Fri Nov 27 11:04:23 EST 2020


> From: "熊毓华" <xiongyuhua at zju.edu.cn>
> To: "Jonathan Rajotte-Julien" <jonathan.rajotte-julien at efficios.com>,
> "lttng-dev" <lttng-dev at lists.lttng.org>
> Sent: Friday, November 27, 2020 10:32:07 AM
> Subject: Re: Re: [lttng-dev] Some confusion about cpu usage of the
> lttng-consumerd process

> Hi,Dear.

Side note, you can remove the "Dear" here. ;) 

> The test script was used to generate some common fileIO,netIO events.

Please provide a complete code repository if possible, so that we at least have a baseline for reproduction.

> On all servers, the monitoring strategy I set up when I start lttng is the same,
> monitoring all fileIO, netIO and some related system calls.
> The following table records the amount of events generated by the test script
> per minute, and one babeltrace record represents one event.

For some reason the image does not load here. Please provide a text-based alternative for this figure.

> The numbers are in units of ten thousand events per minute, and they were
> read out after parsing by babeltrace.
> In addition, server1 is 1core4G, server2 is 2core8G, server3 is 4core16G, and
> server4 and server5 are 8core16G.
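
For readers without the figure, the unit conversion described above can be sketched as follows. This is only an illustration: the function names are hypothetical, and it assumes the babeltrace text output has one line per event, as stated in the quoted message.

```python
def count_events(lines) -> int:
    """Count non-empty lines, assuming one text line per recorded event."""
    return sum(1 for line in lines if line.strip())


def events_per_minute_in_ten_thousands(event_count: int, duration_seconds: float) -> float:
    """Convert a raw event count into units of 10,000 events per minute."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    per_minute = event_count * 60.0 / duration_seconds
    return per_minute / 10_000
```

For example, 1,200,000 events collected over 120 seconds is 600,000 events per minute, which would be reported as 60 in the table's unit.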

> It can be seen that the average amount of data generated per minute on all
> servers is roughly the same. However, the CPU usage of the lttng-consumerd
> process behaves differently on server4 and server5, as I mentioned in my last
> email.

> In addition, the CPU usage was recorded using the "top" command.
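
To make the measurement reproducible without reading values off the screen, a per-process figure comparable to what "top" shows can be derived from two samples of /proc/<pid>/stat. The sketch below is an assumption about how one might do this on Linux, not something taken from the test setup; the helper names are hypothetical. The percentage is relative to a single CPU, so a value of 10 means 10% of one core.

```python
import os


def parse_cpu_ticks(stat_line: str) -> int:
    """Return utime + stime (in clock ticks) from one /proc/<pid>/stat line."""
    # The command name (field 2) is wrapped in parentheses and may contain
    # spaces, so split on the last ')' before splitting on whitespace.
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state); utime is field 14 and stime is field 15.
    return int(rest[11]) + int(rest[12])


def cpu_percent(ticks_before: int, ticks_after: int,
                elapsed_seconds: float, ticks_per_second: int) -> float:
    """CPU usage over the interval, as a percentage of one CPU."""
    return 100.0 * (ticks_after - ticks_before) / (ticks_per_second * elapsed_seconds)


def read_cpu_ticks(pid: int) -> int:
    """Read the current CPU tick count of a live process (Linux only)."""
    with open(f"/proc/{pid}/stat") as f:
        return parse_cpu_ticks(f.read())


def clock_ticks_per_second() -> int:
    """Kernel clock tick rate used for the utime/stime fields."""
    return os.sysconf("SC_CLK_TCK")
```

Sampling read_cpu_ticks() for the lttng-consumerd PID twice, a known number of seconds apart, and passing both values to cpu_percent() gives a number that can be pasted into an email as text.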

> My test concluded that, with the same number of events collected, the
> lttng-consumerd process needs to consume more CPU on the 8-core servers.

> I want to know why this is, and what other information you need.

Well, we also want to know why! You will understand that although we develop lttng, we do not always have a quick and easy answer to every problem. Performance-related problems are always tricky.
We also have to keep in mind that we do not necessarily optimize for low CPU usage on the lttng-consumerd side.

We have to take a look at what "work" scales with the number of CPUs on the lttng-consumerd side. One such thing is the live timer, which fires at a fixed interval (default is 1s (1000000us)).

You could test this hypothesis by streaming the trace instead of using the live feature. 

lttng create --set-url .... 
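
As a rough illustration of why timer-driven work could grow with the core count, here is a toy model. It is a sketch of the hypothesis only, not a description of the actual lttng-consumerd implementation: it assumes one stream per CPU per channel (per-CPU buffers) and that every timer fire visits every stream. The function name and parameters are hypothetical.

```python
def stream_checks_per_second(n_cpus: int, n_channels: int, timer_period_us: int) -> float:
    """Estimate how many per-stream visits a periodic timer causes per second.

    Assumes one stream per CPU per channel and one visit per stream per fire.
    """
    if timer_period_us <= 0:
        raise ValueError("timer_period_us must be positive")
    fires_per_second = 1_000_000 / timer_period_us
    return n_cpus * n_channels * fires_per_second
```

Under these assumptions, a single channel with the default 1000000us period gives 1 visit per second on a 1-core machine and 8 on an 8-core machine, regardless of how many events are produced. Whether this accounts for the observed difference is exactly what the streaming test would help confirm or rule out.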

Cheers 

> Looking forward to your reply.
> thanks,
> yuhua.
> > -----Original Message-----
> > From: "Jonathan Rajotte-Julien" < jonathan.rajotte-julien at efficios.com >
> > Sent: 2020-11-27 22:05:48 (Friday)
> > To: "熊毓华" < xiongyuhua at zju.edu.cn >
> > Cc: lttng-dev at lists.lttng.org
> > Subject: Re: [lttng-dev] Some confusion about cpu usage of the lttng-consumerd
> > process

> > Hi,

> > On Fri, Nov 27, 2020 at 02:39:28PM +0800, 熊毓华 via lttng-dev wrote:
> > > Hi,dear.

> > > I have been using lttng to monitor my server these days, but I found something
> > > interesting.

> > > The cpu usage of lttng varies with the number of cpu cores of the server.

> > Which is a bit expected since more CPU means more "data" source from the point
> > of view of lttng hence more "work" overall.


> > > On the server, I create a tracing session in live mode, using "lttng create
> > > my-session --live".

> > > Then, I start babeltrace2 and configure it to connect to the relay
> > > daemon, using "--input-format=lttng-live" mode.

> > > I used 5 cloud servers: 1core4G, 2core8G, 4core16G, 8core16G, and 8core16G.

> > > The same test script was executed on each of them to provide the same workload.

> > We would need the test script to have some context on the workload here.


> > > As we all know, lttng has 5 processes:

> > > 1. lttng-runas --daemonize

> > > 2. lttng-runas -k --consumerd-cmd-sock /var/run/lttng/kconsumerd/command
> > > --consumerd-err-sock /var/run/lttng/kconsumerd/error --group tracing

> > Based on this you are performing kernel tracing.


> > > 3.lttng-sessiond --daemonize

> > > 4.lttng-relayd -L tcp://localhost:5344

> > > 5. lttng-consumerd -k --consumerd-cmd-sock /var/run/lttng/kconsumerd/command
> > > --consumerd-err-sock /var/run/lttng/kconsumerd/error --group tracing


> > > The CPU usage of the first four processes is below 2% on the 5 servers, but the
> > > lttng-consumerd process is different.

> > > On 1-core, 2-core, and 4-core servers, the CPU usage of the lttng-consumerd
> > > process is below 2%.

> > How is the cpu usage measured here?


> > > But on two 8-core machines, the CPU usage of the lttng-consumerd process reached
> > > 10% or more.

> > Consumerd is responsible for "fetching" data from the ring buffers and "saving"
> > it either locally (trace on disk) or remotely (streaming/live session). CPU
> > usage should be a bit correlated with the event production rate. Did you have
> > a look at the number of events generated for a similar interval?

> > > Also, the CPU usage of the babeltrace process is not much different; only the
> > > CPU usage of the lttng-consumerd process varies with the number of CPU cores
> > > of the server.

> > > Why is it like this? How should this phenomenon be analyzed?

> > > Looking forward to your reply.

> > > thanks,
> > > yuhua

> > > _______________________________________________
> > > lttng-dev mailing list
> > > lttng-dev at lists.lttng.org
> > > https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev


> > --
> > Jonathan Rajotte-Julien
> > EfficiOS