[lttng-dev] Some confusion about cpu usage of the lttng-consumerd process

熊毓华 xiongyuhua at zju.edu.cn
Sat Nov 28 01:49:28 EST 2020


Hi,

I put my test scripts in the attachment.

You can run the script directly. Create the trace session with the "--live" option on the 8-core server,

and you will see the CPU usage of the lttng-consumerd process reach 10% or more.




As for the streaming mode of lttng, I tested it before and it worked well.

When I create the trace session with "lttng create my-session --output=/tmp/my-kernel-trace" or with "lttng create my-session --set-url=net://ip",

the number of CPUs does not seem to affect the CPU usage of lttng-consumerd.

It seems that only live mode is affected.




thanks,

yuhua


-----Original Message-----
From: "Jonathan Rajotte-Julien" <jonathan.rajotte-julien at efficios.com>
Sent: 2020-11-28 00:04:23 (Saturday)
To: "熊毓华" <xiongyuhua at zju.edu.cn>
Cc: lttng-dev <lttng-dev at lists.lttng.org>
Subject: Re: [lttng-dev] Some confusion about cpu usage of the lttng-consumerd process




From: "熊毓华" <xiongyuhua at zju.edu.cn>
To: "Jonathan Rajotte-Julien" <jonathan.rajotte-julien at efficios.com>, "lttng-dev" <lttng-dev at lists.lttng.org>
Sent: Friday, November 27, 2020 10:32:07 AM
Subject: Re: Re: [lttng-dev] Some confusion about cpu usage of the lttng-consumerd process

Hi,Dear.


Side note, you can remove the "Dear" here. ;)




The test script was used to generate some common file I/O and network I/O events.


Please provide a complete code repository if possible, so that we can at least have a baseline for reproduction.




On all servers, I set up the same monitoring strategy when starting lttng: monitoring all file I/O, network I/O, and some related system calls.
The following table records the number of events generated by the test script per minute; one babeltrace record represents one event.




For some reason the image does not load here. Please provide a text based alternative for this figure.






The numbers are in units of ten thousand events per minute, and were read out after parsing the trace with babeltrace.
In addition, server1 is 1core4G, server2 is 2core8G, server3 is 4core16G, and server4 and server5 are 8core16G.
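
Since the table image did not come through, the per-minute counts described above could be regenerated as text. The following is only a minimal sketch: it assumes the default babeltrace2 text output, where each record starts with a timestamp of the form "[HH:MM:SS.nnnnnnnnn]", and the function names are hypothetical, not part of any lttng or babeltrace tool.

```python
import re
from collections import Counter

# ASSUMPTION: each event record printed by babeltrace2 begins with
# "[HH:MM:SS.nnnnnnnnn]". Lines that do not match are ignored.
_TIMESTAMP = re.compile(r"^\[(\d{2}):(\d{2}):\d{2}\.\d+\]")


def events_per_minute(lines):
    """Count one event per text record, bucketed by (hour, minute)."""
    counts = Counter()
    for line in lines:
        match = _TIMESTAMP.match(line)
        if match:
            hour, minute = int(match.group(1)), int(match.group(2))
            counts[(hour, minute)] += 1
    return counts


def to_table_unit(count):
    """Convert a raw per-minute count to units of ten thousand events."""
    return count / 10_000
```

Fed with the text output of babeltrace2 (for example by passing sys.stdin as "lines"), this yields one count per minute that can be pasted into an email as plain text.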

It can be seen that the average amount of data generated per minute is roughly the same on all servers. However, the CPU usage of the lttng-consumerd process behaves differently on server4 and server5, as I mentioned in my last email.


In addition, CPU usage was recorded using the "top" command.
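
For a reading that is easier to log than interactive "top" output, the same kind of figure can be sampled from /proc on Linux. This is a rough sketch under stated assumptions: the helper names are hypothetical, and the result is a percentage of a single CPU (which is how top reports %CPU in its default mode), so 10% on an 8-core server means a tenth of one core, not of the whole machine.

```python
import os
import time


def cpu_ticks_from_stat(stat_text):
    """Return utime + stime (in clock ticks) from the text of /proc/<pid>/stat."""
    # The command name (field 2) is wrapped in parentheses and may itself
    # contain spaces or parentheses, so split after the LAST ')'.
    fields = stat_text.rsplit(")", 1)[1].split()
    # fields[0] is field 3 (state), so utime (14) and stime (15) are at 11 and 12.
    return int(fields[11]) + int(fields[12])


def cpu_seconds(pid):
    """CPU time consumed so far by the process, in seconds (Linux only)."""
    with open(f"/proc/{pid}/stat") as stat_file:
        ticks = cpu_ticks_from_stat(stat_file.read())
    return ticks / os.sysconf("SC_CLK_TCK")


def cpu_percent(pid, interval=1.0):
    """Average CPU usage over `interval` seconds, as a percentage of one CPU."""
    start_cpu, start_wall = cpu_seconds(pid), time.monotonic()
    time.sleep(interval)
    elapsed = time.monotonic() - start_wall
    return 100.0 * (cpu_seconds(pid) - start_cpu) / elapsed
```

Calling cpu_percent() with the PID of lttng-consumerd at regular intervals on each server would give directly comparable numbers across the 1-core to 8-core machines.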




My tests show that, for the same number of events collected, the lttng-consumerd process consumes more CPU on the 8-core servers.

I would like to know why this is, and what other information you need.



Well, we also want to know why! You will understand that although we develop lttng, we do not always have a quick and easy answer to every problem. Performance-related problems are always tricky.
And we also have to keep in mind that we do not necessarily optimize for low CPU usage on the lttng-consumerd side.


We have to look at what "work" scales with the number of CPUs on the lttng-consumerd side. One such thing is the live timer, which fires at a fixed interval (the default is 1 s, i.e. 1000000 us).


You could test this hypothesis by streaming the trace instead of using the live feature.


lttng create --set-url ....
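
To make the live-timer hypothesis above concrete, here is a toy model of how timer-driven work could grow with the CPU count. It is a sketch, not a description of the actual consumerd implementation: it assumes one stream per CPU per channel and that every timer expiry touches every stream once. The function names and the num_channels parameter are hypothetical.

```python
def stream_visits_per_second(num_cpus, num_channels=1, live_timer_us=1_000_000):
    """Toy model: per-stream operations triggered by the live timer each second."""
    if live_timer_us <= 0:
        raise ValueError("live_timer_us must be positive")
    # ASSUMPTION: one stream per CPU per channel.
    streams = num_cpus * num_channels
    # ASSUMPTION: each timer expiry visits every stream exactly once.
    fires_per_second = 1_000_000 / live_timer_us
    return streams * fires_per_second


def relative_cost(num_cpus, baseline_cpus=1, **kwargs):
    """Timer-driven work on `num_cpus` cores relative to a baseline machine."""
    return (stream_visits_per_second(num_cpus, **kwargs)
            / stream_visits_per_second(baseline_cpus, **kwargs))
```

Under these assumptions an 8-core server does eight times the timer-driven per-stream work of a 1-core server at the default interval. Whether that accounts for the difference observed with "top" is what the streaming comparison is meant to check.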


Cheers



Looking forward to your reply.
thanks,
yuhua.


> -----Original Message-----
> From: "Jonathan Rajotte-Julien" <jonathan.rajotte-julien at efficios.com>
> Sent: 2020-11-27 22:05:48 (Friday)
> To: "熊毓华" <xiongyuhua at zju.edu.cn>
> Cc: lttng-dev at lists.lttng.org
> Subject: Re: [lttng-dev] Some confusion about cpu usage of the lttng-consumerd process
> 
> Hi,
> 
> On Fri, Nov 27, 2020 at 02:39:28PM +0800, 熊毓华 via lttng-dev wrote:
> > Hi,dear.
> > 
> > I have been using lttng to monitor my server these days, but I found something interesting.
> > 
> > The cpu usage of lttng varies with the number of cpu cores of the server.
> 
> Which is a bit expected since more CPU means more "data" source from the point
> of view of lttng hence more "work" overall.
> 
> > 
> > On the server, I create a tracing session in live mode, using "lttng create my-session --live". 
> > 
> > Then I start babeltrace2 and configure it to connect to the relay daemon, using the "--input-format=lttng-live" mode.
> > 
> > I used 5 cloud servers: 1core4G, 2core8G, 4core16G, 8core16G, and 8core16G.
> > 
> > The same test script was executed on each of them to provide the same workload.
> 
> We would need the test script to have some context here of the workload.
> 
> > 
> > As we all know, lttng has 5 processes:
> > 
> > 1.lttng-runas    --daemonize
> > 
> > 2.lttng-runas      -k --consumerd-cmd-sock /var/run/lttng/kconsumerd/command --consumerd-err-sock /var/run/lttng/kconsumerd/error --group tracing
> 
> Based on this you are performing kernel tracing.
> 
> > 
> > 3.lttng-sessiond --daemonize
> > 
> > 4.lttng-relayd -L tcp://localhost:5344
> > 
> > 5.lttng-consumerd  -k --consumerd-cmd-sock /var/run/lttng/kconsumerd/command --consumerd-err-sock /var/run/lttng/kconsumerd/error --group tracing
> > 
> > 
> > The CPU usage of the first four processes is below 2% on all 5 servers, but the lttng-consumerd process is different.
> > 
> > On the 1-core, 2-core, and 4-core servers, the CPU usage of the lttng-consumerd process is below 2%.
> 
> How is the cpu usage measured here?
> 
> > 
> > But on two 8-core machines, the cpu usage of the lttng-consumerd process reached 10% or more.
> 
> Consumerd is responsible for "fetching" data from the ring buffers and "saving"
> it either locally (trace on disk) or remotely (streaming/live session). CPU usage
> should be a bit correlated with the event production rate. Did you have a look at the
> number of events generated for a similar interval?
> 
> > Also, the CPU usage of the babeltrace process is not much different; only the CPU usage of the lttng-consumerd process varies with the number of CPU cores of the server.
> > 
> > Why is this? How should this phenomenon be analyzed?
> > 
> > Looking forward to your reply.
> > 
> > thanks,
> > yuhua
> > 
> > _______________________________________________
> > lttng-dev mailing list
> > lttng-dev at lists.lttng.org
> > https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev
> 
> 
> -- 
> Jonathan Rajotte-Julien
> EfficiOS



-------------- next part --------------
A non-text attachment was scrubbed...
Name: test script.rar
Type: application/octet-stream
Size: 1120 bytes
Desc: not available
URL: <https://lists.lttng.org/pipermail/lttng-dev/attachments/20201128/254099a6/attachment-0001.obj>

