<p>
Hi,
</p>
<p>
I have attached my test scripts.
</p>
<p>
You can run the script directly after creating the trace session with the "--live" option on the 8-core server;
</p>
<p>
you will then see the CPU usage of the lttng-consumerd process reach 10% or more.
</p>
<p>
<br>
</p>
<p>
As for LTTng's streaming mode, I tested it earlier and it worked well.
</p>
<p>
When I create the trace session with "lttng create my-session --output=/tmp/my-kernel-trace" or with "lttng create my-session --set-url=net://ip",
</p>
<p>
the number of CPUs does not seem to affect the CPU usage of lttng-consumerd.
</p>
<p>
It seems that only live mode is affected.
</p>
<p>
<br>
</p>
<p>
thanks,
</p>
<p>
yuhua
</p>
<br>
<blockquote name="replyContent" class="ReferenceQuote" style="padding-left:5px;margin-left:5px;border-left:#b6b6b6 2px solid;margin-right:0;">
-----Original Message-----<br>
<b>From:</b><span id="rc_from">"Jonathan Rajotte-Julien" <jonathan.rajotte-julien@efficios.com></span><br>
<b>Sent:</b><span id="rc_senttime">2020-11-28 00:04:23 (Saturday)</span><br>
<b>To:</b> "熊毓华" <xiongyuhua@zju.edu.cn><br>
<b>Cc:</b> lttng-dev <lttng-dev@lists.lttng.org><br>
<b>Subject:</b> Re: [lttng-dev] Some confusion about cpu usage of the lttng-consumerd process<br>
<br>
<div style="font-family:arial, helvetica, sans-serif;font-size:12pt;color:#000000;">
<div>
<br>
</div>
<hr id="zwchr" data-marker="__DIVIDER__">
<div data-marker="__HEADERS__">
<blockquote style="border-left:2px solid #1010FF;margin-left:5px;padding-left:5px;color:#000;font-weight:normal;font-style:normal;text-decoration:none;font-family:Helvetica,Arial,sans-serif;font-size:12pt;">
<b>From: </b>"熊毓华" <xiongyuhua@zju.edu.cn><br>
<b>To: </b>"Jonathan Rajotte-Julien" <jonathan.rajotte-julien@efficios.com>, "lttng-dev" <lttng-dev@lists.lttng.org><br>
<b>Sent: </b>Friday, November 27, 2020 10:32:07 AM<br>
<b>Subject: </b>Re: Re: [lttng-dev] Some confusion about cpu usage of the lttng-consumerd process<br>
</blockquote>
</div>
<div data-marker="__QUOTED_TEXT__">
<blockquote style="border-left:2px solid #1010FF;margin-left:5px;padding-left:5px;color:#000;font-weight:normal;font-style:normal;text-decoration:none;font-family:Helvetica,Arial,sans-serif;font-size:12pt;">
<div>
Hi,Dear.
</div>
</blockquote>
<div>
<br>
</div>
<div>
Side note, you can remove the "Dear" here. ;)
</div>
<div>
<br data-mce-bogus="1">
</div>
<blockquote style="border-left:2px solid #1010FF;margin-left:5px;padding-left:5px;color:#000;font-weight:normal;font-style:normal;text-decoration:none;font-family:Helvetica,Arial,sans-serif;font-size:12pt;">
<div>
<br>
<br>
The test script generates some common file I/O and network I/O events.
</div>
</blockquote>
<div>
<br>
</div>
<div>
Please provide a complete code repository if possible, so that we at least have a baseline for reproduction.
</div>
<div>
<br data-mce-bogus="1">
</div>
<blockquote style="border-left:2px solid #1010FF;margin-left:5px;padding-left:5px;color:#000;font-weight:normal;font-style:normal;text-decoration:none;font-family:Helvetica,Arial,sans-serif;font-size:12pt;">
<div>
<br>
<br>
On all servers I start lttng with the same monitoring configuration: all file I/O and network I/O events plus some related system calls. <br>
The following table records the number of events generated by the test script per minute; one babeltrace record represents one event.<br>
<p>
<img width="800" height="256" title="" align="" alt="" style="white-space:normal;" src="https://mail.zju.edu.cn/coremail/s/json?func=mbox:getComposeData&sid=*&composeId=1606487214111&attachId=2" saveddisplaymode="">
</p>
</div>
</blockquote>
<div>
<br>
</div>
<div>
For some reason the image does not load here. Please provide a text-based alternative for this figure.
</div>
<div>
<br data-mce-bogus="1">
</div>
<blockquote style="border-left:2px solid #1010FF;margin-left:5px;padding-left:5px;color:#000;font-weight:normal;font-style:normal;text-decoration:none;font-family:Helvetica,Arial,sans-serif;font-size:12pt;">
<div>
<p>
<br>
</p>
<p>
The numbers are in units of ten thousand events per minute, and were read from the output of babeltrace.<br>
In addition, server1 has 1 core and 4 GB of memory, server2 has 2 cores and 8 GB, server3 has 4 cores and 16 GB, and server4 and server5 each have 8 cores and 16 GB.<br>
<br>
The average amount of data generated per minute is roughly the same on all servers. However, the CPU usage of the lttng-consumerd process behaves differently on server4 and server5, as I mentioned in my last email.
</p>
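<p>
The per-minute counts described above could be reproduced with a small script along these lines. This is only a sketch: it assumes babeltrace's default text output, where each event line begins with a timestamp such as "[13:24:53.123456789]", and the function names are hypothetical.
</p>

```python
import re
from collections import Counter

# Assumption: default babeltrace text output, one event per line, each line
# starting with a wall-clock timestamp like "[13:24:53.123456789]".
TIMESTAMP = re.compile(r"^\[(\d{2}):(\d{2}):\d{2}\.\d+\]")


def events_per_minute(lines):
    """Return a Counter mapping "HH:MM" to the number of event lines seen."""
    counts = Counter()
    for line in lines:
        match = TIMESTAMP.match(line)
        if match:  # ignore anything that is not an event record
            counts[f"{match.group(1)}:{match.group(2)}"] += 1
    return counts


def in_ten_thousands(count):
    """Convert a raw event count to units of ten thousand events."""
    return count / 10_000
```

<p>
Passing an open file or sys.stdin to events_per_minute works, since it only iterates over lines. Output produced with a date prefix or another timestamp format would need a different pattern.
</p>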
<p>
<br>
The CPU usage was recorded with the "top" command.
</p>
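<p>
For a measurement that is easier to share than a "top" screenshot, the same figure can be approximated from /proc. The sketch below is an assumption-laden alternative, not what was used in the test: it reads utime and stime from /proc/&lt;pid&gt;/stat (Linux only) and reports usage as a percentage of one core, which is how top reports it by default. The function names are hypothetical.
</p>

```python
import os
import time


def cpu_ticks(stat_line):
    """Return utime + stime (in clock ticks) from a /proc/<pid>/stat line."""
    # The command name is wrapped in parentheses and may contain spaces,
    # so split on the last ")" before indexing the remaining fields.
    fields = stat_line.rsplit(")", 1)[1].split()
    # fields[0] is the process state (field 3 in proc(5)), so utime
    # (field 14) and stime (field 15) are at indexes 11 and 12.
    return int(fields[11]) + int(fields[12])


def cpu_percent(ticks_before, ticks_after, elapsed_s, ticks_per_s):
    """CPU usage over the interval, as a percentage of a single core."""
    return 100.0 * (ticks_after - ticks_before) / (ticks_per_s * elapsed_s)


def sample(pid, interval_s=60.0):
    """Measure the average CPU usage of `pid` over `interval_s` seconds."""
    ticks_per_s = os.sysconf("SC_CLK_TCK")
    with open(f"/proc/{pid}/stat") as f:
        before = cpu_ticks(f.read())
    time.sleep(interval_s)
    with open(f"/proc/{pid}/stat") as f:
        after = cpu_ticks(f.read())
    return cpu_percent(before, after, interval_s, ticks_per_s)
```

<p>
Calling sample() with the PID of lttng-consumerd on each server, over the same interval, would give numbers that can be pasted into an email as plain text.
</p>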
<p>
<br>
</p>
<p>
My tests show that, for the same number of events collected, the lttng-consumerd process consumes more CPU on the 8-core servers.<br>
<br>
I would like to know why this happens. What other information do you need?
</p>
</div>
</blockquote>
<div>
<br>
</div>
<div>
Well, we also want to know why! You will understand that although we develop LTTng, we do not always have a quick and easy answer to every problem. Performance-related problems are always tricky.
</div>
<div>
We also have to keep in mind that we do not necessarily optimize for low CPU usage on the lttng-consumerd side.
</div>
<div>
<br data-mce-bogus="1">
</div>
<div>
We have to look at what "work" scales with the number of CPUs on the lttng-consumerd side. One such thing is the live timer, which fires at a fixed interval (the default is 1 s, i.e. 1000000 us).
</div>
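<div>
To make the hypothesis concrete, here is a back-of-the-envelope model, not a description of the actual consumerd code. It assumes one stream per CPU per channel (as with per-CPU kernel ring buffers) and a fixed amount of work per stream on every live timer tick; the function name and the cost model are hypothetical.
</div>

```python
def live_timer_stream_checks_per_second(n_cpus, n_channels=1,
                                        live_timer_us=1_000_000):
    """Per-second number of per-stream checks under the assumed model.

    Assumptions (not taken from the consumerd source):
    - each channel has one stream per CPU;
    - every live timer tick touches every stream once.
    """
    streams = n_cpus * n_channels
    ticks_per_second = 1_000_000 / live_timer_us
    return streams * ticks_per_second


# With the default 1 s timer and one channel:
#   1 core  -> 1 check per second
#   8 cores -> 8 checks per second
```

<div>
Under this model the timer-driven work grows linearly with the core count, so by itself it would not obviously account for a jump from under 2% to over 10%. Comparing runs with a longer interval (for example "lttng create my-session --live=5000000") against the default would show how much of the usage follows the timer.
</div>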
<div>
<br data-mce-bogus="1">
</div>
<div>
You could test this hypothesis by streaming the trace instead of using the live feature.
</div>
<div>
<br data-mce-bogus="1">
</div>
<div>
lttng create --set-url ....
</div>
<div>
<br data-mce-bogus="1">
</div>
<div>
Cheers
</div>
<blockquote style="border-left:2px solid #1010FF;margin-left:5px;padding-left:5px;color:#000;font-weight:normal;font-style:normal;text-decoration:none;font-family:Helvetica,Arial,sans-serif;font-size:12pt;">
<div>
<p>
<br>
<br>
Looking forward to your reply.<br>
thanks,<br>
yuhua.
</p>
<br>
> -----Original Message-----<br>
> From: "Jonathan Rajotte-Julien" <<span class="Object" role="link" id="OBJ_PREFIX_DWT61_ZmEmailObjectHandler"><span class="Object" role="link" id="OBJ_PREFIX_DWT68_ZmEmailObjectHandler">jonathan.rajotte-julien@efficios.com</span></span>><br>
> Sent: 2020-11-27 22:05:48 (Friday)<br>
> To: "熊毓华" <<span class="Object" role="link" id="OBJ_PREFIX_DWT63_ZmEmailObjectHandler"><span class="Object" role="link" id="OBJ_PREFIX_DWT69_ZmEmailObjectHandler">xiongyuhua@zju.edu.cn</span></span>><br>
> Cc: <span class="Object" role="link" id="OBJ_PREFIX_DWT64_ZmEmailObjectHandler"><span class="Object" role="link" id="OBJ_PREFIX_DWT70_ZmEmailObjectHandler">lttng-dev@lists.lttng.org</span></span><br>
> Subject: Re: [lttng-dev] Some confusion about cpu usage of the lttng-consumerd process<br>
> <br>
> Hi,<br>
> <br>
> On <span class="Object" role="link" id="OBJ_PREFIX_DWT65_com_zimbra_date"><span class="Object" role="link" id="OBJ_PREFIX_DWT71_com_zimbra_date">Fri, Nov 27</span></span>, 2020 at 02:39:28PM +0800, 熊毓华 via lttng-dev wrote:<br>
> > Hi,dear.<br>
> > <br>
> > I have been using lttng to monitor my server these days, but I found something interesting.<br>
> > <br>
> > The cpu usage of lttng varies with the number of cpu cores of the server.<br>
> <br>
> This is somewhat expected, since more CPUs mean more "data" sources from the point<br>
> of view of lttng, hence more "work" overall.<br>
> <br>
> > <br>
> > On the server, I create a tracing session in live mode, using "lttng create my-session --live". <br>
> > <br>
> > Then I start babeltrace2 and configure it to connect to the relay daemon, using the "--input-format=lttng-live" mode.<br>
> > <br>
> > I used 5 cloud servers: 1core4G, 2core8G, 4core16G, 8core16G, and 8core16G.<br>
> > <br>
> > The same test script was run on each of them to provide the same workload.<br>
> <br>
> We would need the test script to have some context on the workload.<br>
> <br>
> > <br>
> > As we all know, lttng has 5 processes:<br>
> > <br>
> > 1.lttng-runas --daemonize<br>
> > <br>
> > 2.lttng-runas -k --consumerd-cmd-sock /var/run/lttng/kconsumerd/command --consumerd-err-sock /var/run/lttng/kconsumerd/error --group tracing<br>
> <br>
> Based on this you are performing kernel tracing.<br>
> <br>
> > <br>
> > 3.lttng-sessiond --daemonize<br>
> > <br>
> > 4.lttng-relayd -L tcp://localhost:5344<br>
> > <br>
> > 5.lttng-consumerd -k --consumerd-cmd-sock /var/run/lttng/kconsumerd/command --consumerd-err-sock /var/run/lttng/kconsumerd/error --group tracing<br>
> > <br>
> > <br>
> > The CPU usage of the first four processes is below 2% on the 5 servers, but the lttng-consumerd process is different.<br>
> > <br>
> > On the 1-core, 2-core, and 4-core servers, the CPU usage of the lttng-consumerd process is below 2%.<br>
> <br>
> How is the cpu usage measured here?<br>
> <br>
> > <br>
> > But on two 8-core machines, the cpu usage of the lttng-consumerd process reached 10% or more.<br>
> <br>
> Consumerd is responsible for "fetching" data from the ring buffers and "saving"<br>
> it either locally (trace on disk) or remotely (streaming/live session). CPU usage<br>
> should be somewhat correlated with the event production rate. Did you have a look at the<br>
> number of events generated over a similar interval?<br>
> <br>
> > The CPU usage of the babeltrace process is not much different; only the CPU usage of the lttng-consumerd process varies with the number of CPU cores of the server.<br>
> > <br>
> > Why is this? How should this behavior be analyzed?<br>
> > <br>
> > Looking forward to your reply.<br>
> > <br>
> > thanks,<br>
> > yuhua<br>
> > <br>
> > _______________________________________________<br>
> > lttng-dev mailing list<br>
> > <span class="Object" role="link" id="OBJ_PREFIX_DWT66_ZmEmailObjectHandler"><span class="Object" role="link" id="OBJ_PREFIX_DWT72_ZmEmailObjectHandler">lttng-dev@lists.lttng.org</span></span><br>
> > <span class="Object" role="link" id="OBJ_PREFIX_DWT67_com_zimbra_url"><span class="Object" role="link" id="OBJ_PREFIX_DWT73_com_zimbra_url"><a target="_blank" href="https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev">https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev</a></span></span><br>
> <br>
> <br>
> -- <br>
> Jonathan Rajotte-Julien<br>
> EfficiOS<br>
<br>
</div>
<br>
</blockquote>
</div>
</div>
</blockquote>