[lttng-dev] [diamon-discuss] My experience on perf, CTF and TraceCompass, and some suggection.

Wang Nan wangnan0 at huawei.com
Tue Jan 27 01:31:42 EST 2015


On 2015/1/27 13:31, Jérémie Galarneau wrote:
> On Mon, Jan 26, 2015 at 7:54 PM, Wang Nan <wangnan0 at huawei.com> wrote:
>> On 2015/1/27 0:25, Jérémie Galarneau wrote:
>>> On Fri, Jan 23, 2015 at 4:35 AM, Wang Nan <wangnan0 at huawei.com> wrote:
>>>> Hi folks,
>>>>
>>>> I'd like to share my first tuning experience with perf, ctf and
>>>> TraceCompass here, and I hope my experience helpful to diamon.org. Most
>>>> part of this mail is talking about my work. If you don't
>>>> interest in it, you can directly jump to conclusion part.
>>>>
>>>> *My Task*
>>>>
>>>> What I'm working on is finding the reason why CPU idle rate is high when
>>>> we benchmarking a database. I think it should be a very simple task:
>>>> tracing scheduling and system calls, finding the previous syscall issued
>>>> before idle, then based on statistics, collecting some user spaces call
>>>> stack, I can give an answer. I use perf to collect trace,
>>>> perf-convert-to-ctf to get ctf output and TraceCompass for
>>>> visualization.
>>>>
>>>>
>>>> *My Experience*
>>>>
>>>> First of all I use perf to collect trace:
>>>>
>>>>  # perf record -a -e sched:* -e raw_syscalls:* sleep 1
>>>>
>>>> then
>>>>
>>>>  # perf data convert --to-ctf out.ctf
>>>>
>>>> Which is simple. However, raw_syscalls:* tracepoints export less
>>>> information than syscalls:* tracepoints. Without them I have to manually
>>>> find syscall name from syscall id. I prefer to use:
>>>>
>>>>  # perf record -a -e sched:* -e syscalls:* sleep 1
>>>>
>>>> However there are some bugs and I have to make some patches. They are
>>>> posted and being disscussed currently, those bugs are still exist
>>>> upstream.
>>>>
>>>> Then I need to convert perf.data to ctf. It tooks 140.57s to convert
>>>> 2598513 samples, which are collected during only 1 second execution. My
>>>> working server has 64 2.0GHz Intel Xeon cores, but perf conversion
>>>> utilizes only 1 of them. I think this is another thing can be improved.
>>>
>>> Thanks for taking the time to do this write-up!
>>>
>>> Would it be possible to profile perf-to-ctf so we can spot the bottleneck?
>>>
>>> Regards,
>>> Jérémie
>>>
>>
>> We should, but I don't have enough time to do this. In addition, I think perf-to-ctf
>> conversion is possible to be parallelized without too much work. I suggest to consider
>> code tuning after we done that at perf side. What do you think?
> 
> I'm not sure I agree with the "without too much work" statement since
> Babeltrace's CTF Writer API is not thread-safe at this moment. As
> Alexandre previously mentionned, the size of the resulting CTF trace
> could be a good indication of whether or not IO is the culprit here.
> 
> We have never done any optimization work on Babeltrace and CTF Writer.
> We could be smarter on a number of front, notably with regards to
> memory allocation; there is currently no pooling of allocated objects
> (events, fields, types, etc.). We'll have to take a look at hard data
> before getting our hands dirty. However, I'm sure there are tons of
> optimization opportunities.
> 
> Jérémie
> 

Agree. I'll try to tune it when I have time, then we can get another
tuning experience.

Thank you.

>>
>>>>
>>>> The next step is visualization. Output ctf trace can be opened with
>>>> TraceCompass without problem. The most important views for me should be
>>>> resources view (I use them to check CPU usage) and control flow view (I
>>>> use them to check thread activities).
>>>>
>>>> The first uncomfortable thing is TraceCompass' slow response time. For
>>>> the trace I mentioned above, on resource view, after I click on CPU
>>>> idle area, I have to wait more than 10 seconds for event list updating
>>>> to get the previous event before the idle area.
>>>>
>>>> Then I found through resources view that perf itself tooks lots of CPU
>>>> time. In my case 33.5% samples are generated by perf itself. One core is
>>>> dedicated to perf and never idle or taken by others. I think this should
>>>> be another thing needs to be improved: perf should give a way to
>>>> blacklist itself when tracing all CPUs.
>>>>
>>>> TraceCompass doesn't recognize syscall:* tracepoints as CPU status
>>>> changing point. I have to also catch raw_syscall:*, and which doubles
>>>> the number of samples.
>>>>
>>>> Finally I found the syscall which cause idle. However I need to write a
>>>> script to do statistics. TraceCompass itself is lack a mean to count
>>>> different events in my way.
>>>>
>>>> The next thing I should do is to find the calltrace which issue the
>>>> syscall. This time TraceCompass won't help, mostly because perf
>>>> convertion now doesn't support converting calltrace.
>>>>
>>>> *Conclusion*
>>>>
>>>> I suggest perf and TraceCompass to think about following improvements:
>>>>
>>>>  1. Reducing the cost of perf recording. There are one third events are
>>>>     generated by perf itself in my case. Is it possible that perf could
>>>>     provide an ability that blacklist itself and collect all other
>>>>     events?
>>>>
>>>>  2. Improving perf converting performance. Converting perf.data to CTF is
>>>>     slow, but it should be offline most of the time. We can utilize the
>>>>     abilities multi-core server to make it working in parallel.
>>>>
>>>>  3. Improving TraceCompass responding performance, especially when
>>>>     synchronizing different views.
>>>>
>>>>  4. Support converting userspace call trace. I think perf side should already
>>>>     have a plan on it.
>>>>
>>>>  5. Ad-Hoc visualization and statistics. Currently TraceCompass only
>>>>     support dwaring pre-defined events and processes. When I try to
>>>>     capture syscalls:*, I won't get benefit from TraceCompass because it
>>>>     doesn't know them. I believe that during system tuning we will
>>>>     finally get somewhere unable to be pre-defined by TraceCompass
>>>>     designer. Therefore give users abilities to define their own events
>>>>     and model should be much helpful.
>>>>
>>>> Thank you.
>>>>
>>>> _______________________________________________
>>>> diamon-discuss mailing list
>>>> diamon-discuss at lists.linuxfoundation.org
>>>> https://lists.linuxfoundation.org/mailman/listinfo/diamon-discuss
>>>
>>>
>>>
>>
>>
> 
> 
> 





More information about the lttng-dev mailing list