[lttng-dev] Kernel Lock Analysis

aleix arocanon at bsc.es
Thu Apr 5 05:17:57 EDT 2018


Hello List! This is my first message here :)

I would like to share my experience playing with kernel lock
instrumentation, LTTng and Trace Compass, which might be useful to
others when analyzing application behavior.

Currently, the Linux kernel features a set of lock tracepoints which
track lock acquisition and contention [1]:

 - lock_acquire
 - lock_acquired
 - lock_contended
 - lock_release
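
As a rough illustration of how these four events can be combined, here
is a minimal Python sketch. It assumes a hypothetical record layout of
(timestamp_ns, tid, event_name, lock_name), which is not the LTTng
event format, and it assumes that lock_contended marks the start of a
wait and lock_acquired the moment the lock is obtained. The sample
events are made up. Nested or recursive acquisitions of the same lock
by one thread are not handled.

```python
# Hypothetical event records: (timestamp_ns, tid, event_name, lock_name).
# The layout and the values are illustrative assumptions, not trace data.
EVENTS = [
    (100, 1, "lock_acquire", "mmap_sem"),
    (105, 1, "lock_acquired", "mmap_sem"),
    (110, 2, "lock_acquire", "mmap_sem"),
    (112, 2, "lock_contended", "mmap_sem"),
    (150, 1, "lock_release", "mmap_sem"),
    (155, 2, "lock_acquired", "mmap_sem"),
    (170, 2, "lock_release", "mmap_sem"),
]


def lock_intervals(events):
    """Pair events per (tid, lock) into wait and hold intervals.

    Returns two lists of (tid, lock, start_ns, end_ns) tuples:
    waits run from lock_contended to lock_acquired,
    holds run from lock_acquired to lock_release.
    """
    contended_at = {}  # (tid, lock) -> timestamp of lock_contended
    acquired_at = {}   # (tid, lock) -> timestamp of lock_acquired
    waits, holds = [], []
    for ts, tid, name, lock in sorted(events):
        key = (tid, lock)
        if name == "lock_contended":
            contended_at[key] = ts
        elif name == "lock_acquired":
            start = contended_at.pop(key, None)
            if start is not None:  # only contended acquisitions had to wait
                waits.append((tid, lock, start, ts))
            acquired_at[key] = ts
        elif name == "lock_release":
            start = acquired_at.pop(key, None)
            if start is not None:
                holds.append((tid, lock, start, ts))
    return waits, holds


waits, holds = lock_intervals(EVENTS)
```

With the sample above, thread 2 is the only one that waits, because
thread 1 takes the lock without contention.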

However, the current version of LTTng does not instrument these
tracepoints (the code is in the source tree but commented out [2]). The
guys on the LTTng IRC channel helped me enable them again by simply
uncommenting the code.

Thanks to these tracepoints, I could create a couple of Trace Compass
views (also attached to this mail) to easily track kernel lock
contention, which clearly showed me what was going on. Please see the
attached screenshot for an example. The application under analysis is
a parallel Cholesky benchmark run on a server with 56 CPUs. I was
trying to figure out why almost all application threads became blocked
at some point, as seen in the screenshot. The lock view showed that
there was heavy contention on the mm->mmap_sem lock when all threads
tried to allocate memory by calling mmap() and mprotect(), and when
they triggered page faults by writing to the recently mmapped memory.
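
To give an idea of the kind of summary a per-lock view can offer, here
is a minimal Python sketch that ranks locks by total wait time and
reports the peak number of threads waiting at once. It is not the
logic of the attached XML analyses. The interval layout
(tid, lock_name, wait_start_ns, wait_end_ns), the "rq_lock" name and
all the numbers are made up for illustration.

```python
from collections import defaultdict

# Hypothetical wait intervals: (tid, lock_name, wait_start_ns, wait_end_ns).
# Values are invented for the example, not taken from the trace.
WAITS = [
    (1, "mmap_sem", 100, 400),
    (2, "mmap_sem", 150, 450),
    (3, "mmap_sem", 200, 300),
    (4, "rq_lock", 120, 130),
]


def contention_summary(waits):
    """Return [(lock, (total_wait_ns, peak_waiters)), ...], worst lock first."""
    total = defaultdict(int)
    edges = defaultdict(list)  # lock -> [(timestamp, +1 or -1), ...]
    for _tid, lock, start, end in waits:
        total[lock] += end - start
        edges[lock].append((start, 1))   # a thread starts waiting
        edges[lock].append((end, -1))    # a thread stops waiting
    summary = {}
    for lock, points in edges.items():
        current = peak = 0
        # Tuples sort by timestamp first; at equal timestamps the -1
        # edges come before the +1 edges, so back-to-back waits do not
        # count as overlapping.
        for _ts, delta in sorted(points):
            current += delta
            peak = max(peak, current)
        summary[lock] = (total[lock], peak)
    return sorted(summary.items(), key=lambda kv: kv[1][0], reverse=True)


ranking = contention_summary(WAITS)
```

A lock with both a large total wait time and a high peak of
simultaneous waiters is a good candidate for the kind of stall seen in
the screenshot.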

Hence, what I would like to point out is how useful it has been for me
to enable the LTTng lock tracepoints. I think it would be great if
they could be added back into mainline. If this can be done, I
think it would make sense to propose to the Trace Compass people that
they include a kernel lock view.

It looks like the lock events are by far the most frequent events,
quickly filling the LTTng buffers. However, they are only generated if
the kernel is compiled with CONFIG_LOCK_STAT, so this should not annoy
the unsuspecting user.
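
For anyone who wants to check whether their kernel was built with this
option, here is a minimal Python sketch that looks for an option set
to "y" in the text of a kernel configuration file. The helper name
kernel_option_enabled and the sample text are made up for
illustration.

```python
import re


def kernel_option_enabled(config_text, option):
    """Return True if `option` is set to y in the given kernel config text."""
    # Match a whole line such as "CONFIG_LOCK_STAT=y". Lines like
    # "# CONFIG_LOCK_STAT is not set" do not match.
    pattern = re.compile(r"^%s=y\s*$" % re.escape(option), re.MULTILINE)
    return pattern.search(config_text) is not None


# Made-up sample of a kernel config file.
SAMPLE = """\
CONFIG_LOCKDEP=y
# CONFIG_LOCK_STAT is not set
"""

lock_stat_on = kernel_option_enabled(SAMPLE, "CONFIG_LOCK_STAT")
```

On many distributions the configuration text can be read from
/boot/config-<kernel release>, or from /proc/config.gz when the kernel
exposes it. Where it lives on a given system is an assumption to
verify.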

I hope my experience is of help! Thanks a lot for your work! :D

 [1] See Documentation/locking/lockstat.txt in the Linux kernel source
     for more information.
 [2] As Compudj pointed out, the reason is found in this conversation:
     https://lists.lttng.org/pipermail/lttng-dev/2012-December/019256.html


-------------- next part --------------
A non-text attachment was scrubbed...
Name: cholesky-lock-analysis.png
Type: image/png
Size: 150163 bytes
Desc: not available
URL: <https://lists.lttng.org/pipermail/lttng-dev/attachments/20180405/12de6c27/attachment-0001.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: lock_contention_analysis.xml
Type: application/xml
Size: 4479 bytes
Desc: not available
URL: <https://lists.lttng.org/pipermail/lttng-dev/attachments/20180405/12de6c27/attachment-0002.xml>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: per_lock_analysis.xml
Type: application/xml
Size: 2909 bytes
Desc: not available
URL: <https://lists.lttng.org/pipermail/lttng-dev/attachments/20180405/12de6c27/attachment-0003.xml>

