[ltt-dev] use percpu variable for ltt_nesting

Jiaying Zhang jiayingz at google.com
Wed Sep 3 01:25:34 EDT 2008

Hi Mathieu,

I found that lttng sometimes performs very poorly on multi-CPU systems, and
the more processors the system has, the more overhead I see with lttng
enabled. Here are the results I collected with the tbench benchmark:

single processor:
   lttng disabled: 236.07 MB/sec
   lttng enabled:  210.569 MB/sec

16 processors:
   lttng disabled:  4173.83 MB/sec
   lttng enabled:  1766.77 MB/sec

After playing with the code for a while and asking around, I found that this
issue is caused by CPU contention while updating the ltt_nesting variable
used in ltt/ltt-serialize.c:ltt_vtrace.

Attached is a patch that changes ltt_nesting into a per_cpu variable. With
the patch applied, tbench performance with lttng enabled reaches about
3600 MB/sec on the 16-processor system.
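
For readers without the attachment, here is a rough userspace sketch of the
idea. It is not the patch itself: the names nesting_slot, NR_SLOTS,
CACHE_LINE, nesting_enter, nesting_exit and nesting_depth are made up for
illustration, and the 64-byte cache line is an assumption. In the kernel the
same effect would normally come from the per-CPU variable macros, with the
slot chosen by the current CPU rather than passed in by the caller.

```c
#include <assert.h>

#define NR_SLOTS   16   /* hypothetical: stand-in for the number of CPUs */
#define CACHE_LINE 64   /* assumed cache line size in bytes */

/*
 * One nesting counter per slot. The alignment pads each slot out to a full
 * cache line, so an update from one CPU does not invalidate the line that
 * holds another CPU's counter.
 */
struct nesting_slot {
    _Alignas(CACHE_LINE) unsigned int nesting;
};

struct nesting_slot nesting_slots[NR_SLOTS];

/* Entering a trace call: bump only the caller's own counter. */
void nesting_enter(unsigned int slot)
{
    assert(slot < NR_SLOTS);
    nesting_slots[slot].nesting++;
}

/* Leaving a trace call: drop only the caller's own counter. */
void nesting_exit(unsigned int slot)
{
    assert(slot < NR_SLOTS);
    assert(nesting_slots[slot].nesting > 0);
    nesting_slots[slot].nesting--;
}

/* Current nesting depth for one slot. */
unsigned int nesting_depth(unsigned int slot)
{
    assert(slot < NR_SLOTS);
    return nesting_slots[slot].nesting;
}
```

With a single shared counter, every traced event on every CPU writes to the
same cache line. With one padded slot per CPU, each CPU writes only to its
own line, which is consistent with the overhead growing with processor count
before the change.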

-------------- next part --------------
A non-text attachment was scrubbed...
Name: ltt_nesting-percpu.patch
Type: text/x-patch
Size: 2103 bytes
Desc: not available
URL: <http://lists.casi.polymtl.ca/pipermail/lttng-dev/attachments/20080902/061f93a9/attachment-0003.bin>