[ltt-dev] Bug in LTTv statistics ?

Fri Jul 16 13:48:10 EDT 2010

>
> > > >
> > > > I try to understand why on one trace, is there still
> "page_fault_entry"
> > > in
> > > > TRAP mode. I checked the previous correction, and while it find the
> right
> > > > states for all events except those that LTTv put in "TRAP" mode. Am I
> > > > missing something else or is it a glitch in LTTv?
> > >
> > > No sure what you mean. Can you try to explain a little more ?
> > >
> >
> > I've made tests to verify my assumption. I load textdump in a custom
> parser
> > and it give me the mode of the "page_fault_entry" events. I then compare
> the
> > total of each mode to the number of "page_fault_entry" in each mode in
> the
> > statistics view. For one trace, the result was perfect but for another
> > trace, my parser didn't find any "page_fault_entry" events in the "TRAP"
> > mode like LTTv stats did. The others mode had the right total except one
> > whose total given by the parser was the one given by LTTv + the "page
> > fault_entry" in TRAP.
> >
> > Page_fault_entry by mode:
> > LTTv    Parser
> > 1          1          MODE_UNKNOWN
> > 971      999       USER_MODE             999(parser) = 971(LTTv)
> > +28(LTTv-TRAP)
> > 26        26         SYSCALL
> > 28        0           TRAP
>
> Hrm hrm. I'd guess that the nested trap (a trap within a trap) case, or
> the major page faults are not handled correctly. A major page fault
> does:
>
> userspace or kernel
>  - trap (page fault entry)
>  - reenable interrupts
>  - allow to be scheduled out while waiting for I/
> ...
>  - rescheduled
>  - disable interrupts
>  - iret (page fault exit)
>
> So you could look into those corner-cases to see if we did something
> fishy in there.
>
>  I've found something strange related :
Here is a part of the event list in the textdump (the first number being the
event number):
197217:kernel.page_fault_entry: 7013.274341280 (kernel_0), 1061, 1061,
nautilus, , 971, 0x0, TRAP { 0xb6ad9124, 0xb3e89000, 14, 0 }
197218:mm.add_to_page_cache: 7013.274351397 (mm_0), 1061, 1061, nautilus, ,
971, 0x0, TRAP { 299160, 8388609 }
197219:mm.add_to_page_cache: 7013.274358043 (mm_0), 1061, 1061, nautilus, ,
971, 0x0, TRAP { 299160, 8388609 }
197220:block.remap: 7013.274360846 (block_0), 1061, 1061, nautilus, , 971,
0x0, TRAP { 8388608, 9871488, 8388609, 8192, 0, 0 }
197221:block.bio_queue: 7013.274362766 (block_0), 1061, 1061, nautilus, ,
971, 0x0, TRAP { 9873536, 8192, 0, 0 }
197222:block.getrq_bio: 7013.274367878 (15megs/block_0), 1061, 1061,
nautilus, , 971, 0x0, TRAP { 9873536, 8192, 0, 0 }
197223:block.rq_insert_fs: 7013.274371287 (Traces/15megs/block_0), 1061,
1061, nautilus, , 971, 0x0, TRAP { 9873536, 0, 0 }
197224:block.rq_issue_fs: 7013.274380821 (block_0), 1061, 1061, nautilus, ,
971, 0x0, TRAP { 9873536, 0, 0 }
197225:block.unplug_io: 7013.274476634 (block_0), 1061, 1061, nautilus, ,
971, 0x0, TRAP { 2 }
198050:kernel.sched_schedule: 7013.306179290
(/home/francois/workspace/Traces/15megs/kernel_0), 1061, 1061, nautilus, ,
971, 0x0, TRAP { 0, 1061, 0 }
198054:kernel.irq_exit: 7013.306261518
(/home/francois/workspace/Traces/15megs/kernel_0), 1061, 1061, nautilus, ,
971, 0x0, TRAP { 1 }
198057:kernel.softirq_exit: 7013.306304768
(/home/francois/workspace/Traces/15megs/kernel_0), 1061, 1061, nautilus, ,
971, 0x0, TRAP { 1 [run_timer_softirq+0x0/0x2c0] }
198058:kernel.page_fault_exit: 7013.306351291
(/home/francois/workspace/Traces/15megs/kernel_0), 1061, 1061, nautilus, ,
971, 0x0, USER_MODE { 516 }

Or, LTTv stats show 10 event unplug_io in SYSCALL, 1 in USER_MODE and 0 in
TRAP. (All the other events listed here are absent of TRAP too). It appear
that all these event take the USER_MODE which is the mode before and after.

>
> >
> > I've also made another discovery. It seems that the time between the
> > "page_fault_entry" or "page_fault_exit" and the previous event is never
> > added in the CPU_TIME. Is this intended?
>
> This is why we have the "cumulative cpu time", which keeps track of the
> amount of cpu time used by a context _and_ its nested contexts. I'm not
> 100% sure it work well though.
>
> Oops, I misexplained myself.

Suppose the sequence:
...
A:syscall_exit
B:page_fault_entry
C:unplug_io
D:page_fault_exit
E:syscall_entry
...

Then the time from A to B and from C to D seems to not be used for
calculation of cpu_time. The time from B to C and from D to E are used.
I've got those result experimently by excluding those time of my program and
remarquing the result where closer. Many lower sections have seen there
results becoming totally egal between my program and LTTv.
Is this behavior normal? It seems strange to me because it would meen that
the cpu_time will not be equal to the time between the first and the last
event when the trace occur only on one core/cpu. (LTTv result is not equal
to this time.)

Thanks for taking the time to answer all my questions.

-- 
François Godin
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.casi.polymtl.ca/pipermail/lttng-dev/attachments/20100716/9a0dfda6/attachment-0003.htm>