[lttng-dev] Changed scheduling when using lttng

Mats Liljegren liljegren.mats2 at gmail.com
Thu Apr 25 04:23:02 EDT 2013


I'm doing a benchmark application which intends to benchmark the
effects of using the full nohz mode. I initially tried ftrace, but it
had too much overhead, so I tried lttng instead. While I'm happy with
the overhead in lttng, it has another problem: My application never
executes in full nohz mode.

In my investigation of why, I came up with the following two logs,
first using ftrace with sched_switch event enabled:

       testapp-2048  [001] d... 50744.932281: sched_switch:
prev_comm=testapp prev_pid=2048 prev_prio=139 prev_state=R ==>
next_comm=kworker/1:1 next_pid=25 next_prio=120
     kworker/1:1-25    [001] d... 50744.932312: sched_switch:
prev_comm=kworker/1:1 prev_pid=25 prev_prio=120 prev_state=S ==>
next_comm=ksoftirqd/1 next_pid=14 next_prio=120
>>>  ksoftirqd/1-14    [001] d... 50744.932343: sched_switch: prev_comm=ksoftirqd/1 prev_pid=14 prev_prio=120 prev_state=S ==> next_comm=testapp next_pid=2048 next_prio=139
<<<    testapp-2048  [001] d... 50747.789948: sched_switch:
prev_comm=testapp prev_pid=2048 prev_prio=139 prev_state=D ==>
next_comm=kworker/1:1 next_pid=25 next_prio=120
     kworker/1:1-25    [001] d... 50747.789978: sched_switch:
prev_comm=kworker/1:1 prev_pid=25 prev_prio=120 prev_state=S ==>
next_comm=testapp next_pid=2048 next_prio=139
       testapp-2048  [001] d... 50747.799103: sched_switch:
prev_comm=testapp prev_pid=2048 prev_prio=139 prev_state=R ==>
next_comm=kworker/1:1 next_pid=25 next_prio=120

The line ">>>" is when the testapp is starting its benchmarking, and
"<<<" when it has finished. As can be seen, the execution is done
uninterrupted by any schedule switched for the whole 30s period.

The second test was with lttng with sched_switch tracepoint enabled:

[07:20:04.813293457] (+0.000823975) pandaboard sched_switch: { cpu_id
= 1 }, { prev_comm = "swapper/1", prev_tid = 0, prev_prio = 20,
prev_state = 0, next_comm = "testapp", next_tid = 3027, next_prio = 39
}
[07:20:05.062438964] (+0.159973144) pandaboard sched_switch: { cpu_id
= 1 }, { prev_comm = "testapp", prev_tid = 3027, prev_prio = 39,
prev_state = 0, next_comm = "ksoftirqd/1", next_tid = 14, next_prio =
20 }
[07:20:05.062469482] (+0.000030518) pandaboard sched_switch: { cpu_id
= 1 }, { prev_comm = "ksoftirqd/1", prev_tid = 14, prev_prio = 20,
prev_state = 1, next_comm = "testapp", next_tid = 3027, next_prio = 39
}
[07:20:05.312438964] (+0.249969482) pandaboard sched_switch: { cpu_id
= 1 }, { prev_comm = "testapp", prev_tid = 3027, prev_prio = 39,
prev_state = 0, next_comm = "ksoftirqd/1", next_tid = 14, next_prio =
20 }
[07:20:05.312438964] (+0.000000000) pandaboard sched_switch: { cpu_id
= 1 }, { prev_comm = "ksoftirqd/1", prev_tid = 14, prev_prio = 20,
prev_state = 1, next_comm = "testapp", next_tid = 3027, next_prio = 39
}
[07:20:05.682434082] (+0.369995118) pandaboard sched_switch: { cpu_id
= 1 }, { prev_comm = "testapp", prev_tid = 3027, prev_prio = 39,
prev_state = 0, next_comm = "kworker/1:1", next_tid = 25, next_prio =
20 }
[07:20:05.682464599] (+0.000030517) pandaboard sched_switch: { cpu_id
= 1 }, { prev_comm = "kworker/1:1", prev_tid = 25, prev_prio = 20,
prev_state = 1, next_comm = "testapp", next_tid = 3027, next_prio = 39
}
[07:20:06.282440185] (+0.569976807) pandaboard sched_switch: { cpu_id
= 1 }, { prev_comm = "testapp", prev_tid = 3027, prev_prio = 39,
prev_state = 0, next_comm = "ksoftirqd/1", next_tid = 14, next_prio =
20 }
[07:20:06.282440185] (+0.000000000) pandaboard sched_switch: { cpu_id
= 1 }, { prev_comm = "ksoftirqd/1", prev_tid = 14, prev_prio = 20,
prev_state = 1, next_comm = "testapp", next_tid = 3027, next_prio = 39
}
[07:20:06.682434082] (+0.399993897) pandaboard sched_switch: { cpu_id
= 1 }, { prev_comm = "testapp", prev_tid = 3027, prev_prio = 39,
prev_state = 0, next_comm = "kworker/1:1", next_tid = 25, next_prio =
20 }
[07:20:06.682464599] (+0.000030517) pandaboard sched_switch: { cpu_id
= 1 }, { prev_comm = "kworker/1:1", prev_tid = 25, prev_prio = 20,
prev_state = 1, next_comm = "testapp", next_tid = 3027, next_prio = 39
}

and so on...

As can be seen, the scheduling is changed when lttng runs. It seems
like lttng makes the non-migratable kthreads kworker/1:1 and
ksoftirq/1 busy which totally destroyes measurements in my application
since nohz mode is never entered.

Is there a way around this? What is causing this behavior?

/Mats



More information about the lttng-dev mailing list