[lttng-dev] Strange sequence of events

Fri Jan 23 14:07:43 EST 2015

----- Original Message -----

> From: "Mattias Johansson" <matjo75 at gmail.com>
> To: lttng-dev at lists.lttng.org
> Sent: Friday, January 23, 2015 4:29:57 AM
> Subject: [lttng-dev] Strange sequence of events

> Hi lttng-people!

> I have been using lttng to make a combined kernel and userspace trace and I
> can't make sense of a simple sequence of events. I am using a two core cpu,
> a custom distro (by yocto), linux kernel version is 3.12.15 with the PREEMPT
> RT patches, the lttng probe module lttng-probe-sched loaded.
> This is my strange sequence:

> [13:26:17.542825060] (+0.000003659) unit1 sched_switch: { cpu_id = 1 }, { pid
> = 2390 }, { prev_comm = "ThreadA", prev_tid = 2547, prev_prio = -19,
> prev_state = 1024, next_comm = "KernelThreadB", next_tid = 2410, next_prio =
> -51 }
> [13:26:17.542831008] (+0.000005948) unit1 sched_migrate_task: { cpu_id = 0 },
> { pid = 2408 }, { comm = "ThreadA", tid = 2547, prio = -19, orig_cpu = 1,
> dest_cpu = 0 }
> [13:26:17.542837060] (+0.000003256) unit1 ust_event:my_ust_event: { cpu_id =
> 1 }, { vtid = 2547 }, { }

> ThreadA makes the userspace trace event 'my_ust_event'.

> What is strange to me is either the timestamp of 'my_ust_event' is "wrong"
> (it should have been a time before the sched_switch), or the cpu_id of the
> ust_event is "wrong" (it should be 0 since ThreadA has been migrated to
> cpu_id 0 at 13:26:17.542831008).

> Or am I interpreting the sequence wrong?

> Or could it be that the my_ust_event itself got interrupted and had already
> sampled the cpu_id, and after the migration completed the event, thus the
> trace is "correct"?

This last guess is the most likely. UST ring buffer reserve code looks like: 

lttng_event_reserve() 
- read current CPU number (A) 
- lib_ring_buffer_try_reserve() 
do { 
1) lib_ring_buffer_clock_read() (B) 
2) cmpxchg() to reserve space in ring buffer (C) 
} while cmpxchg fails 

So if we are preempted and migrated between (A) and (B), we could 
see the effect you are experiencing. The issue here is that userspace 
does not have a way to atomically read cpu number, read time, and reserve 
space in the ring buffer with respect to the scheduler (it cannot disable preemption). 
We would need kernel-level support for this. I have a few ideas on how this 
could be done, but it's only at the idea stage at the moment, and would require 
extending the kernel. It's far from simple, because we cannot disable preemption 
nor migration from user-space (it's on purpose, and a strict requirement from 
the Linux scheduler developers). 

Thoughts ? 

Thanks, 

Mathieu 

> Thanks!
> /Mattias

> _______________________________________________
> lttng-dev mailing list
> lttng-dev at lists.lttng.org
> http://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev

-- 
Mathieu Desnoyers 
EfficiOS Inc. 
http://www.efficios.com 
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.lttng.org/pipermail/lttng-dev/attachments/20150123/edfe1968/attachment.html>