[lttng-dev] introduction & usage of LTTNG in machinekit

Mon Apr 27 17:40:31 EDT 2015

> The problem at hand has these characteristics:
> 
> - it happens only on low-end ARM platforms
> - the code is part of the trajectory planner, which involves fp and
> transcendental functions
> - the code is executed from a periodic Xenomai thread and once in a while
> execution time exceeds the cycle time budget
> - no interrupts, system calls, or Xenomai API calls are involved, so kernel
> or ipipe tracing will not help to narrow it down

I suppose the cycle period is on the order of 100 us or more?

> my idea was to insert trace points in the tp code and one triggered on 'time
> window exceeded' so one could look backwards in history to determine which
> span of code used excessive time; does this sound reasonable?

In most cases, the surprises come from interrupts and contention for resources. If everything happens within the tp code, internal tracepoints will indeed help. If you want something fancier, there are a few more elaborate options. You could trace in "flight recorder" mode, without recording to disk, and trigger a snapshot of the circular buffer contents only when a cycle time budget overrun is detected. You could also arm a timer for the time window limit that, when it fires, triggers a stack dump of the interrupted task.
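
To make the flight recorder idea concrete, here is a minimal in-process sketch in plain C. It is not the LTTng API: fr_trace(), fr_snapshot(), fr_cycle_end(), fr_slowest_span() and FR_CAPACITY are hypothetical names invented for illustration, and timestamps are passed in by the caller rather than read from a clock. Events overwrite the oldest entry of a fixed ring, and the ring is copied out only when a cycle exceeds its budget, so one can look backwards and see which span took the time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FR_CAPACITY 8  /* hypothetical ring size; must be a power of two */

struct fr_event {
    uint64_t t_ns;  /* timestamp in nanoseconds, supplied by the caller */
    int id;         /* which trace point fired */
};

struct flight_recorder {
    struct fr_event ring[FR_CAPACITY];
    uint64_t head;  /* total number of events ever recorded */
};

static void fr_init(struct flight_recorder *fr)
{
    fr->head = 0;
}

/* Record one event, overwriting the oldest one when the ring is full. */
static void fr_trace(struct flight_recorder *fr, uint64_t t_ns, int id)
{
    struct fr_event *e = &fr->ring[fr->head & (FR_CAPACITY - 1)];

    e->t_ns = t_ns;
    e->id = id;
    fr->head++;
}

/* Copy the retained events, oldest first, into out; return the count. */
static size_t fr_snapshot(const struct flight_recorder *fr,
                          struct fr_event *out, size_t out_len)
{
    uint64_t n = fr->head < FR_CAPACITY ? fr->head : FR_CAPACITY;
    size_t copied = 0;

    for (uint64_t i = fr->head - n; i < fr->head && copied < out_len; i++)
        out[copied++] = fr->ring[i & (FR_CAPACITY - 1)];
    return copied;
}

/* Call at the end of each cycle: snapshot only on a budget overrun.
 * Returns the number of events copied, or 0 if the cycle was within budget. */
static size_t fr_cycle_end(const struct flight_recorder *fr,
                           uint64_t start_ns, uint64_t end_ns,
                           uint64_t budget_ns,
                           struct fr_event *out, size_t out_len)
{
    if (end_ns - start_ns <= budget_ns)
        return 0;
    return fr_snapshot(fr, out, out_len);
}

/* Return the id of the event that starts the longest gap in a snapshot. */
static int fr_slowest_span(const struct fr_event *ev, size_t n)
{
    size_t worst = 0;
    uint64_t worst_dt = 0;

    assert(n >= 2);
    for (size_t i = 0; i + 1 < n; i++) {
        uint64_t dt = ev[i + 1].t_ns - ev[i].t_ns;

        if (dt > worst_dt) {
            worst_dt = dt;
            worst = i;
        }
    }
    return ev[worst].id;
}
```

In the tp code one would call fr_trace() at each point of interest and fr_cycle_end() once per period. With LTTng itself, the tracer's snapshot mode plays the role of the ring and the snapshot copy, so only the overrun check needs to live in the application.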

> Alex (in cc) has a test case which reproducibly triggers the problem

Rafaël will start next week. That would indeed be an interesting problem given that it is reproducible.

> That's pretty impressive. I'll try to reproduce an example and think through
> how this could fit into what we're doing.
> 
> Background is - we're getting more into supporting fieldbus-connected
> components, some of which might run our code, so distributed tracing would
> become an option
...
> The other area where I could find use for such a capability is UI/server
> interaction (could it be there might be much wider interest for this than
> just me?)

Indeed, as long as you can record a trace on each node with local timestamps, and have event pairs with a causal relationship in each direction, it should work. You bring the traces together for analysis and synchronize them using these event pairs.
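
As a rough illustration of why event pairs in both directions are needed, here is a simplified sketch in C. It assumes a constant offset between the two node clocks (clock_B = clock_A + offset), no drift, and a non-negative transit delay; a real synchronization pass also has to handle drift. The names msg_pair, estimate_offset() and b_time_to_a() are made up for this example. Messages from A to B bound the offset from above, messages from B to A bound it from below, and the midpoint is taken as the estimate.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One matched send/receive event pair, each stamped with its local clock. */
struct msg_pair {
    int64_t send_ns;  /* timestamp on the sender's clock */
    int64_t recv_ns;  /* timestamp on the receiver's clock */
};

/*
 * Assumed model: clock_B = clock_A + offset, transit delay >= 0.
 *   A->B: recv_B - send_A = offset + delay  =>  offset <= recv_B - send_A
 *   B->A: recv_A - send_B = delay - offset  =>  offset >= send_B - recv_A
 * Returns 0 and stores the midpoint of the feasible interval, or -1 if
 * the pairs contradict the model (for example because of clock drift).
 */
static int estimate_offset(const struct msg_pair *a_to_b, size_t n_ab,
                           const struct msg_pair *b_to_a, size_t n_ba,
                           int64_t *offset_ns)
{
    int64_t upper, lower;

    assert(n_ab > 0 && n_ba > 0);

    /* Tightest upper bound: the A->B message with the smallest difference. */
    upper = a_to_b[0].recv_ns - a_to_b[0].send_ns;
    for (size_t i = 1; i < n_ab; i++) {
        int64_t d = a_to_b[i].recv_ns - a_to_b[i].send_ns;

        if (d < upper)
            upper = d;
    }

    /* Tightest lower bound: the B->A message with the largest difference. */
    lower = b_to_a[0].send_ns - b_to_a[0].recv_ns;
    for (size_t i = 1; i < n_ba; i++) {
        int64_t d = b_to_a[i].send_ns - b_to_a[i].recv_ns;

        if (d > lower)
            lower = d;
    }

    if (lower > upper)
        return -1;
    *offset_ns = lower + (upper - lower) / 2;
    return 0;
}

/* Map a timestamp from node B's clock onto node A's timeline. */
static int64_t b_time_to_a(int64_t t_b_ns, int64_t offset_ns)
{
    return t_b_ns - offset_ns;
}
```

With pairs in only one direction, one of the two bounds is missing and the offset cannot be pinned down, which is why a causal relationship in each direction is required.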