[lttng-dev] Best way to analyze CTF files

Fri Oct 17 19:43:01 EDT 2014

Hi Sébastien,

You could always write custom scripts using grep and sed, but those tend 
to be a bit inflexible. ;)

Have you heard about Trace Compass [1] (previously known as TMF, or "The 
Eclipse LTTng plugin")? It's a generic trace viewer and analyzer based 
on Eclipse, and it supports LTTng traces. It's not officially "out" yet, 
because the project itself is still being set up. We don't have any 
download links on the new website yet, but in the meantime there are 
relatively recent builds at [2].

There are some default basic views for UST traces, like event list, 
statistics, memory usage if you have the right events enabled, etc. But 
in your case it seems you need something more specific. Depending on 
how far you want to dig into it, you could write an "analysis module" 
that would receive each event in the trace, check if it is of type 
"actor_send" or "actor_receive", and calculate the delay between each 
matching send/receive pair. Then aggregate the results and possibly 
print the worst offenders in a view (or even to the console if that's 
sufficient).
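
If you'd rather prototype the idea before writing a Trace Compass 
module, the same pairing logic can be sketched in a few lines of Python 
over babeltrace's text output. This is only a sketch under some 
assumptions: the lines look like the two sample events quoted below, the 
pair <message_source_actor, message_number> identifies a message 
uniquely, and the trace does not cross midnight (the text output only 
shows the time of day). The function names are made up for the example. 
Timestamps are kept as integer nanoseconds, which avoids the 
floating-point noise you get when subtracting them as decimals.

```python
import re

# Header of a babeltrace text line, e.g.
# [18:12:43.647017211] (+0.000005110) host message:actor_send: { ... }
LINE_RE = re.compile(
    r"^\[(\d+):(\d+):(\d+)\.(\d{9})\]\s+\([^)]*\)\s+\S+\s+"
    r"message:(actor_send|actor_receive):"
)
# Integer payload fields such as "message_number = 14"
FIELD_RE = re.compile(r"(\w+) = (-?\d+)")


def timestamp_ns(hours, minutes, seconds, nanos):
    """Time of day as an integer number of nanoseconds."""
    total_seconds = (int(hours) * 60 + int(minutes)) * 60 + int(seconds)
    return total_seconds * 10**9 + int(nanos)


def delivery_delays(lines):
    """Return (delay_ns, (source_actor, message_number)), slowest first."""
    pending_sends = {}  # message key -> send timestamp in ns
    delays = []
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue  # not a send/receive event, skip it
        hours, minutes, seconds, nanos, kind = match.groups()
        when = timestamp_ns(hours, minutes, seconds, nanos)
        fields = dict(FIELD_RE.findall(line))
        key = (int(fields["message_source_actor"]),
               int(fields["message_number"]))
        if kind == "actor_send":
            pending_sends[key] = when
        elif key in pending_sends:
            delays.append((when - pending_sends.pop(key), key))
    return sorted(delays, reverse=True)


# The two sample events from the original message.
sample = [
    "[18:12:43.647017211] (+0.000005110) bigmem.knoxville.kbase.us "
    "message:actor_send: { cpu_id = 30 }, { message_number = 14, "
    "message_action = 31592, message_count = 8, "
    "message_source_actor = 1000019, message_destination_actor = 1000059, "
    "message_source_node = -1, message_destination_node = -1 }",
    "[18:12:43.647025249] (+0.000002860) bigmem.knoxville.kbase.us "
    "message:actor_receive: { cpu_id = 49 }, { message_number = 14, "
    "message_action = 31592, message_count = 8, "
    "message_source_actor = 1000019, message_destination_actor = 1000059, "
    "message_source_node = 3, message_destination_node = 3 }",
]

for delay, (actor, number) in delivery_delays(sample)[:10]:
    print(f"<{actor}, {number}>: {delay} ns")  # <1000019, 14>: 8038 ns
```

To run it on a whole trace, pass sys.stdin to delivery_delays() instead 
of the sample list and pipe the babeltrace output into the script. 
Receive events with no matching send are silently ignored here; you may 
want to count them instead.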

See [3] for how to set up the dev environment, and [4] for the analysis 
framework documentation.
Let us know if you need more information!

Cheers,
Alexandre

[1] http://eclipse.org/tracecompass
[2] http://secretaire.dorsal.polymtl.ca/~gbastien/TracingRCP/TraceCompass/
[3] https://wiki.eclipse.org/Trace_Compass/Development_Environment_Setup
[4] http://wiki.eclipse.org/Linux_Tools_Project/TMF/User_Guide#Analysis_Framework

On 10/17/2014 06:23 PM, Boisvert, Sebastien wrote:
> Bonjour,
>
> First, thank you for LTTng-UST. This is very useful and convenient.
>
> I just got started today using LTTng (LTTng-UST) for tracing an HPC application
> that I am working on (I am a postdoc). I am impressed by how easy LTTng is to use.
>
> In my system, an actor message is represented by a pair
> <message_actor_source, message_number>.
>
> I want to list all messages that have a high delivery time (message:actor_receive - message:actor_send).
>
> I am doing this to get the messages of one actor (actor 1000019):
>
> [boisvert at bigmem biosal]$ babeltrace ~/lttng-traces/auto-20141017-181240|grep "message_source_actor = 1000019"  > actor_1000019
>
> Then, I can look at one message with (message <1000019, 14>):
>
> [boisvert at bigmem biosal]$ grep "message_number = 14," actor_1000019
> [18:12:43.647017211] (+0.000005110) bigmem.knoxville.kbase.us message:actor_send: { cpu_id = 30 }, { message_number = 14, message_action = 31592, message_count = 8, message_source_actor = 1000019, message_destination_actor = 1000059, message_source_node = -1, message_destination_node = -1 }
> [18:12:43.647025249] (+0.000002860) bigmem.knoxville.kbase.us message:actor_receive: { cpu_id = 49 }, { message_number = 14, message_action = 31592, message_count = 8, message_source_actor = 1000019, message_destination_actor = 1000059, message_source_node = 3, message_destination_node = 3 }
>
> If I subtract the times:
>
> irb(main):003:0> (43.647025249-43.647017211)*10**9
> => 8038.00000426236
>
> This message (<1000019, 14>) required 8038 ns for the delivery. This one is fine.
>
>
> So basically my question is:
>
> Is there an easy way to analyze these tracepoint files?