[lttng-dev] Best way to analyze CTF files
Philippe Proulx
eeppeliteloop at gmail.com
Fri Oct 17 22:15:11 EDT 2014
On Fri, Oct 17, 2014 at 8:13 PM, Christian Babeux
<christian.babeux at efficios.com> wrote:
> Hi Sebastien,
>
> A simple Python script using the Babeltrace bindings would probably do
> the job here. See [1] for an example using the bindings.
Yes, here's a start Sébastien <http://pastebin.com/QXfev1ve>:
import babeltrace
import json
import sys
def get_msg_intervals(trace_path):
col = babeltrace.TraceCollection()
if col.add_trace(trace_path, 'ctf') is None:
raise RuntimeError('Cannot add trace')
msg_intervals = {}
for ev in col.events:
if ev.name not in ['message:actor_send', 'message:actor_receive']:
continue
msg_source_actor = ev['message_source_actor']
msg_number = ev['message_number']
if msg_source_actor not in msg_intervals:
msg_intervals[msg_source_actor] = {}
if msg_number not in msg_intervals[msg_source_actor]:
msg_intervals[msg_source_actor][msg_number] = {}
interval = msg_intervals[msg_source_actor][msg_number]
if ev.name == 'message:actor_send':
interval['send_ts'] = ev.timestamp
else:
interval['recv_ts'] = ev.timestamp
return msg_intervals
if __name__ == '__main__':
msg_intervals = get_msg_intervals(sys.argv[1])
print(json.dumps(msg_intervals, indent=2))
This will output a JSON dict mapping actors to message numbers, and then
each message number to a dict containing both the send and receive timestamps:
python3 intervals.py /path/to/trace > intervals.json
Once you have that JSON database, you may either read it manually or query it
easily using another Python script <http://pastebin.com/fLwyvEz8>:
import json
import sys
def print_msg_diff(filename, actor_id, msg_number):
def get_diff(msg):
send_ts = msg['send_ts']
recv_ts = msg['recv_ts']
if send_ts == -1 or recv_ts == -1:
return None
return recv_ts - send_ts
with open(filename) as f:
msg_intervals = json.load(f)
actor = msg_intervals[actor_id]
if msg_number is None:
for msg_number in sorted(actor.keys()):
msg = actor[msg_number]
print('{}: {}'.format(msg_number, get_diff(msg)))
else:
print(get_diff(actor[msg_number]))
if __name__ == '__main__':
filename = sys.argv[1]
actor_id = sys.argv[2]
msg_number = None
if len(sys.argv) > 3:
msg_number = sys.argv[3]
print_msg_diff(filename, actor_id, msg_number)
Use it like this:
python3 query.py <JSON file path> <actor ID> [<message number>]
For example:
python3 query.py intervals.json 1000019
python3 query.py intervals.json 1000059
python3 query.py intervals.json 1000019 14
Modify those scripts at will to fit your exact use case. Of course the versions
above could crash in many ways because they lack a few run time checks, etc.
Hope it helps,
Phil
>
> Thanks,
>
> Christian
>
> [1] - https://lttng.org/docs/#doc-viewing-and-analyzing-your-traces
>
> On Fri, Oct 17, 2014 at 6:23 PM, Boisvert, Sebastien <boisvert at anl.gov> wrote:
>> Bonjour,
>>
>> First, thank you for LTTng-UST. This is very useful and convenient.
>>
>> I just got started today using LTTng (LTTng-UST) for tracing a HPC application
>> that I am working on (I am a postdoc). I am impressed by how easy LTTng is to use it.
>>
>> In my system, an actor message is represented by a pair
>> <message_actor_source, message_number>.
>>
>> I want to list all messages that have a high delivery time (message:actor_receive - message:actor_send).
>>
>> I am doing this to get the messages of one actor (actor 1000019):
>>
>> [boisvert at bigmem biosal]$ babeltrace ~/lttng-traces/auto-20141017-181240|grep "message_source_actor = 1000019" > actor_1000019
>>
>> Then, I can look at one message with (message <1000019, 14>):
>>
>> [boisvert at bigmem biosal]$ grep "message_number = 14," actor_1000019
>> [18:12:43.647017211] (+0.000005110) bigmem.knoxville.kbase.us message:actor_send: { cpu_id = 30 }, { message_number = 14, message_action = 31592, message_count = 8, message_source_actor = 1000019, message_destination_actor = 1000059, message_source_node = -1, message_destination_node = -1 }
>> [18:12:43.647025249] (+0.000002860) bigmem.knoxville.kbase.us message:actor_receive: { cpu_id = 49 }, { message_number = 14, message_action = 31592, message_count = 8, message_source_actor = 1000019, message_destination_actor = 1000059, message_source_node = 3, message_destination_node = 3 }
>>
>> If I substract the times:
>>
>> irb(main):003:0> (43.647025249-43.647017211)*10**9
>> => 8038.00000426236
>>
>> This message (<1000019, 14>) required 8038 ns for the delivery. This one is fine.
>>
>>
>> So basically my question is:
>>
>> Is there an easy way to analyze these tracepoint files ?
>> _______________________________________________
>> lttng-dev mailing list
>> lttng-dev at lists.lttng.org
>> http://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev
>
> _______________________________________________
> lttng-dev mailing list
> lttng-dev at lists.lttng.org
> http://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev
More information about the lttng-dev
mailing list