[lttng-dev] ctf_sequence for raw data (byte stream), issues reading back with babeltrace
David Abdurachmanov
david.abdurachmanov at gmail.com
Fri Apr 28 11:47:28 UTC 2017
Hi,
I have decided to add a raw byte stream to a tracepoint; the stream contains a
zlib-compressed buffer.
I have defined the following tracepoint:
TRACEPOINT_EVENT(
    root,
    root_zlib_decompress,
    TP_ARGS(
        unsigned long, input_buffer_size_arg,
        unsigned long, output_buffer_size_arg,
        unsigned long, input_adler32_arg,
        unsigned long, output_adler32_arg,
        unsigned char *, input_data_arg
    ),
    TP_FIELDS(
        ctf_integer_hex(unsigned long, input_adler32, input_adler32_arg)
        ctf_integer_hex(unsigned long, output_adler32, output_adler32_arg)
        ctf_integer(unsigned int, input_buffer_size, input_buffer_size_arg)
        ctf_integer(unsigned long, output_buffer_size, output_buffer_size_arg)
        ctf_sequence(unsigned char, input_data, input_data_arg, unsigned long, input_buffer_size_arg)
    )
)
The majority of such dynamic raw byte buffers will be small (e.g. 11 bytes), but
some could get into the megabytes range.
I managed to collect the data, around 1.4G in size.
I found that I couldn't view such data via babeltrace or its Python bindings:
babeltrace (or the Python app) would eat 64G of RAM plus 32G of swap and finally
get killed.
From the debug output:
[..]
event {
name = "root:root_zlib_decompress";
id = 3;
stream_id = 0;
loglevel = 13;
fields := struct {
integer { size = 64; align = 8; signed = 0; encoding = none; base = 16; } _input_adler32;
integer { size = 64; align = 8; signed = 0; encoding = none; base = 16; } _output_adler32;
integer { size = 32; align = 8; signed = 0; encoding = none; base = 10; } _input_buffer_size;
integer { size = 64; align = 8; signed = 0; encoding = none; base = 10; } _output_buffer_size;
integer { size = 64; align = 8; signed = 0; encoding = none; base = 10; } __input_data_length;
integer { size = 8; align = 8; signed = 0; encoding = none; base = 10; } _input_data[ __input_data_length ];
};
};
[..]
[debug] new definition path: event.fields._input_data.[3517315]
[debug] new definition path: event.fields._input_data.[3517316]
[debug] new definition path: event.fields._input_data.[3517317]
[debug] new definition path: event.fields._input_data.[3517318]
[debug] new definition path: event.fields._input_data.[3517319]
[debug] new definition path: event.fields._input_data.[3517320]
[debug] new definition path: event.fields._inp
I also tried ctf_sequence_hex (which is what I would prefer) and
ctf_sequence_text.
ctf_sequence_hex has the same problem as ctf_sequence.
ctf_sequence_text works and I can view the data using babeltrace, but the
resulting strings are a mess, because the payload is not actual string data.
Using the Python bindings I couldn't get to the data:
Traceback (most recent call last):
File "test2.py", line 21, in <module>
event['input_data'].encode()
File "/opt/rh/python33/root/usr/lib64/python3.3/site-packages/babeltrace.py", line 862, in __getitem__
return field.value
File "/opt/rh/python33/root/usr/lib64/python3.3/site-packages/babeltrace.py", line 1334, in value
value = _bt_python_get_sequence_string(self._d)
UnicodeDecodeError: 'utf-8' codec can't decode bytes in position 2-3: invalid continuation byte
Again, it's not a UTF-8 encoded string.
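To illustrate why (a minimal sketch independent of babeltrace): a zlib stream begins with a header such as 0x78 0x9c, and 0x9c cannot start a UTF-8 byte sequence, so any attempt to decode the buffer as text fails:

```python
import zlib

# Compress a small payload, as the traced application does.
payload = zlib.compress(b"example payload")

# The default zlib header is 0x78 0x9c; 0x9c is not a valid UTF-8
# start byte, so decoding the raw buffer as text raises immediately.
try:
    payload.decode("utf-8")
    print("decoded")
except UnicodeDecodeError:
    print("UnicodeDecodeError")  # prints UnicodeDecodeError
```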
How should I store char * data (a raw byte stream) with a dynamic size in an
event, and then read it back later?
I want to record some real-life data and use babeltrace Python bindings to
generate test cases.
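For reference, the kind of verification I want to do once the raw bytes are accessible is roughly this (plain-Python sketch; the `input_data` and `input_adler32` values below are fabricated stand-ins for what the event fields would hold, not output from a real trace):

```python
import zlib

# Stand-ins for event field values (fabricated for illustration):
# the bindings would hand back the sequence as a list of integers.
original = b"hello world"
input_data = list(zlib.compress(original))
input_adler32 = zlib.adler32(original) & 0xFFFFFFFF

# Reassemble the byte buffer, decompress it, and cross-check the
# result against the recorded adler32 checksum.
raw = bytes(input_data)
decompressed = zlib.decompress(raw)
assert (zlib.adler32(decompressed) & 0xFFFFFFFF) == input_adler32
print(decompressed)  # prints b'hello world'
```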
I am using:
CentOS Linux release 7.3.1611 (Core)
kernel 3.10.0-514.10.2.el7.x86_64
babeltrace-1.5.2-1.el7.x86_64
python33-babeltrace-1.5.2-1.el7.x86_64
lttng-ust-devel-2.9.0-1.el7.x86_64
lttng-tools-devel-2.9.4-1.el7.x86_64
lttng-ust-java-agent-2.9.0-1.el7.x86_64
lttng-tools-2.9.4-1.el7.x86_64
lttng-ust-java-2.9.0-1.el7.x86_64
lttng-ust-2.9.0-1.el7.x86_64
Thanks,
david