[lttng-dev] ctf_sequence for raw data (byte stream), issues reading back with babeltrace
David Abdurachmanov
david.abdurachmanov at gmail.com
Fri Apr 28 11:47:28 UTC 2017
Hi,
I have decided to add a raw byte stream to a tracepoint; the stream contains a
zlib-compressed buffer.
I have defined the following tracepoint:
TRACEPOINT_EVENT(
    root,
    root_zlib_decompress,
    TP_ARGS(
        unsigned long, input_buffer_size_arg,
        unsigned long, output_buffer_size_arg,
        unsigned long, input_adler32_arg,
        unsigned long, output_adler32_arg,
        unsigned char *, input_data_arg
    ),
    TP_FIELDS(
        ctf_integer_hex(unsigned long, input_adler32, input_adler32_arg)
        ctf_integer_hex(unsigned long, output_adler32, output_adler32_arg)
        ctf_integer(unsigned int, input_buffer_size, input_buffer_size_arg)
        ctf_integer(unsigned long, output_buffer_size, output_buffer_size_arg)
        ctf_sequence(unsigned char, input_data, input_data_arg, unsigned long, input_buffer_size_arg)
    )
)
The majority of such dynamic raw byte buffers will be small (e.g. 11 bytes), but
some could get into the megabytes range.
I managed to collect the data, around 1.4G in size.
I found that I couldn't view such data via babeltrace or its Python bindings:
babeltrace (or the Python app) would eat 64G of RAM plus 32G of swap and finally
get killed.
From the debug output:
[..]
event {
name = "root:root_zlib_decompress";
id = 3;
stream_id = 0;
loglevel = 13;
fields := struct {
integer { size = 64; align = 8; signed = 0; encoding = none; base = 16; } _input_adler32;
integer { size = 64; align = 8; signed = 0; encoding = none; base = 16; } _output_adler32;
integer { size = 32; align = 8; signed = 0; encoding = none; base = 10; } _input_buffer_size;
integer { size = 64; align = 8; signed = 0; encoding = none; base = 10; } _output_buffer_size;
integer { size = 64; align = 8; signed = 0; encoding = none; base = 10; } __input_data_length;
integer { size = 8; align = 8; signed = 0; encoding = none; base = 10; } _input_data[ __input_data_length ];
};
};
[..]
[debug] new definition path: event.fields._input_data.[3517315]
[debug] new definition path: event.fields._input_data.[3517316]
[debug] new definition path: event.fields._input_data.[3517317]
[debug] new definition path: event.fields._input_data.[3517318]
[debug] new definition path: event.fields._input_data.[3517319]
[debug] new definition path: event.fields._input_data.[3517320]
[debug] new definition path: event.fields._inp
I also tried ctf_sequence_hex (which is what I would prefer) and
ctf_sequence_text.
ctf_sequence_hex has the same problem as ctf_sequence.
ctf_sequence_text works and I can view the data using babeltrace, but the
resulting strings are a mess, because the payload is not actual string data.
Using the Python bindings I couldn't get to the data:
Traceback (most recent call last):
File "test2.py", line 21, in <module>
event['input_data'].encode()
File "/opt/rh/python33/root/usr/lib64/python3.3/site-packages/babeltrace.py", line 862, in __getitem__
return field.value
File "/opt/rh/python33/root/usr/lib64/python3.3/site-packages/babeltrace.py", line 1334, in value
value = _bt_python_get_sequence_string(self._d)
UnicodeDecodeError: 'utf-8' codec can't decode bytes in position 2-3: invalid continuation byte
Again, it's not a UTF-8 encoded string.
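To illustrate why (a minimal sketch independent of babeltrace): a zlib stream begins with a header such as 0x78 0x9c, and 0x9c cannot start a UTF-8 byte sequence, so any attempt to decode the buffer as text fails:

```python
import zlib

# Compress a small payload, as the traced application does.
payload = zlib.compress(b"example payload")

# The default zlib header is 0x78 0x9c; 0x9c is not a valid UTF-8
# start byte, so decoding the raw buffer as text raises immediately.
try:
    payload.decode("utf-8")
    print("decoded")
except UnicodeDecodeError:
    print("UnicodeDecodeError")  # prints UnicodeDecodeError
```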
How should I store char * data (a raw byte stream) with a dynamic size in an
event, and then read it back later?
I want to record some real-life data and use babeltrace Python bindings to
generate test cases.
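For reference, the kind of verification I want to do once the raw bytes are accessible is roughly this (plain-Python sketch; the `input_data` and `input_adler32` values below are fabricated stand-ins for what the event fields would hold, not output from a real trace):

```python
import zlib

# Stand-ins for event field values (fabricated for illustration):
# the bindings would hand back the sequence as a list of integers.
original = b"hello world"
input_data = list(zlib.compress(original))
input_adler32 = zlib.adler32(original) & 0xFFFFFFFF

# Reassemble the byte buffer, decompress it, and cross-check the
# result against the recorded adler32 checksum.
raw = bytes(input_data)
decompressed = zlib.decompress(raw)
assert (zlib.adler32(decompressed) & 0xFFFFFFFF) == input_adler32
print(decompressed)  # prints b'hello world'
```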
I am using:
CentOS Linux release 7.3.1611 (Core)
kernel 3.10.0-514.10.2.el7.x86_64
babeltrace-1.5.2-1.el7.x86_64
python33-babeltrace-1.5.2-1.el7.x86_64
lttng-ust-devel-2.9.0-1.el7.x86_64
lttng-tools-devel-2.9.4-1.el7.x86_64
lttng-ust-java-agent-2.9.0-1.el7.x86_64
lttng-tools-2.9.4-1.el7.x86_64
lttng-ust-java-2.9.0-1.el7.x86_64
lttng-ust-2.9.0-1.el7.x86_64
Thanks,
david