<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=us-ascii">
<meta name="Generator" content="Microsoft Word 15 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
{font-family:"Cambria Math";
panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
{font-family:Aptos;}
@font-face
{font-family:"Segoe UI";
panose-1:2 11 5 2 4 2 4 2 2 3;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0in;
font-size:11.0pt;
font-family:"Aptos",sans-serif;
mso-ligatures:standardcontextual;}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:#467886;
text-decoration:underline;}
span.EmailStyle17
{mso-style-type:personal-compose;
font-family:"Aptos",sans-serif;
color:windowtext;}
span.ui-provider
{mso-style-name:ui-provider;}
.MsoChpDefault
{mso-style-type:export-only;
font-size:11.0pt;}
@page WordSection1
{size:8.5in 11.0in;
margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
{page:WordSection1;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->
</head>
<body lang="EN-US" link="#467886" vlink="#96607D" style="word-wrap:break-word">
<div class="WordSection1">
<p class="MsoNormal">I am trying to emit a trace event that contains a wchar_t string, using ctf_sequence_text for one of the fields. When the trace is recorded, everything but the first character appears to be lost, and I think this is because the code assumes
UTF-8 encoding and stops at the first null byte. This is on Ubuntu 22.04, using this liblttng-ust package:<br>
<br>
liblttng-ust-common1/jammy,now 2.13.1-1ubuntu1<br>
<br>
I’ve boiled this down to a simple repro; I’ve included the code below, but you can also get it here:
<a href="https://github.com/naricc/lttng-test">https://github.com/naricc/lttng-test</a><br>
<br>
Here is the main file:<br>
-----<br>
<br>
#include <stdio.h><o:p></o:p></p>
<p class="MsoNormal">#include <unistd.h><o:p></o:p></p>
<p class="MsoNormal">#include <lttng/lttng.h><o:p></o:p></p>
<p class="MsoNormal">#include <lttng/tracepoint.h><o:p></o:p></p>
<p class="MsoNormal">#include <wchar.h><o:p></o:p></p>
<p class="MsoNormal">#include "repro-tracepoint.h"<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">int main() {<o:p></o:p></p>
<p class="MsoNormal"> puts("Hello, World!\nPress Enter to continue...");<o:p></o:p></p>
<p class="MsoNormal"> getchar();<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"> const char* utf8_text_value = "Hello, UTF8 Sequence Text!";<o:p></o:p></p>
<p class="MsoNormal"> const wchar_t *wchar_text_value = L"Hello, WChar Sequence Text!";<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"> // Emit the tracepoint event with the sequence text field<o:p></o:p></p>
<p class="MsoNormal"> lttng_ust_tracepoint(naricc_test_provider, test_event, utf8_text_value, wchar_text_value);<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"> return 0;<o:p></o:p></p>
<p class="MsoNormal">}<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">----<o:p></o:p></p>
<p class="MsoNormal">Here is the tracepoint header (repro-tracepoint.h):<br>
___<br>
<br>
#undef TRACEPOINT_PROVIDER<o:p></o:p></p>
<p class="MsoNormal">#define TRACEPOINT_PROVIDER naricc_test_provider<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">#undef TRACEPOINT_INCLUDE<o:p></o:p></p>
<p class="MsoNormal">#define TRACEPOINT_INCLUDE "./repro-tracepoint.h"<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">#if !defined(_TP_H) || defined(TRACEPOINT_HEADER_MULTI_READ)<o:p></o:p></p>
<p class="MsoNormal">#define _TP_H<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">#include <lttng/tracepoint.h><o:p></o:p></p>
<p class="MsoNormal">#include <wchar.h><o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">// Define the tracepoint event with a sequence text field<o:p></o:p></p>
<p class="MsoNormal">TRACEPOINT_EVENT(naricc_test_provider, test_event,<o:p></o:p></p>
<p class="MsoNormal"> TP_ARGS(<o:p></o:p></p>
<p class="MsoNormal"> const char*, utf8_text_value,<o:p></o:p></p>
<p class="MsoNormal"> const wchar_t*, wchar_text_value<o:p></o:p></p>
<p class="MsoNormal"> ),<o:p></o:p></p>
<p class="MsoNormal"> TP_FIELDS(<o:p></o:p></p>
<p class="MsoNormal"> ctf_sequence_text(char, utf8_text_sequence, utf8_text_value, size_t, strlen(utf8_text_value))<o:p></o:p></p>
<p class="MsoNormal"> ctf_sequence_text(wchar_t, wchar_text_sequence, wchar_text_value, size_t, wcslen(wchar_text_value) * 2 + 2)<o:p></o:p></p>
<p class="MsoNormal"> )<o:p></o:p></p>
<p class="MsoNormal">)<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">#endif /* _TP_H */<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">#include <lttng/tracepoint-event.h><br>
<br>
----<br>
And here is the repro-tracepoint.cpp:<br>
___<br>
<br>
#define LTTNG_UST_TRACEPOINT_CREATE_PROBES<o:p></o:p></p>
<p class="MsoNormal">#define LTTNG_UST_TRACEPOINT_DEFINE<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">#include "repro-tracepoint.h"<br>
<br>
---<br>
<br>
I built it like so:<br>
<br>
<br>
g++ -c -I. repro-tracepoint.cpp<o:p></o:p></p>
<p class="MsoNormal">g++ -c lttng-test.cpp<o:p></o:p></p>
<p class="MsoNormal">g++ -o lttng-test lttng-test.o repro-tracepoint.o -llttng-ust -ldl<o:p></o:p></p>
<p style="margin:0in">
<br>
---<br>
<br>
After starting a session, running that program, and destroying the session, this is what I get with babeltrace2:<br>
<br>
```<br>
<span class="ui-provider">$ babeltrace2 ~/lttng-traces/my-user-space-session-20240131-161638</span><br>
<span class="ui-provider">[16:16:43.553211421] (+?.?????????) TDC20748914 naricc_test_provider:test_event: { cpu_id = 6 }, { _utf8_text_sequence_length = 26, utf8_text_sequence = "Hello, UTF8 Sequence Text!", _wchar_text_sequence_length = 56, wchar_text_sequence
= [ [0] = 72, [1] = 0, [2] = 0, [3] = 0, [4] = 0, [5] = 0, [6] = 0, [7] = 0, [8] = 0, [9] = 0, [10] = 0, [11] = 0, [12] = 0, [13] = 0, [14] = 0, [15] = 0, [16] = 0, [17] = 0, [18] = 0, [19] = 0, [20] = 0, [21] = 0, [22] = 0, [23] = 0, [24] = 0, [25] = 0, [26]
= 0, [27] = 0, [28] = 0, [29] = 0, [30] = 0, [31] = 0, [32] = 0, [33] = 0, [34] = 0, [35] = 0, [36] = 0, [37] = 0, [38] = 0, [39] = 0, [40] = 0, [41] = 0, [42] = 0, [43] = 0, [44] = 0, [45] = 0, [46] = 0, [47] = 0, [48] = 0, [49] = 0, [50] = 0, [51] = 0, [52]
= 0, [53] = 0, [54] = 0, [55] = 0 ] }</span><br>
<br>
```<br>
<br>
The UTF-8 sequence prints fine, but the wchar_t one is truncated to a single character followed by zeros. To rule out an error in babeltrace2, I inspected the channel files with hexedit and found this:<br>
<br>
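</p>
<p class="MsoNormal">A side note on the length field: the recorded _wchar_text_sequence_length of 56 matches the length expression in the header (27 characters * 2 + 2). Below is a minimal sketch of that arithmetic. The helper names are hypothetical, and whether ctf_sequence_text expects an element count or a byte count here is something I have not verified. Since wchar_t is 4 bytes wide with glibc on Linux, the expression does not equal the string's size in bytes either.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper: the length expression from repro-tracepoint.h. */
size_t length_expr_from_header(const wchar_t *text)
{
    assert(text != NULL);
    return wcslen(text) * 2 + 2;
}

/* Hypothetical helper: size in bytes of the string plus its terminator,
 * using the platform's real wchar_t width (4 bytes with glibc on Linux). */
size_t byte_size_with_terminator(const wchar_t *text)
{
    assert(text != NULL);
    return (wcslen(text) + 1) * sizeof(wchar_t);
}
```

<p class="MsoNormal">For the string in the repro, the first helper gives 56, while the second gives 112 when wchar_t is 4 bytes wide.</p>
<p style="margin:0in">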
```<span style="font-size:10.5pt;font-family:'Segoe UI',sans-serif"><br>
85 58 E1 C1 43 EE 9A 93 9C 60 6D B7 2B B0 00 00 00 00 .......X..C....`m.+.....<br>
00000018 06 00 00 00 00 00 00 00 6A 33 DA B4 2D AD 02 00 04 78 1C 56 30 AD 02 00 ........j3..-....x.V0...<br>
00000030 60 0B 00 00 00 00 00 00 00 80 00 00 00 00 00 00 00 00 00 00 00 00 00 00 `.......................<br>
00000048 00 00 00 00 00 00 00 00 06 00 00 00 FF FF 00 00 00 00 3C 10 F2 E2 2E AD ..................<.....<br>
00000060 02 00 1A 00 00 00 00 00 00 00 48 65 6C 6C 6F 2C 20 55 54 46 38 20 53 65 ..........Hello, UTF8 Se<br>
00000078 71 75 65 6E 63 65 20 54 65 78 74 21 38 00 00 00 00 00 00 00 48 00 00 00 quence Text!8.......H...<br>
00000090 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ........................<o:p></o:p></span></p>
<p class="MsoNormal">```<br>
<br>
So it seems the error is in the recording of the trace, not in the viewing.<br>
<br>
Looking into the lttng-ust code, it seems like ctf_sequence_text ends up mapped to this:<br>
<br>
<a href="https://github.com/lttng/lttng-ust/blob/717c38f658248bc04ccfc6e7fdf5d03040c2a846/include/lttng/ust-tracepoint-event-write.h#L73">lttng-ust/include/lttng/ust-tracepoint-event-write.h at 717c38f658248bc04ccfc6e7fdf5d03040c2a846 · lttng/lttng-ust · GitHub</a><br>
<br>
That macro assumes UTF-8 encoding, and the data is ultimately written into the ring buffer by a routine that stops at the first null byte:<br>
<br>
<a href="https://github.com/lttng/lttng-ust/blob/717c38f658248bc04ccfc6e7fdf5d03040c2a846/src/common/ringbuffer/backend.h#L126"><span style="font-size:10.5pt;font-family:'Segoe UI',sans-serif">lttng-ust/src/common/ringbuffer/backend.h at 717c38f658248bc04ccfc6e7fdf5d03040c2a846 · lttng/lttng-ust · GitHub</span></a><br>
<br>
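</p>
<p class="MsoNormal">To illustrate the behavior I suspect, here is a minimal sketch of a copy that stops at the first null byte and zero-pads the remainder. The function names are hypothetical and this is not the lttng-ust implementation; it only mimics the pattern, assuming a little-endian machine.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical stand-in for a "copy up to len bytes, stop at the first
 * NUL byte, zero-pad the rest" routine. NOT the actual lttng-ust code. */
size_t copy_until_nul_and_pad(unsigned char *dst, const void *src, size_t len)
{
    const unsigned char *bytes = (const unsigned char *)src;
    size_t copied = 0;

    assert(dst != NULL && src != NULL);
    while (copied < len && bytes[copied] != 0) {
        dst[copied] = bytes[copied];
        copied++;
    }
    /* Everything after the first NUL byte is replaced by zeros. */
    memset(dst + copied, 0, len - copied);
    return copied;
}

/* Hypothetical helper: how many bytes of src survive such a copy. */
size_t surviving_bytes(const void *src, size_t len)
{
    unsigned char out[256];

    assert(len <= sizeof(out));
    return copy_until_nul_and_pad(out, src, len);
}
```

<p class="MsoNormal">With the narrow string from the repro and a length of 26, all 26 bytes survive. With the wide string and a length of 56, only the byte 0x48 ('H') survives on a little-endian machine, which matches the hex dump above.</p>
<p class="MsoNormal">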
<br>
If we agree this is an error, I believe I can produce a fix for it. Or if I am just using the APIs wrong, please let me know what I should do instead.<br>
<br>
--Nathan Ricci<br>
<br>
<br>
<br>
<br>
<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
</div>
</body>
</html>