[lttng-dev] [CTF2-SPEC-2.0] Announcing CTF 2, the next generation of the Common Trace Format

Philippe Proulx pproulx at efficios.com
Tue Mar 26 15:00:37 EDT 2024


To all tracing enthusiasts,

The Diagnostic and Monitoring Workgroup (DiaMon) is thrilled to announce
the launch of version 2 of the Common Trace Format (CTF), a binary trace
format designed to be fast to write without compromising great
flexibility.

The first version of CTF [1] has been widely used and tested in the
industry for almost fifteen years now, by different producers and
consumers.

CTF 2 is a major revision of CTF 1, bringing many improvements, such as:

‣ Using JSON text sequences [2] for the metadata stream.

‣ Simplifying the metadata stream.

‣ Adding new field classes (bit array, bit map, boolean, LEB128, BLOB,
  and optional) and improving existing ones.

‣ Supporting UTF-16 and UTF-32 string fields.

‣ Using roles instead of reserved structure member names to identify
  meaningful fields.

‣ Adding the attribute and extension features to extend and customize
  the format.

SPECIFICATION LINKS
━━━━━━━━━━━━━━━━━━━
The initial revision of CTF2-SPEC-2.0 is available here:

    <https://diamon.org/ctf/CTF2-SPEC-2.0.html>

Its AsciiDoc source is <https://diamon.org/ctf/CTF2-SPEC-2.0.adoc>.

The latest revision of CTF 2 is always <https://diamon.org/ctf/>.

The latest revision of CTF 1 is <https://diamon.org/ctf/v1.8.3/>.

FROM LEGACY TO FUTURE
━━━━━━━━━━━━━━━━━━━━━
The initial version of CTF [1] has been in widespread use and undergone
rigorous testing across various sectors by a range of users for close to
fifteen years.

Today, CTF 2 comes to existence to address significant limitations of
CTF 1 that make it challenging to implement a consumer and nearly
impossible to extend.

Developed over the course of eight years, CTF 2 is the culmination of
two initial proposals and nine release candidates, each one adding what
we now consider precious features that have incrementally shaped it into
the robust and versatile tracing format it is today.

We're confident that CTF 2 meets our important original design goals,
namely:

‣ CTF 2 data streams must be backward compatible with CTF 1 [1] data
  streams.

  CTF 2 only specifies metadata stream changes.

‣ The CTF 2 data streams must be highly efficient for a tracer to
  produce.

  The addition of features such as the configurable bit order of a
  fixed-length bit array field, variable-length integer fields, UTF-16
  and UTF-32 string field encodings, and BLOB fields makes this truer
  than ever.

‣ A CTF 2 metadata stream must be extensible by users of the
  specification.

  The namespaced attribute and extension mechanisms of CTF 2 metadata
  streams enable limitless extensibility.

‣ A CTF 2 trace should be easy to consume.

  A CTF 2 metadata stream is a JSON text sequence [2], removing all the
  complexity of parsing the custom DSL brought by CTF 1.

FEATURE HIGHLIGHTS
━━━━━━━━━━━━━━━━━━
CTF 2 brings many improvements over CTF 1, the most notable ones being:

JSON text sequence metadata stream:
    Whereas a CTF 1 [1] metadata stream is written in TSDL, a somewhat
    complex DSL inspired by the C language, a CTF 2 metadata stream is a
    JSON text sequence [2].

    A JSON text sequence simply is a sequence of JSON values, each one
    beginning with the "record separator (RS)" (U+001E) codepoint and
    ending with a "new line (NL)" (U+000A) codepoint.

    Using a JSON text sequence instead of a JSON array, for example,
    makes it easier to stream the objects of a CTF 2 metadata stream and
    allows the tracer to add metadata information while tracing occurs.

Attributes:
    Any trace producer may use attributes to add information to specific
    metadata stream objects, for example:

        {
          "type": "null-terminated-string",
          "encoding": "utf-16le",
          "attributes": {
            "meow-tracer": {
              "confidentiality-level": 4
            }
          }
        }

    A tracer may add namespaced attributes to trace class, data stream
    class, clock class, event record class, and field class objects.

    A general trace consumer may safely ignore attributes: you don't
    need them to decode a data stream.

Extensions:
    The purpose of an extension is to add core features to CTF 2 or to
    modify existing core features. In other words, an extension may
    alter the format itself. For example:

        {
          "type": "preamble",
          "version": 2,
          "extensions": {
            "meow-tracer": {
              "variable-clock-frequency": true
            }
          }
        }

    While a general trace consumer may safely ignore attributes, it must
    not ignore extensions.

Field roles:
    The name of a structure field member is now decoupled from any
    special meaning of said field.

    For example, you may name a packet header integer field `myosotis`
    and make it contain the current data stream ID:

        {
          "name": "myosotis",
          "field-class": {
            "type": "fixed-length-unsigned-integer",
            "byte-order: "little-endian",
            "length": 16,
            "roles": ["data-stream-id"]
          }
        }

New field classes:
    CTF 2 brings new field classes to fill some important gaps of CTF 1:

    Fixed-length bit array field:
        A compact sequence of bits without integral semantics.

        Any fixed-length scalar field conceptually is a fixed-length bit
        array field.

    Fixed-length bit map field:
        A fixed-length bit array field with flags associating bit index
        ranges to names.

    Fixed-length boolean field:
        A fixed-length bit array field of which the decoded value is
        either true or false.

    Variable-length integer field:
        An integral value encoded with the LEB128 [3] format.

    BLOB field:
        A compact, byte-aligned sequence of bytes with an associated
        media type:

            {
              "type": "static-length-blob",
              "length": 511267,
              "media-type": "image/tiff"
            }

        The length (number of bytes) of a BLOB field may be static or
        dynamic (provided by an anterior integer field).

    Optional field:
        A field which is either another field or nothing (zero bits):

            {
              "type": "optional",
              "selector-field-location": {
                "path": ["config", "has-debug-info"]
              },
              "selector-field-ranges": [[1, 255]],
              "field-class": {
                "type": "null-terminated-string"
              }
            }

        This is similar to what you could achieve with a variant field
        having an empty option in CTF 1.

All UTF string field encodings:
    A null-terminated, static-length, or dynamic-length string field may
    have the UTF-8, UTF-16BE, UTF-16LE, UTF-32BE, or UTF-32LE encoding:

        {
          "type": "static-length-string",
          "length": 32,
          "encoding": "utf32-be"
        }

JOIN THE MOVEMENT
━━━━━━━━━━━━━━━━━
We invite you to explore the CTF 2 specification and see firsthand the
impact it can have on your tracing projects.

Even though the CTF 2 specification is now published, your feedback
remains incredibly important to us so that we can be confident that the
document is flawless and as easy to understand as possible.

As far as DiaMon projects are concerned, what we know today about CTF 2
adoption is:

Babeltrace 2 [4]:
    Planned in the next minor release, Babeltrace 2.1.

LTTng [5]:
    Planned in LTTng 2.15.

Trace Compass [6]:
    Partial support currently; awaiting more CTF 2 trace samples to
    complete the development.

If you're considering adopting CTF 2 for your own project, please share
the news with us!

REFERENCES
━━━━━━━━━━
[1]: https://diamon.org/ctf/v1.8.3/
[2]: https://datatracker.ietf.org/doc/html/rfc7464
[3]: https://en.wikipedia.org/wiki/LEB128
[4]: https://babeltrace.org/
[5]: https://lttng.org/
[6]: https://eclipse.dev/tracecompass/


More information about the lttng-dev mailing list