RFC - LTTng Network Session and Snapshot

Author: David Goulet

Contributors:
    * Mathieu Desnoyers
    * Julien Desfossez

Version:
    - v0.1: 16/04/2012
        * Initial proposal
    - v0.2: 04/05/2012
        * Add snapshot description
        * Propose a new API and lttng cli command

Introduction
-----------------

This RFC proposes a way for the lttng 2.0 session daemon to handle network
sessions for streaming purposes and, eventually, remote control.

The next sections introduce the concept of a network session and how it is
envisioned in the lttng 2.0 toolchain.

Please note that this RFC is neither final nor complete without feedback from
the community. The text below is a proposal.

Snapshot
-----------------

Version 2.1 of lttng will support snapshots, meaning that at any moment a user
can trigger a command to snapshot the current tracing buffers and store them.

The snapshot feature will use the current consumer attached to the tracing
session, or create one if none is available. A full URI can be passed to the
snapshot command to override the current consumer or to define one. The
following protocols will be supported:

* net://HOST[:PORT_CTRL[:PORT_DATA]]
* tcp://HOST:PORT
* tcp6://HOST:PORT
* udp://HOST:PORT
* udp6://HOST:PORT
* file://

The net:// URI scheme makes the control and data paths use the default
transport protocol, which is TCP for both channels. The same remote host is
also used for both. The ports can be specified; if they are not, the defaults
are used: 5342 for control and 5343 for data.

If the URI is not recognized, the argument is used as a file name.

The control and data channels are two separate arguments of the API since we
allow the user to control the protocol and path (address) of each. However,
for a transfer to succeed, the lttng-sessiond and the remote end must
establish a session for the control _and_ data paths. If either one fails, the
procedure is aborted.
Thus, the control path may use a different address from the data path, but the
user has to make sure that both channels end up at the same physical
destination.

Note that the control path is a crucial, high-priority channel of
communication, so for now we only allow it to use the TCP protocol.

Upon a buffer snapshot, the metadata of the session has to be requested again
from the tracer so it can be dumped, since tracing might have started long
ago.

Session with Network Transport
-----------------

In order to tell the session daemon where to send the data for streaming, a
tracing session has to be aware of some information about the remote target:

* Remote end network address (e.g. IP or hostname)
* Destination control port
* Destination data port

Streaming is initiated by telling the session daemon that a specific session
is set for network streaming. This makes the session daemon establish a
connection with the remote end. Once tracing starts, the local consumer is
made aware of this information and starts sending data following a strict
protocol defined in the streaming RFC written by Julien Desfossez.

Finally, a trace received by a network consumer will have a new "namespace"
prepended to the trace output directory hierarchy: the hostname _from_ which
the trace is coming.

host01
    \-- my_session1
        \-- ust
            \-- my_app1[...]
                \-- trace data...
        \-- kernel
            \-- trace data...

Client API integration
-----------------

A new API call is added to set attributes, such as network information, on a
session. Since lttng_create_session only takes a name and a path, a new call
is required to pass this information.

The naming convention is NOT final and can be improved.

struct lttng_handle handle;

enum lttng_dst_type {
	LTTNG_DST_IPV4,
	LTTNG_DST_IPV6,
	LTTNG_DST_HOST,
	LTTNG_DST_PATH,
};

enum lttng_uri_type {
	LTTNG_URI_HOP,
	LTTNG_URI_DST,
};

enum lttng_stream_type {
	LTTNG_STREAM_CONTROL,
	LTTNG_STREAM_DATA
};

enum lttng_proto {
	LTTNG_UDP,
	LTTNG_TCP,
	LTTNG_UDP6,
	LTTNG_TCP6,
	LTTNG_FILE,
};

#define LTTNG_NETWORK_PADDING1_LEN    32
#define LTTNG_NETWORK_PADDING2_LEN    128
struct lttng_uri {
	/* Enum type names match the declarations above. */
	enum lttng_dst_type dtype;
	enum lttng_uri_type utype;
	enum lttng_stream_type stype;
	enum lttng_proto protocol;
	in_port_t port;
	char padding[LTTNG_NETWORK_PADDING1_LEN];
	union {
		char ipv4[INET_ADDRSTRLEN];
		char ipv6[INET6_ADDRSTRLEN];
		char hostname[HOST_NAME_MAX];
		char path[PATH_MAX];
		char padding[LTTNG_NETWORK_PADDING2_LEN];
	} dst;
};

For snapshots:

/* Create a snapshot template on the session daemon */
lttng_create_snapshot(handle);

/* Set URI for the created snapshot */
lttng_set_snapshot_uri(handle, struct lttng_uri *u);

/* Execute snapshot */
lttng_snapshot(handle);

For consumers:

/* Create a consumer template on the session daemon side. */
lttng_create_consumer(handle);

/* Set URI in the consumer template */
lttng_set_consumer_uri(handle, struct lttng_uri *u);

/*
 * Enable the consumer template for the session. Once enabled, no more URI
 * settings are possible.
 */
lttng_enable_consumer(handle);

/*
 * Disabling the consumer means that it will stop consuming but will still
 * exist. Executing the enable_consumer call again will simply re-enable it.
 */
lttng_disable_consumer(handle);

If lttng_create_consumer is executed on a session which already has a consumer
attached to it, the present consumer is freed and a new template is added.
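
As an illustration of the net:// scheme described in the Snapshot section, the
following C sketch splits such a URI into a host and two ports, falling back
to the defaults of 5342 and 5343. The struct net_uri type, the parse_net_uri
and parse_port helpers and the HOST_MAX limit are hypothetical names
introduced for this example only; they are not part of the proposed API, and
IPv6 literal addresses are not handled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEFAULT_CTRL_PORT 5342	/* default control port from this RFC */
#define DEFAULT_DATA_PORT 5343	/* default data port from this RFC */
#define HOST_MAX 256		/* hypothetical buffer size for this sketch */

/* Hypothetical result type; not part of the proposed API. */
struct net_uri {
	char host[HOST_MAX];
	int ctrl_port;
	int data_port;
};

/* Parse a decimal port in [1, 65535] from s[0..len); return -1 on error. */
static int parse_port(const char *s, size_t len)
{
	long v = 0;
	size_t i;

	if (len == 0 || len > 5)
		return -1;
	for (i = 0; i < len; i++) {
		if (s[i] < '0' || s[i] > '9')
			return -1;
		v = v * 10 + (s[i] - '0');
	}
	return (v >= 1 && v <= 65535) ? (int) v : -1;
}

/*
 * Parse net://HOST[:PORT_CTRL[:PORT_DATA]].
 * Returns 0 on success, -1 if the string is not a valid net:// URI.
 */
static int parse_net_uri(const char *uri, struct net_uri *out)
{
	static const char prefix[] = "net://";
	const char *host, *p1, *p2;
	size_t host_len;

	assert(uri != NULL && out != NULL);
	if (strncmp(uri, prefix, sizeof(prefix) - 1) != 0)
		return -1;

	host = uri + sizeof(prefix) - 1;
	p1 = strchr(host, ':');
	host_len = p1 ? (size_t) (p1 - host) : strlen(host);
	if (host_len == 0 || host_len >= HOST_MAX)
		return -1;
	memcpy(out->host, host, host_len);
	out->host[host_len] = '\0';

	/* Start from the defaults, then override with any explicit port. */
	out->ctrl_port = DEFAULT_CTRL_PORT;
	out->data_port = DEFAULT_DATA_PORT;
	if (!p1)
		return 0;

	p1++;
	p2 = strchr(p1, ':');
	out->ctrl_port = parse_port(p1, p2 ? (size_t) (p2 - p1) : strlen(p1));
	if (out->ctrl_port < 0)
		return -1;
	if (p2) {
		out->data_port = parse_port(p2 + 1, strlen(p2 + 1));
		if (out->data_port < 0)
			return -1;
	}
	return 0;
}
```

A caller could try the other supported schemes when parse_net_uri returns -1
and, if none matches, treat the argument as a file name, in line with the rule
for unrecognized URIs.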

We propose to add three commands to the lttng command line actions:

i)   lttng enable-consumer [FILE | URI]
       -s SESSION_NAME
       -c, --control-uri=[HOP1,]URI
       -d, --data-uri=[HOP1,]URI

ii)  lttng disable-consumer
       -s SESSION_NAME

iii) lttng snapshot [FILE | URI]
       -s SESSION_NAME
       -c, --control-uri=[HOP1,]URI
       -d, --data-uri=[HOP1,]URI

Each option defining URI(s) can contain a list of hops preceding the final
destination. The proxy feature is not supported yet, but we prefer to inform
the community of its future existence.

So, the regular chain of commands to enable network streaming, for example,
would be:

# lttng create session1
# lttng enable-event -a -k
// The next command sets the destination host but uses the default protocols and
// ports.
# lttng enable-consumer net://192.168.1.10
# lttng start
(tracing...)
# lttng stop

The snapshot use case, with the snapshot saved in a user-defined location on
the disk:

# lttng create session1
# lttng enable-event -a -k
# lttng disable-consumer /* So we do not collect data with the default consumer */
# lttng start
(tracing ...)
# lttng snapshot /tmp/output

Session daemon integration
-----------------

As mentioned earlier, the session daemon will be in charge of establishing a
streaming session with the target over the network, i.e. creating the
bidirectional sockets for the control and data paths. Once done, a network
consumer is spawned and those sockets are passed over.

From there, the session daemon can interact with the consumer by stopping the
network streaming or by re-establishing a local trace collection with a
non-network consumer.
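
The establishment rule stated earlier (both the control and data paths must
come up, otherwise the procedure is aborted) can be sketched as follows. This
is a minimal illustration under stated assumptions, not the session daemon's
implementation: struct transport_ops, struct stream_session,
establish_session and the fake_* helpers are hypothetical names, and the
transport is injected through function pointers so the logic can be exercised
without a real network.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical transport hooks; a real daemon would wrap socket calls. */
struct transport_ops {
	int (*open_path)(const char *host, int port);	/* fd >= 0, or -1 */
	void (*close_path)(int fd);
};

/* Hypothetical pair of descriptors handed over to the network consumer. */
struct stream_session {
	int ctrl_fd;
	int data_fd;
};

/*
 * Open the control path, then the data path. If either one fails, release
 * whatever was opened and report failure so the caller can abort.
 */
static int establish_session(const struct transport_ops *ops,
		const char *host, int ctrl_port, int data_port,
		struct stream_session *out)
{
	assert(ops != NULL && host != NULL && out != NULL);
	out->ctrl_fd = -1;
	out->data_fd = -1;

	out->ctrl_fd = ops->open_path(host, ctrl_port);
	if (out->ctrl_fd < 0)
		return -1;

	out->data_fd = ops->open_path(host, data_port);
	if (out->data_fd < 0) {
		/* Data path failed: do not keep a half-open session. */
		ops->close_path(out->ctrl_fd);
		out->ctrl_fd = -1;
		return -1;
	}
	return 0;
}

/* Stand-in transport for demonstration: one port can be made unreachable. */
static int fake_unreachable_port;
static int fake_open_count;
static int fake_close_count;

static int fake_open(const char *host, int port)
{
	(void) host;
	if (port == fake_unreachable_port)
		return -1;
	return 100 + fake_open_count++;	/* distinct pretend descriptors */
}

static void fake_close(int fd)
{
	(void) fd;
	fake_close_count++;
}
```

Only when establish_session returns 0 would both descriptors be passed to the
spawned network consumer; on failure the caller has nothing to clean up.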