[lttng-dev] HugePages shared memory support in LTTng

Thu Jul 25 13:59:22 EDT 2019

On Thu, 2019-07-25 at 11:40 -0400, Mathieu Desnoyers wrote:
> There are a few reasons for using per-uid buffers over per-pid:
> 
> - Lower memory consumption for use-cases with many processes,
> - Faster process launch time: no need to allocate buffers for each process.
> Useful for use-cases with short-lived processes.
> - Keep a flight recorder "snapshot" available for all processes, including
> those which recently exited. Indeed, the per-pid buffers don't stay around
> for snapshot after a process exits or is killed.
> 
> There are however a few advantages for per-pid buffers:
> 
> - Isolation: if one PID generates corrupted trace data, it does not interfere
> with other PIDs buffers,
> - If one PID is killed between reserve and commit, it does not make that specific
> per-cpu ring buffer unusable for the rest of the tracing session lifetime.
> 
> Hoping this information helps making the right choice for your deployment!

We recently had this discussion for an embedded product that uses LTTng
to gather trace data during operation.  In our case, we want a flight
recorder of the last X seconds of trace data for the entire device.  X
seconds times a data generation rate of Y bytes/sec ends up being a very
large portion (~30%) of the total memory available.  This has to be in
RAM; using flash memory for this is not a good idea.

If we use per-PID buffers, then the buffer size needed for the largest
producer of trace data times the total number of processes is too
large: far larger than the device's memory size.  Some processes
produce trace data at a much higher rate than others.  A buffer for X
seconds of data on one process ends up being a buffer for 10*X
seconds of data on another.  There's not enough RAM for 10*X second
buffers.

If we use per-UID buffers, then we must run everything as one UID,
which, on an embedded system, is not that bad, but does negatively
impact the security of the software.  Now all processes, which generate
data at different rates, can share one buffer.  This is much more
efficient than having to reserve the same space for the largest and
smallest producers.
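
To make the sizing argument concrete, here is a rough sketch with
made-up numbers.  The process names, data rates, and window length are
all hypothetical, not measurements from the device, and the sketch
ignores per-CPU buffer splitting and sub-buffer rounding.

```python
# All figures are hypothetical, chosen only to illustrate the arithmetic.
WINDOW_S = 30  # "X": seconds of history the flight recorder should keep

# Assumed trace data rates per process, in bytes/second ("Y" is their sum).
rates = {"busy": 1_000_000, "moderate": 250_000, "quiet": 100_000}


def per_pid_total(rates, window_s):
    """Per-PID buffers: every process gets a buffer sized for the
    largest producer, since the buffer size is the same for all."""
    return max(rates.values()) * window_s * len(rates)


def shared_total(rates, window_s):
    """One shared buffer: it only has to hold the aggregate rate."""
    return sum(rates.values()) * window_s


per_pid = per_pid_total(rates, WINDOW_S)
shared = shared_total(rates, WINDOW_S)

# How many seconds the quietest process's per-PID buffer actually holds.
quiet_window_s = (max(rates.values()) * WINDOW_S) // rates["quiet"]

print(f"per-PID total: {per_pid} bytes")
print(f"shared total:  {shared} bytes")
print(f"quiet process buffer covers {quiet_window_s} s instead of {WINDOW_S} s")
```

With these assumed rates, the quietest process ends up with a buffer
covering 10*X seconds, which is the wasted space described above.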

But there ends up being another problem: the flight recorder data needs
to be saved before it can be used.  It has to go to tmpfs in RAM, since
the device's flash is not suitable and is used elsewhere anyway.  So one
needs 2x the RAM: one copy for the ring buffer and one for the dump of
the ring buffer's trace data in tmpfs.

So what we did was not use flight recorder mode.  We configured lttng
to use a limited number of smaller trace files with trace file rotation,
and used small ring buffers, which did not need to be very large to
avoid overflow (I imagine saving the data to tmpfs is fast).
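
A channel set up along these lines might look roughly like the command
built below.  The options are the ones documented for
"lttng enable-channel", but the session name, channel name, and all
sizes and counts are placeholders for illustration, not the values used
on the device.  The sketch only assembles the command line; it does not
run it.

```python
import shlex


def enable_channel_argv(session, channel, subbuf_size, num_subbuf,
                        tracefile_size, tracefile_count):
    """Build an `lttng enable-channel` command line for a user space
    channel with small ring buffers and bounded, rotating trace files."""
    return [
        "lttng", "enable-channel",
        "--userspace",
        f"--session={session}",
        # Small ring buffer: sub-buffer size and sub-buffer count.
        f"--subbuf-size={subbuf_size}",
        f"--num-subbuf={num_subbuf}",
        # Trace file rotation: cap the size of each file and how many
        # files are kept before the oldest one is overwritten.
        f"--tracefile-size={tracefile_size}",
        f"--tracefile-count={tracefile_count}",
        channel,
    ]


# Hypothetical names and sizes.
argv = enable_channel_argv("device", "rotating", "64k", 4, "4M", 8)
command = shlex.join(argv)
print(command)
```

The product of the trace file size and the trace file count bounds the
space each stream of trace files can take in tmpfs, which is what makes
the files behave like a fixed-size buffer.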

The trace files are in effect a per-session buffer, which is what we
want for greatest efficiency in space utilization.  And we can archive
those and download them when "something happens" without paying extra
cost for space.