[lttng-dev] Perf ABI (was: Re: [PATCH 09/11] sched: export task_prio to GPL modules)

Mathieu Desnoyers mathieu.desnoyers at efficios.com
Thu Jan 12 09:09:06 EST 2012


* Ted Ts'o (tytso at mit.edu) wrote:
> On Fri, Dec 23, 2011 at 01:16:41PM -0500, Mathieu Desnoyers wrote:
[...]
> > - The trace data format
> >   - Both versioned _and_ self-described.
> >   Self-description of the event/field layout allows the same tools to
> >   understand traces gathered on different kernel versions, on different
> >   architectures, with different tracer configurations.
> >   Versioning on top of the self-described trace format allows changes
> >   to what the trace self-description can express.
> 
> So there are two ways to do this.  One is to make changes be backwards
> compatible, so that the trace data format only breaks if you use the
> new feature; if it doesn't you encode things the old fashioned way.
> The other way of doing things is to randomly break users whenever the
> tracing developers decide to add some random new feature, regardless
> of whether or not a particular user finds that new feature to be
> useful.
> 
> The first is acceptable.  The second, IMHO, is not.  Linus has said
> quite strongly that WE DO NOT BREAK USERSPACE.   Period.

Please allow me to look into what needs to be kept compatible for a good
user experience (for both Linux end users and kernel developers) in the
case of tracing:

Let's first describe what we really utterly don't want: random breakages
between the kernel and user-level tracing control/transport/analysis
tools. Consequently, I think we could say that it would be unacceptable
for userspace tools to break for every slight change of kernel code. If
that were the case (as it was with the approach SystemTap was taking
before they started hooking into the kernel with tracepoints), then we'd
need to regenerate the tools for pretty much every -rc kernel, and for
each local development tree, which would make those tools useless to
kernel developers.

It is important to clarify that tracing is, in my opinion, not part of
the runtime support, which makes it very different by nature from
filesystems and kernel runtime support. So I agree with Linus' argument
about not breaking userspace when applied to runtime support, because
being unable to even boot a system due to an ABI breakage is very much
unwanted. However, I think it should not be applied as-is to tracing,
because you cannot make a system unusable due to a tracer ABI breakage:
if a tracer can be packaged in a set of standalone modules, that clearly
shows it is not part of the system runtime support.

That being said, ABI versioning could still handle ABI changes without
significantly impacting the users: when an ABI breakage is needed, we
can keep the old code around for a while and expose both the old and new
ABIs. User-level tools could then query for the specific ABI major
version(s) they support, and the kernel could print "deprecated" console
warnings for a few releases before the old code ends up being removed,
which should soften the transition for users.
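
To make the deprecation scheme above concrete, here is a minimal sketch
in Python. It only models the idea; it is not the LTTng or perf
interface. The class TracerABIRegistry and its methods are hypothetical
names invented for this illustration, and the assumption is that each
ABI is identified by an integer major version.

```python
import warnings


class TracerABIRegistry:
    """Hypothetical model of a kernel exposing several tracer ABIs at once."""

    def __init__(self):
        # Maps ABI major version -> True if that version is deprecated.
        self._abis = {}

    def expose(self, major, deprecated=False):
        """Make an ABI major version available, optionally flagged deprecated."""
        self._abis[major] = deprecated

    def retire(self, major):
        """Drop an ABI once its deprecation phase is over."""
        self._abis.pop(major, None)

    def supported_majors(self):
        """What a tool would see when it queries for available ABI versions."""
        return sorted(self._abis)

    def open(self, major):
        """Select an ABI; warn if it is deprecated, fail if it is gone."""
        if major not in self._abis:
            raise LookupError(f"tracer ABI major version {major} is not available")
        if self._abis[major]:
            warnings.warn(
                f"tracer ABI major version {major} is deprecated",
                DeprecationWarning,
                stacklevel=2,
            )
        return major
```

In this model, a kernel in the middle of an ABI change would call
expose(1, deprecated=True) and expose(2): old tools keep working through
open(1) but see a warning, and retire(1) marks the end of the
deprecation phase.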

So, in summary:

  * Old kernels vs new tools:

New tools can query for the latest ABI they know, and fall back on older
ABIs, with limited features.

  * New kernels vs old tools:

Keeping around the old ABI for a deprecation phase lets old tools work on
a bleeding edge kernel while the ABI change is being introduced, which
should satisfy the kernel developer use-case.
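
Both cases in this summary reduce to the same selection rule on the tool
side: pick the highest ABI major version that the tool and the kernel
have in common. A rough sketch in Python follows, assuming ABIs are
identified by integer major versions; negotiate_abi is a hypothetical
helper, not an existing tool or kernel function.

```python
def negotiate_abi(tool_majors, kernel_majors):
    """Return the highest ABI major version both sides support, or None.

    tool_majors:   major versions the user-level tool understands.
    kernel_majors: major versions the running kernel currently exposes.
    """
    common = set(tool_majors) & set(kernel_majors)
    return max(common) if common else None


# New tool on an old kernel: falls back to the older ABI.
new_tool_old_kernel = negotiate_abi([1, 2, 3], [1])

# Old tool on a new kernel that still exposes ABI 1 during deprecation.
old_tool_new_kernel = negotiate_abi([1], [1, 2])

# Old tool after the deprecated ABI has been removed: nothing in common.
old_tool_after_removal = negotiate_abi([1], [2])
```

The first two calls both select major version 1; the last one returns
None, which is the point where the old tool has to be upgraded.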

Best regards,

Mathieu

-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com


