[lttng-dev] babeltrace2 graph performance considerations

Mon Mar 30 13:30:58 EDT 2020

Simon,

Thanks - see below,

On Mon, Mar 30, 2020 at 7:32 AM Simon Marchi <simark at simark.ca> wrote:

> On 2020-03-30 12:10 a.m., Rocky Dunlap via lttng-dev wrote:
> > A couple of questions on performance considerations when setting up bt2
> > processing graphs.
> >
> > 1.  Do parts of the processing graph that can execute concurrently do
> > so?  Does the user have any control over this, e.g., by creating threads
> > in sink components?
>
> The graph execution in itself is single-threaded at the moment.  We could
> imagine a design where different parts of the graph execute concurrently,
> in different threads, but it's not the case right now.
>
> You could make your components spawn threads to do some processing on the
> side, if that helps, but these other threads should not interact directly
> with the graph.
>

In my case I have a CTF trace where some analyses can be performed on a
per-stream basis (no need to mux the streams together).  In this case, I
was thinking that it would make sense to thread over the streams.  However,
I think I can easily do this at a level above the graph simply by creating
multiple graphs where each one is handling a single stream.  In my case I
am thinking this will be mostly I/O bound, so I'm not sure what kind of
payoff the threads will give.  Overall, I just want to make sure that I am
not doing anything that would, in the long run, preclude
threading/concurrency if it is added to the graph model itself.
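
To make the "level above the graph" idea concrete, here is a rough sketch
in plain Python.  It deliberately uses no bt2 API: the names analyze_stream
and analyze_streams are hypothetical, and in-memory lists stand in for the
per-stream event sequences.  In a real setup I would assume each worker
builds and runs its own graph over a single stream.

```python
from concurrent.futures import ThreadPoolExecutor


def analyze_stream(stream_id, events):
    """Hypothetical per-stream analysis: count the events of one stream."""
    count = 0
    for _ in events:
        count += 1
    return stream_id, count


def analyze_streams(streams, max_workers=4):
    """Run one independent pipeline per stream.

    `streams` maps a stream id to an iterable of events.  Workers share no
    state, which mirrors "one graph per stream" (an assumption, not bt2 API).
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [
            pool.submit(analyze_stream, stream_id, events)
            for stream_id, events in streams.items()
        ]
        # Collect results in submission order; each is (stream_id, count).
        return dict(future.result() for future in futures)
```

Since nothing is shared between the workers, the same structure should
still apply if the per-stream work later moves to separate processes.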

>
> > 2.  It looks like you cannot connect one output port to multiple
> > inputs.  Is there a way to create a tee component?
>
> Yes, we have discussed making a tee component, it is on the roadmap but
> not really planned yet.  It should be possible, it's just not as trivial
> as it may sound.
>
> One easy way to achieve it is to make each iterator that is created on the
> tee component create and maintain its own upstream iterator.  If you have
> a tee with two outputs, this will effectively make it so you have two
> graphs executing in parallel.  If you have a src.ctf.fs source upstream of
> the tee, then there will be two iterators created on that source, so the
> CTF trace will be opened and decoded twice.  We'd like to avoid that.
>
> The other way of doing it is to make the tee buffer messages, and discard a
> message once all downstream iterators have consumed it.  This has some more
> difficult technical challenges, like what to do when one downstream
> iterator consumes, but the other does not (we don't want to buffer an
> infinite amount of data).  It also makes seeking a bit tricky.
>
> We could go into more detail, if you are interested in implementing it
> yourself.
>

Yeah, I can see how this can get tricky.  This is not critical at this very
moment, but I just wondered if there was a precedent for how to do this
kind of thing.
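
For the record, this is roughly how I picture the buffering approach, as a
plain-Python sketch rather than anything based on the bt2 component API.
BufferedTee, TeeAgain, next_for and max_buffered are all hypothetical
names.  The sketch assumes a consumer that gets too far ahead is simply
told to try again later, and it does not attempt seeking at all.

```python
from collections import deque


class TeeAgain(Exception):
    """Raised when a consumer must wait for slower consumers to catch up."""


class BufferedTee:
    """Fan one upstream iterator out to several consumers (hypothetical)."""

    def __init__(self, upstream, n_outputs, max_buffered):
        self._upstream = iter(upstream)
        self._buffer = deque()             # messages not yet seen by everyone
        self._base = 0                     # absolute index of self._buffer[0]
        self._positions = [0] * n_outputs  # next absolute index per consumer
        self._max_buffered = max_buffered

    def next_for(self, output):
        """Return the next message for consumer `output`."""
        pos = self._positions[output]
        offset = pos - self._base
        if offset == len(self._buffer):
            # This consumer is the furthest ahead: pull from upstream,
            # unless that would grow the buffer past its bound.
            if len(self._buffer) >= self._max_buffered:
                raise TeeAgain("buffer full, slower consumers must advance")
            # StopIteration propagates to the caller at end of upstream.
            self._buffer.append(next(self._upstream))
        message = self._buffer[offset]
        self._positions[output] = pos + 1
        # Discard every message that all consumers have now consumed.
        slowest = min(self._positions)
        while self._base < slowest:
            self._buffer.popleft()
            self._base += 1
        return message
```

The bound keeps memory use fixed, at the cost of making the fast consumer
wait; whether that trade-off fits the real graph model is an open question.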

>
> Simon
>