[lttng-dev] Golang agent for LTTng-ust

Jérémie Galarneau jeremie.galarneau at efficios.com
Tue May 29 12:21:41 EDT 2018


On 29 May 2018 at 10:32, Loïc Gelle <loic.gelle at polymtl.ca> wrote:

> I don't understand what you mean by "not sustainable".
>>
>
> Not something you can keep in your codebase forever.
>

I think we're talking past each other on this point. Beyond being "not
elegant", I am not aware of Go dropping support for calling C code.

I am not seeing how this isn't sustainable in the literal sense.


>
> It really depends on who does it I would guess. It's very probably a couple
>> of months to get something that is bullet-proof and mergeable.
>>
>
> Is it on the roadmap for a future version of LTTng?
>

It's something we would like to support, but it isn't on the roadmap in the
short term, at least. This is not to say it can't be worked on.

Jérémie


>
> Jérémie Galarneau <jeremie.galarneau at efficios.com> a écrit :
>
> On 29 May 2018 at 09:47, Loïc Gelle <loic.gelle at polymtl.ca> wrote:
>>
>> I agree that integrating C code into a Go codebase is somewhat inelegant.
>>>
>>>>
>>>>
>>> Not only that, but it's not sustainable. It is more a hack than a
>>> feature.
>>>
>>>
>>
>>
>>
>>> However, I'm not sure what you mean by "implementation issues that are
>>>
>>>> specific to the language itself".
>>>>
>>>>
>>> I mean that if you put static calls to C tracepoints in a Go program,
>>> you always trigger a function call (and its ~50ns overhead) each time
>>> you hit the tracepoint, whether the tracepoint is actually enabled or
>>> not. So basically you can't count on the compiler (specific to the
>>> language) to do clever branch prediction for you, which makes
>>> instrumenting your code less attractive.
>>>
>>>
>> In the case of the first solution, note that the agent tracks which events
>> are enabled or not. In that sense, the check is performed within the Go
>> code, as is currently done for Python and Java.
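
To make the point about checking in Go concrete, here is a minimal sketch of
an agent-side enabled flag. Everything in it is hypothetical: the Event type,
SetEnabled, Trace and emitToC are illustration names, not an existing
lttng-ust or agent API, and emitToC only counts calls where a real agent
would cross into C through cgo.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Event is a hypothetical per-tracepoint record kept by a Go agent.
type Event struct {
	name    string
	enabled uint32 // 0 or 1, updated when the agent learns the event state
	emitted int    // number of calls that reached the slow path (illustration only)
}

// SetEnabled records the state reported for this event.
func (e *Event) SetEnabled(on bool) {
	var v uint32
	if on {
		v = 1
	}
	atomic.StoreUint32(&e.enabled, v)
}

// emitToC stands in for the cgo call into lttng-ust; here it only counts.
func (e *Event) emitToC(msg string) {
	e.emitted++
}

// Trace checks the flag in Go, so a disabled event never pays for the C call.
func (e *Event) Trace(msg string) {
	if atomic.LoadUint32(&e.enabled) == 0 {
		return
	}
	e.emitToC(msg)
}

func main() {
	ev := &Event{name: "app:request"}
	ev.Trace("skipped while disabled")
	ev.SetEnabled(true)
	ev.Trace("recorded once enabled")
	fmt.Println(ev.emitted)
}
```

With this shape, the cost of a disabled event is one atomic load and a
branch in Go, and the call into C is only paid for enabled events.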
>>
>>
>>
>>> I like the third solution that you propose. I think that the first one is
>>> definitely not ideal and that the second one requires too much work and
>>> maintenance. How much time do you estimate the development of the third
>>> solution would take?
>>>
>>>
>>
>>
>>
>>> By the way, I am currently working on instrumenting the Go runtime to
>>> capture information on the goroutines. I am using dyntrace
>>> (https://github.com/charpercyr/dyntrace) for that, which kind of works
>>> but is really hacky.
>>>
>>>
>> That sounds great. What kind of information are you capturing?
>>
>> Thanks,
>> Jérémie
>>
>>
>>
>>> Jérémie Galarneau <jeremie.galarneau at efficios.com> a écrit :
>>>
>>> On 28 May 2018 at 10:30, Loïc Gelle <loic.gelle at polymtl.ca> wrote:
>>>
>>>>
>>>> Hi Jeremie,
>>>>
>>>>>
>>>>> Thanks for your answer. I roughly estimated the overhead of calling an
>>>>> empty C function (passing two integer arguments) from Go at 50ns per
>>>>> call. Maybe not a big deal for a lot of use cases, but more problematic
>>>>> if you want to trace performance-critical parts of Go, like the runtime
>>>>> itself. The overhead could be even bigger when it involves passing
>>>>> strings or arrays, which have different memory layouts in Golang and C.
>>>>> What was the overhead that you observed for Python and Java?
>>>>>
>>>>>
>>>>> 50ns per call doesn't sound too bad honestly.
>>>>
>>>> You have to ask yourself if you could get within 50ns of lttng-ust's
>>>> performance with a custom ring buffer implemented in Go.
>>>>
>>>> To use some very rough numbers, lttng-ust takes around 250ns per event
>>>> for that payload. With Mathieu's work on restartable sequences, that
>>>> number will be reduced quite a bit (by half, if I remember correctly),
>>>> and I'm not sure you'll be able to use that kind of mechanism from Go
>>>> code.
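
As a back-of-the-envelope check of those rough numbers, the sketch below
expresses the 50ns call overhead as a fraction of the per-event cost. It
assumes the 50ns is paid on top of the ~250ns per event, and it takes the
"by half" figure for restartable sequences at face value; both are
assumptions rather than measurements.

```go
package main

import "fmt"

// overheadPct returns the cost of the C call as a percentage of the
// per-event cost of the tracer.
func overheadPct(callNs, eventNs float64) float64 {
	return callNs / eventNs * 100
}

func main() {
	const cgoCallNs = 50.0             // rough estimate for an empty C call from Go
	const ustEventNs = 250.0           // very rough lttng-ust cost per event
	const rseqEventNs = ustEventNs / 2 // assumed: halved by restartable sequences

	fmt.Printf("current: +%.0f%%\n", overheadPct(cgoCallNs, ustEventNs))
	fmt.Printf("with rseq: +%.0f%%\n", overheadPct(cgoCallNs, rseqEventNs))
}
```

Under these assumptions the call adds roughly a fifth to the per-event cost
today, and roughly two fifths if the per-event cost is halved.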
>>>>
>>>> I don't have numbers on hand for Python and Java. In both cases, we are
>>>> hooking into logging frameworks so the overhead of calling into C code
>>>> probably pales in comparison to the time spent formatting strings.
>>>> That's another problem in using the current "agent" mechanism; it really
>>>> only accommodates a very specific tracepoint signature that takes a
>>>> string payload.
>>>>
>>>>
>>>>
>>>>> From what I understand, it will always be a problem to have agents for
>>>>> languages other than C, especially if you want to keep relying on
>>>>> existing C code. Even if the sessiond part is independent from the
>>>>> agent itself, there are tons of implementation issues that are specific
>>>>> to the language itself. The problem with Go is that calling C functions
>>>>> is really a hack that does not integrate well with the build system
>>>>> that was designed for Go.
>>>>>
>>>>>
>>>>>
>>>> The solutions I see:
>>>>
>>>> 1) Replicate the current "agent" scheme and serialize all Go events to
>>>> strings
>>>>
>>>> Not ideal as you lose the events' typing, you have to serialize to
>>>> strings on the fast path, and you can hardly filter on event payloads.
>>>>
>>>> 2) Write a native Go ring-buffer that can be consumed by LTTng
>>>>
>>>> In essence, all the tracing would happen in Go. Events would be
>>>> serialized by Go code and the Go "agent" would produce the CTF metadata
>>>> that describes their layout.
>>>>
>>>> From an integration standpoint, that's probably the most elegant
>>>> solution as you have no hard dependency on native code in your Go
>>>> projects. However, it's a _lot_ of work.
>>>>
>>>> First, you have to re-implement a ring-buffer that needs to perform
>>>> within 50ns of lttng-ust's ring-buffer to be useful. You also need to
>>>> port the event filtering bytecode interpreter to Go.
>>>> Then, we need to find a way to consume that ring-buffer's content from a
>>>> form of consumer daemon within lttng-tools.
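
For a sense of what the Go-side buffer in solution 2 involves, here is a
deliberately simplified sketch: a single-producer, single-consumer ring of
fixed-size records, serialized by Go code. It is not lttng-ust's design (no
shared memory, no filtering, no CTF metadata), the consumer lives in the
same process, and the Ring type, slotSize and the two-field record layout
are all made up for illustration.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"sync/atomic"
)

const slotSize = 16 // bytes per record: two uint64 fields (hypothetical layout)

// Ring is a single-producer, single-consumer buffer of fixed-size records.
type Ring struct {
	head  uint64 // next slot to write; only the producer stores to it
	tail  uint64 // next slot to read; only the consumer stores to it
	slots uint64
	buf   []byte
}

// NewRing allocates a ring holding the given number of records.
func NewRing(slots uint64) *Ring {
	return &Ring{slots: slots, buf: make([]byte, slots*slotSize)}
}

// Write serializes one event; it reports false (event discarded) when full.
func (r *Ring) Write(timestamp, value uint64) bool {
	head := atomic.LoadUint64(&r.head)
	if head-atomic.LoadUint64(&r.tail) == r.slots {
		return false
	}
	off := (head % r.slots) * slotSize
	binary.LittleEndian.PutUint64(r.buf[off:], timestamp)
	binary.LittleEndian.PutUint64(r.buf[off+8:], value)
	atomic.StoreUint64(&r.head, head+1) // publish only after the payload is written
	return true
}

// Read returns the oldest record, or ok == false when the ring is empty.
func (r *Ring) Read() (timestamp, value uint64, ok bool) {
	tail := atomic.LoadUint64(&r.tail)
	if tail == atomic.LoadUint64(&r.head) {
		return 0, 0, false
	}
	off := (tail % r.slots) * slotSize
	timestamp = binary.LittleEndian.Uint64(r.buf[off:])
	value = binary.LittleEndian.Uint64(r.buf[off+8:])
	atomic.StoreUint64(&r.tail, tail+1) // release the slot back to the producer
	return timestamp, value, true
}

func main() {
	r := NewRing(2)
	fmt.Println(r.Write(1, 10), r.Write(2, 20), r.Write(3, 30))
	fmt.Println(r.Read())
}
```

Even this toy version leaves out the hard parts listed above: matching
lttng-ust's performance, the filtering bytecode interpreter, and a way for a
consumer daemon to read the data from another process.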
>>>>
>>>> 3) Add an lttng-ust API to allow dynamic event declaration
>>>>
>>>> This is something we have been considering for a while.
>>>>
>>>> Basically, we would like to introduce an API that allows applications to
>>>> dynamically declare tracepoints.
>>>> Then, those events would be serialized from Go, but the ring-buffer
>>>> logic would remain in C.
>>>>
>>>> On each event, we would:
>>>>   - Obtain a memory area from lttng-ust (reserve phase, C code called
>>>>     from Go)
>>>>   - Write the event's content to that area (from Go code)
>>>>   - Commit the event (C code called from Go)
>>>>
>>>> With this, you don't have to manually declare tracepoints and integrate
>>>> them into a build system to generate providers; the Go application just
>>>> needs to link to lttng-ust at runtime.
>>>> It's not a perfect solution, but it seems like an interesting compromise.
>>>>
>>>>
>>>> What do you think?
>>>>
>>>> Jérémie
>>>>
>>>>
>>>>
>>>>
>>>> Did I provide more context?
>>>>>
>>>>> Cheers,
>>>>> Loïc.
>>>>>
>>>>> Jérémie Galarneau <jeremie.galarneau at efficios.com> a écrit :
>>>>>
>>>>> On 4 May 2018 at 06:03, Loïc Gelle <loic.gelle at polymtl.ca> wrote:
>>>>>
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>>
>>>>>>> There has been a previous discussion on the mailing list about
>>>>>>> porting LTTng to Golang, about a year ago:
>>>>>>> https://lists.lttng.org/pipermail/lttng-dev/2017-June/027203.html .
>>>>>>> This new thread is meant to discuss implementation possibilities
>>>>>>> more precisely.
>>>>>>>
>>>>>>> Currently, one has to use the C UST agent from LTTng in order to
>>>>>>> instrument Golang programs, and to compile the whole thing using
>>>>>>> custom Makefiles and cgo. Here is a recent example that I wrote:
>>>>>>> https://github.com/loicgelle/jaeger-go-lttng-instr
>>>>>>>
>>>>>>> As you can guess, there are a lot of drawbacks to that approach. It
>>>>>>> is actually a hack and cannot be integrated into more complex Golang
>>>>>>> programs that use a more complex build process (e.g. the Golang
>>>>>>> runtime itself), because of the compiler instructions that you have
>>>>>>> to include at the top of the Golang files. There is also a big
>>>>>>> concern about the performance of this solution, as calling a C
>>>>>>> function from Go requires a full context switch on the stack,
>>>>>>> because the calling conventions in C and Golang are different.
>>>>>>>
>>>>>>>
>>>>>>> I think a more integrated and performant solution is needed. We can’t
>>>>>>> really ignore a language such as Golang that is now widely adopted
>>>>>>> for cloud applications. LTTng is really the best solution out there
>>>>>>> in terms of overhead per tracepoint, and could benefit from being
>>>>>>> made available to such a large community. My question to the experts
>>>>>>> on this mailing list: how much work would it take to write a Golang
>>>>>>> agent for LTTng?
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Hi Loïc,
>>>>>>
>>>>>> Without having performed any measurements myself, it does seem like
>>>>>> calling C from Go is very expensive. In that context, I can see that
>>>>>> LTTng would probably lose its performance advantage over any native Go
>>>>>> solution. However, it wouldn't hurt to measure the impact and see if
>>>>>> it really is a deal breaker.
>>>>>>
>>>>>> We faced the same dilemma when implementing the Java and Python
>>>>>> support in lttng-ust. In those cases, we ended up calling C code, with
>>>>>> the performance penalties it implies. The correlation with other
>>>>>> applications' and the kernel's events, along with the rest of LTTng's
>>>>>> features, provided enough value to make that solution worthwhile.
>>>>>>
>>>>>> There aren't a ton of solutions if we can't call existing C code. We
>>>>>> basically have to reimplement a ring-buffer and the setup/communication
>>>>>> infrastructure to interact with the lttng-sessiond. The communication
>>>>>> with the session daemon is not a big concern as the protocol is fairly
>>>>>> straightforward.
>>>>>>
>>>>>> The "hairy" part is that lttng-ust and lttng-consumerd use a shared
>>>>>> memory map to produce and consume the tracing buffers. This means that
>>>>>> all changes to that memory layout would need to be replicated in the
>>>>>> Go tracer, making future evolution more difficult. Also, I don't know
>>>>>> how easy it would be to synchronize C and Go applications interacting
>>>>>> in a shared memory map given those languages have different memory
>>>>>> models. My knowledge of Go doesn't go that far.
>>>>>>
>>>>>> A more viable solution could be to introduce a Go-native consumer
>>>>>> daemon implementing its own synchronization with Go applications. This
>>>>>> way, that implementation could evolve on its own and could also start
>>>>>> with a simpler ring buffer than lttng-ust's.
>>>>>>
>>>>>> Still, it is not a small undertaking and it basically means
>>>>>> maintaining a third tracer implementation.
>>>>>>
>>>>>>
>>>>>> What do you think?
>>>>>>
>>>>>> Thanks!
>>>>>> Jérémie
>>>>>>
>>>>>>
>>>>>> Cheers,
>>>>>>
>>>>>> Loïc.
>>>>>>>
>>>>>>> --
>>>>>> Jérémie Galarneau
>>>>>> EfficiOS Inc.
>>>>>> http://www.efficios.com


-- 
Jérémie Galarneau
EfficiOS Inc.
http://www.efficios.com