[lttng-dev] Golang agent for LTTng-ust

Tue May 29 10:32:49 EDT 2018

> I don't understand what you mean by "not sustainable".

Not something you can keep in your codebase forever.

> It really depends on who does it, I would guess. It is most likely a couple
> of months of work to get something that is bullet-proof and mergeable.

Is it on the roadmap for a future version of LTTng?

Jérémie Galarneau <jeremie.galarneau at efficios.com> wrote:

> On 29 May 2018 at 09:47, Loïc Gelle <loic.gelle at polymtl.ca> wrote:
>
>> I agree that integrating C code into a Go codebase is somewhat inelegant.
>>>
>>
>> Not only that, but it's not sustainable. It is more a hack than a feature.
>>
>
>
>
>>
>> However, I'm not sure what you mean by "implementation issues that are
>>> specific to the language itself".
>>>
>>
>> I mean that if you put static calls to C tracepoints from a Go program,
>> you always have a function call (and the ~50ns overhead) triggered each
>> time you hit the tracepoint, whether the tracepoint is actually enabled or
>> not. So basically you can't count on the compiler (specific to the
>> language) to do clever branch prediction for you, which reduces the
>> interest of instrumenting your code.
>>
>
> In the case of the first solution, note that the agent tracks which events
> are enabled or not. In that sense, the check is performed within the Go
> code, as is currently done for Python and Java.
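The Go-side check mentioned above can be sketched as follows. This is only a minimal illustration, not the agent's actual code: the Tracepoint type, its fields, and the emit callback (a stand-in for the cgo call into lttng-ust) are hypothetical names, and the enabled flag is set by hand here instead of by the session daemon.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Tracepoint is a hypothetical Go-side handle for one event.
// In a real agent, enabled would be flipped when the session daemon
// enables or disables the event; here it is set by hand.
type Tracepoint struct {
	name    string
	enabled atomic.Bool
	emit    func(a, b int) // stand-in for the cgo call into lttng-ust
}

// Hit records the event only when it is enabled, so a disabled
// tracepoint costs one atomic load instead of a Go-to-C call.
func (t *Tracepoint) Hit(a, b int) {
	if !t.enabled.Load() {
		return
	}
	t.emit(a, b)
}

func main() {
	calls := 0
	tp := &Tracepoint{name: "demo:event", emit: func(a, b int) { calls++ }}

	tp.Hit(1, 2) // disabled: emit is not called
	tp.enabled.Store(true)
	tp.Hit(3, 4) // enabled: emit is called once

	fmt.Println("emit calls:", calls)
}
```

With this shape, the ~50ns Go-to-C call is only paid for events that are actually enabled.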
>
>
>>
>> I like the third solution that you propose. I think that the first one is
>> definitely not ideal and that the second one is too much work and
>> maintenance. How much time do you estimate is necessary for the development
>> of the third solution?
>>
>
>
>
>>
>> By the way, I am currently working on instrumenting the Go runtime to
>> capture information on the goroutines. I am using dyntrace (
>> https://github.com/charpercyr/dyntrace) for that, which kind of works but
>> is really hacky.
>>
>
> That sounds great. What kind of information are you capturing?
>
> Thanks,
> Jérémie
>
>
>>
>> Jérémie Galarneau <jeremie.galarneau at efficios.com> wrote:
>>
>> On 28 May 2018 at 10:30, Loïc Gelle <loic.gelle at polymtl.ca> wrote:
>>>
>>> Hi Jeremie,
>>>>
>>>> Thanks for your answer. I roughly estimated the overhead of calling an
>>>> empty C function (passing two integer arguments) from Go to be 50ns per
>>>> call. Maybe not a big deal for a lot of use cases, but more problematic
>>>> if you want to trace performance-critical parts of Go like its runtime
>>>> itself. The overhead could be even bigger when it involves passing
>>>> strings or arrays that have different memory layouts in Golang and C.
>>>> What was the overhead that you observed for Python and Java?
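A per-call estimate like the one above can be obtained with a simple timing loop. The sketch below is an assumption about how such a measurement might be set up, not the code that produced the 50ns figure: perCall and callee are hypothetical names, and callee is a plain Go stand-in so that the example builds without cgo. To measure the real cost, the closure would call an empty C function through cgo instead.

```go
package main

import (
	"fmt"
	"time"
)

// callee is a stand-in for the empty C function taking two integers.
// In a real measurement it would be replaced by a cgo call.
//
//go:noinline
func callee(a, b int) {}

// perCall runs f n times and returns the average wall-clock time per
// call. The loop and closure overhead are included in the result.
func perCall(n int, f func()) time.Duration {
	start := time.Now()
	for i := 0; i < n; i++ {
		f()
	}
	return time.Since(start) / time.Duration(n)
}

func main() {
	const n = 10_000_000
	d := perCall(n, func() { callee(1, 2) })
	fmt.Printf("average per call: %v\n", d)
}
```

Comparing the result for the Go stand-in against the result for the cgo call gives a rough idea of the extra cost of crossing into C.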
>>>>
>>>>
>>> 50ns per call doesn't sound too bad honestly.
>>>
>>> You have to ask yourself if you could get within 50ns of lttng-ust's
>>> performance with a custom ring buffer implemented in Go.
>>>
>>> To use some very rough numbers, lttng-ust takes around 250ns per event
>>> for that payload. With Mathieu's work on restartable sequences, that
>>> number will be cut quite a bit (by half, if I remember correctly), and I'm
>>> not sure you'll be able to use that kind of mechanism from Go code.
>>>
>>> I don't have numbers on hand for Python and Java. In both cases, we are
>>> hooking into logging frameworks so the overhead of calling into C code
>>> probably pales in comparison to the time spent formatting strings.
>>> That's another problem in using the current "agent" mechanism; it really
>>> only accommodates a very specific tracepoint signature that takes a string
>>> payload.
>>>
>>>
>>>
>>>> From what I understand, it will always be a problem to have agents for
>>>> languages other than C, especially if you want to keep relying on
>>>> existing C code. Even if the sessiond part is independent from the agent
>>>> itself, there are tons of implementation issues that are specific to the
>>>> language itself. The problem with Go is that calling C functions is
>>>> really a hack that does not integrate well with the build system that
>>>> was designed for Go.
>>>>
>>>>
>>>
>>> The solutions I see:
>>>
>>> 1) Replicate the current "agent" scheme and serialize all Go events to
>>> strings
>>>
>>> Not ideal as you lose the events' typing, you have to serialize to strings
>>> on the fast path, and you can hardly filter on event payloads.
>>>
>>> 2) Write a native Go ring-buffer that can be consumed by LTTng
>>>
>>> In essence, all the tracing would happen in Go. Events would be serialized
>>> by Go code and the Go "agent" would produce the CTF metadata that
>>> describes their layout.
>>>
>>> From an integration standpoint, that's probably the most elegant solution
>>> as you have no hard dependency on native code in your Go projects.
>>> However, it's a _lot_ of work.
>>>
>>> First, you have to re-implement a ring-buffer that needs to perform within
>>> 50ns of lttng-ust's ring-buffer to be useful. You also need to port the
>>> event filtering bytecode interpreter to Go.
>>> Then, we need to find a way to consume that ring-buffer's content from a
>>> form of consumer daemon within lttng-tools.
>>>
>>> 3) Add an lttng-ust API to allow dynamic event declaration
>>>
>>> This is something we have been considering for a while.
>>>
>>> Basically, we would like to introduce an API that allows applications to
>>> dynamically declare tracepoints.
>>> Then, those events would be serialized from Go, but the ring-buffer logic
>>> would remain in C.
>>>
>>> On each event, we would:
>>>   - Obtain a memory area from lttng-ust (reserve phase, C code called
>>>     from Go)
>>>   - Write the event's content to that area (from Go code)
>>>   - Commit the event (C code called from Go)
>>>
>>> With this, you don't have to manually declare tracepoints and integrate
>>> them into a build system to generate providers; the Go application just
>>> needs to link to lttng-ust at runtime.
>>> It's not a perfect solution, but it seems like an interesting compromise.
>>>
>>>
>>> What do you think?
>>>
>>> Jérémie
>>>
>>>
>>>
>>>
>>>> Did I provide more context?
>>>>
>>>> Cheers,
>>>> Loïc.
>>>>
>>>> Jérémie Galarneau <jeremie.galarneau at efficios.com> wrote:
>>>>
>>>> On 4 May 2018 at 06:03, Loïc Gelle <loic.gelle at polymtl.ca> wrote:
>>>>
>>>>>
>>>>> Hi,
>>>>>
>>>>>>
>>>>>> There has been a previous discussion on the mailing list about porting
>>>>>> LTTng to Golang, about a year ago:
>>>>>> https://lists.lttng.org/pipermail/lttng-dev/2017-June/027203.html .
>>>>>> This new topic is to discuss implementation possibilities more
>>>>>> precisely.
>>>>>>
>>>>>> Currently, one has to use the C UST agent from LTTng in order to
>>>>>> instrument Golang programs, and to compile the whole thing using custom
>>>>>> Makefiles and cgo. Here is a recent example that I wrote:
>>>>>> https://github.com/loicgelle/jaeger-go-lttng-instr
>>>>>>
>>>>>> As you can guess, there are a lot of drawbacks in that approach. It is
>>>>>> actually a hack and cannot be integrated into more complex Golang
>>>>>> programs that use a more complex build process (e.g. the Golang runtime
>>>>>> itself), because of the compiler instructions that you have to include
>>>>>> at the top of the Golang files. There is also a big concern about the
>>>>>> performance of this solution, as calling a C function from Go requires
>>>>>> a full context switch on the stack, because the calling conventions in
>>>>>> C and Golang are different.
>>>>>>
>>>>>>
>>>>>> I think a more integrated and performant solution is needed. We can’t
>>>>>> really ignore a language such as Golang that is now widely adopted for
>>>>>> cloud applications. LTTng is really the best solution out there in
>>>>>> terms of overhead per tracepoint, and could benefit from being made
>>>>>> available to such a large community. My question to the experts on this
>>>>>> mailing list: how much would it take to write a Golang agent for LTTng?
>>>>>>
>>>>>>
>>>>>>
>>>>> Hi Loïc,
>>>>>
>>>>> Without having performed any measurements myself, it does seem like
>>>>> calling C from Go is very expensive. In that context, I can see that
>>>>> LTTng would probably lose its performance advantage over any native Go
>>>>> solution. However, it wouldn't hurt to measure the impact and see if it
>>>>> really is a deal breaker.
>>>>>
>>>>> We faced the same dilemma when implementing the Java and Python support
>>>>> in lttng-ust. In those cases, we ended up calling C code, with the
>>>>> performance penalties it implies. The correlation with other
>>>>> applications' and the kernel's events, along with the rest of LTTng's
>>>>> features, provided enough value to make that solution worthwhile.
>>>>>
>>>>> There aren't a ton of solutions if we can't call existing C code. We
>>>>> basically have to reimplement a ring-buffer and the setup/communication
>>>>> infrastructure to interact with the lttng-sessiond. The communication
>>>>> with the session daemon is not a big concern as the protocol is fairly
>>>>> straightforward.
>>>>>
>>>>> The "hairy" part is that lttng-ust and lttng-consumerd use a shared
>>>>> memory map to produce and consume the tracing buffers. This means that
>>>>> all changes to that memory layout would need to be replicated in the Go
>>>>> tracer, making future evolution more difficult. Also, I don't know how
>>>>> easy it would be to synchronize C and Go applications interacting in a
>>>>> shared memory map given those languages have different memory models.
>>>>> My knowledge of Go doesn't go that far.
>>>>>
>>>>> A more viable solution could be to introduce a Go-native consumer daemon
>>>>> implementing its own synchronization with Go applications. This way,
>>>>> that implementation could evolve on its own and could also start with a
>>>>> simpler ring buffer than lttng-ust's.
>>>>>
>>>>> Still, it is not a small undertaking and it basically means maintaining
>>>>> a third tracer implementation.
>>>>>
>>>>>
>>>>> What do you think?
>>>>>
>>>>> Thanks!
>>>>> Jérémie
>>>>>
>>>>>
>>>>> Cheers,
>>>>>
>>>>>> Loïc.
>>>>>>
>>>>>> _______________________________________________
>>>>>> lttng-dev mailing list
>>>>>> lttng-dev at lists.lttng.org
>>>>>> https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>> --
>>>>> Jérémie Galarneau
>>>>> EfficiOS Inc.
>>>>> http://www.efficios.com