[lttng-dev] [PATCH lttng-tools] Fix: consumer-stream: use-after-free of metadata bucket
Vincent Whitchurch
vincent.whitchurch at axis.com
Wed Mar 2 04:27:30 EST 2022
On Tue, Mar 01, 2022 at 06:19:23PM +0100, Jérémie Galarneau wrote:
> Thanks a lot for reporting the problem. If I understand the ASAN
> report correctly, the stream itself will also be double free'd, so
> I don't think this is the complete fix.
Yeah, it looked odd that consumer_stream_destroy() is called recursively
on the same stream but AFAICS the code's been like this for a while so I
assumed it was on purpose, and only the metadata bucket stuff was
relatively new. ASAN doesn't detect any double frees of the stream
itself, but I guess calling call_rcu(..., free_stream_rcu) twice on the
same stream is not expected behaviour and could lead to other problems.
> There definitely seems to be a problem with regards to the ownership
> of the metadata channel vs stream. Let me look into it.
Great, thank you!
> I see that you fall into a case where the metadata setup fails,
> can you share more info about how this can be reproduced?
In the core dump I received (on v2.12.4), consumer_stream_destroy() was
called from the error label in setup_metadata and ret was set to
LTTCOMM_CONSUMERD_ERROR_METADATA. So consumer_send_relayd_stream() had
returned an error. I only had the core dump and no other logs, so I did
not know which of the paths inside consumer_send_relayd_stream() had
failed, but since I was primarily interested in fixing the crash itself
I simply forced this code path to be taken:
diff --git a/src/common/ust-consumer/ust-consumer.c b/src/common/ust-consumer/ust-consumer.c
index fa1c71299..97ed59632 100644
--- a/src/common/ust-consumer/ust-consumer.c
+++ b/src/common/ust-consumer/ust-consumer.c
@@ -908,8 +908,7 @@ static int setup_metadata(struct lttng_consumer_local_data *ctx, uint64_t key)
/* Send metadata stream to relayd if needed. */
if (metadata->metadata_stream->net_seq_idx != (uint64_t) -1ULL) {
- ret = consumer_send_relayd_stream(metadata->metadata_stream,
- metadata->pathname);
+ ret = -1;
if (ret < 0) {
ret = LTTCOMM_CONSUMERD_ERROR_METADATA;
goto error;
With the above patch, I could easily reproduce the use-after-free using
the following steps on the latest release v2.13.4, and it was clear that
this use-after-free was the cause of the original core dump on the older
release too.
Build with ASAN:
lttng-tools$ LDFLAGS=-fsanitize=address CFLAGS=-fsanitize=address ./configure
Shell #1:
lttng-ust$ tests/compile/api0/hello/hello 10000
Shell #2:
lttng-tools$ ASAN_OPTIONS=detect_odr_violation=0 ./src/bin/lttng-sessiond/lttng-sessiond
Shell #3:
lttng-tools$ export ASAN_OPTIONS=detect_odr_violation=0
lttng-tools$ ./src/bin/lttng/lttng create --live && ./src/bin/lttng/lttng enable-event --userspace 1 && ./src/bin/lttng/lttng start && sleep 1 && ./src/bin/lttng/lttng stop
The ASAN splat should be seen in shell #2. Note that you may have to
run the command in shell #3 a couple of times since
LTTNG_CONSUMER_SETUP_METADATA doesn't seem to be sent every time.
More information about the lttng-dev
mailing list