librseq memory allocator portability

Mon Feb 3 09:55:57 EST 2025

On 2025-02-03 15:52, Mathieu Desnoyers via lttng-dev wrote:
> On 2025-02-02 16:48, Ondřej Surý via lttng-dev wrote:
>> Hey Mathieu,
>>
>> I’m actually thinking about using librseq for some more lightweight 
>> stuff first - like statistic counters (especially those related to 
>> memory allocations), and my first obvious question would be - how 
>> portable is the librseq memory allocator?
>>
>> I am not keen on having code full of ifdef spaghetti to support BSDs, 
>> so I think that it probably should provide some per-thread memory 
>> pools at the expense of just consuming more memory on platforms 
>> without the rseq system call.
> 
> Hi Ondrej,
> 
> I have good news: the librseq mempool allocator is completely
> independent of the rseq system call. It's just a memory
> allocator.
> 
> AFAIR, the only optional dependency which is Linux-specific
> is on memfd_create for the RSEQ_MEMPOOL_POPULATE_COW_INIT
> populate policy. The RSEQ_MEMPOOL_POPULATE_COW_ZERO policy
> can be used as a fallback on other platforms. It uses
> more memory on systems where only a few of the possible
> cores are used and allocated memory is initialized to
> non-zero values.
> 
> Per-cpu indexing can fall back to sched_getcpu(3) if
> rseq is not available, which should be common enough.
> This is actually what the librseq rseq_current_cpu()
> does as a fallback. So you could just use that static
> inline helper.
> 
> Now in terms of strategy for using rseq critical sections
> for a split-counter use-case, let's see our options:
> 
> - You'll want to use rseq_load_add_store__ptr() on your
>    fast path. It will return a negative error in case of
>    abort, or if the environment does not support rseq:
> 
>    A) either your Linux kernel does not have CONFIG_RSEQ or
>       is too old,
>    B) or your GNU libc does not have rseq support, and you
>       did not explicitly call the librseq thread registration.
>    C) or your environment does not support rseq at all
>       (e.g. BSD).
> 
> In the caller code, when handling rseq_load_add_store__ptr()
> errors, I would recommend a fallback to an atomic counter
> increment of a _second_ counter, so if the fallback is used
> in a situation caused by an rseq abort (e.g. a debugger single
> stepping within the rseq critical section) it works flawlessly.
> 
> Then you'll want to sum up the per-cpu counters for the rseq
> fast-path and for the atomic counter slow-path whenever you
> read the counter value.
> 
> We may need to tweak the librseq code a little bit to handle
> case (C) for BSDs, as this has not been a target so far, so
> we may need to add a few #ifdefs to cover that case seamlessly.
> 
> We have an internal project progressing at EfficiOS which aims
> to implement the equivalent of the Linux kernel static keys for
> userspace on various architectures. The goal is to dynamically
> change the code behavior between a no-op and a jump, thus selecting
> between rseq or atomics from a library or program constructor
> based on rseq availability.
> 
> Please let me know if anything is unclear.
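
To make the split-counter strategy above more concrete, here is a
minimal sketch in plain C11. It is not librseq code: fast_add() is a
hypothetical stand-in for the rseq_load_add_store__ptr() fast path
(it only mimics the "negative value on failure" behavior and is not
thread-safe as written), and struct split_counter, NR_SLOTS,
fast_path_usable and the split_counter_*() helpers are names made up
for illustration. It also assumes a single shared atomic for the
slow path; per-cpu atomics would work as well.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define NR_SLOTS 4	/* hypothetical; real code sizes this from the possible CPUs */

struct split_counter {
	intptr_t fast[NR_SLOTS];	/* updated only by the fast path */
	atomic_intptr_t slow;		/* second counter, updated only by the fallback */
};

/* Hypothetical switch so the sketch can exercise both paths. */
static int fast_path_usable = 1;

/*
 * Stand-in for the rseq fast path: returns 0 on success and a negative
 * value on failure (abort, or rseq not available). A real fast path
 * would perform the add inside an rseq critical section.
 */
static int fast_add(intptr_t *v, intptr_t count)
{
	if (!fast_path_usable)
		return -1;
	*v += count;
	return 0;
}

/* Try the fast path first; on any error, add to the second counter. */
static void split_counter_add(struct split_counter *c, int cpu, intptr_t count)
{
	assert(cpu >= 0 && cpu < NR_SLOTS);
	if (fast_add(&c->fast[cpu], count) < 0)
		atomic_fetch_add_explicit(&c->slow, count, memory_order_relaxed);
}

/* Reading sums every fast-path slot plus the slow-path counter. */
static intptr_t split_counter_read(struct split_counter *c)
{
	intptr_t sum = atomic_load_explicit(&c->slow, memory_order_relaxed);

	for (int i = 0; i < NR_SLOTS; i++)
		sum += c->fast[i];
	return sum;
}
```

A caller would do split_counter_add(&c, cpu, 1) on its hot path, with
cpu taken from something like rseq_current_cpu(), and call
split_counter_read(&c) whenever it needs the total. Because the
fallback never touches the fast-path slots, an aborted fast path
cannot corrupt them.
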

One more thing: we may want to go ahead with the implementation of
fork handling with pthread_atfork which is still on the librseq
TODO list so we can eliminate our dependency on madvise MADV_DONTFORK
and MADV_WIPEONFORK which are Linux-specific.
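
For reference, the general shape of pthread_atfork-based fork handling
looks roughly like the sketch below. This is only an illustration of
the POSIX mechanism, not the planned librseq implementation: struct
demo_pool and the pool_*() functions are hypothetical, and wiping the
data in the child is just one way to approximate what MADV_WIPEONFORK
gives us today.

```c
#include <pthread.h>
#include <string.h>

/* Hypothetical pool: a lock plus some state that must not leak into children. */
struct demo_pool {
	pthread_mutex_t lock;
	char data[64];
};

static struct demo_pool pool = { .lock = PTHREAD_MUTEX_INITIALIZER };

/* Runs in the parent before fork(): quiesce the pool by taking its lock. */
static void pool_fork_prepare(void)
{
	pthread_mutex_lock(&pool.lock);
}

/* Runs in the parent after fork(): resume normal operation. */
static void pool_fork_parent(void)
{
	pthread_mutex_unlock(&pool.lock);
}

/* Runs in the child after fork(): wipe inherited state, then release the lock. */
static void pool_fork_child(void)
{
	memset(pool.data, 0, sizeof(pool.data));
	pthread_mutex_unlock(&pool.lock);
}

/* Register the three handlers once, e.g. from a library constructor. */
static int pool_install_fork_handlers(void)
{
	return pthread_atfork(pool_fork_prepare, pool_fork_parent,
			      pool_fork_child);
}
```

pthread_atfork() returns 0 on success, and the handlers then run
automatically around every fork() in the process, so no
Linux-specific madvise flags are needed for this part.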

Thanks,

Mathieu

> 
> Thanks!
> 
> Mathieu
> 

-- 
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com