[lttng-dev] RCU API usage from call_rcu callbacks?
Paul E. McKenney
paulmck at kernel.org
Wed Mar 22 10:45:28 EDT 2023
On Wed, Mar 22, 2023 at 09:57:25AM -0400, Mathieu Desnoyers wrote:
> On 2023-03-22 07:08, Ondřej Surý via lttng-dev wrote:
> > Hi,
> > the documentation is pretty silent on this, and asking here is probably going to be faster
> > than me trying to use the source to figure this out.
> > Is it legal to call_rcu() from within the call_rcu() callback?
> Yes. call_rcu callbacks can be chained.
> Note that on program exit you'll need to call rcu_barrier() as many times as you chained call_rcu callbacks if you want to be sure that no queued callbacks remain at clean shutdown. See this comment above urcu_call_rcu_exit():
> * Teardown the default call_rcu worker thread if there are no queued
> * callbacks on process exit. This prevents leaking memory.
> * Here is how an application can ensure graceful teardown of this
> * worker thread:
> * - An application queuing call_rcu callbacks should invoke
> * rcu_barrier() before it exits.
> * - When chaining call_rcu callbacks, the number of calls to
> * rcu_barrier() on application exit must match at least the maximum
> * number of chained callbacks.
> * - If an application chains callbacks endlessly, it would have to be
> * modified to stop chaining callbacks when it detects an application
> * exit (e.g. with a flag), and wait for quiescence with rcu_barrier()
> * after setting that flag.
This trick can also be used to gracefully shut down in the presence
of bounded chaining using but one rcu_barrier() call.
> * - The statements above apply to a library which queues call_rcu
> * callbacks, only it needs to invoke rcu_barrier in its library
> * destructor.
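To make the counting rule concrete, here is a toy, single-threaded model of a callback queue. It is not liburcu: toy_call_rcu(), toy_barrier(), stage_cb() and the "exiting" flag are all hypothetical names, and toy_barrier() only mimics the one relevant property of rcu_barrier(), namely that it waits for the callbacks queued before it was called and not for the ones those callbacks chain. Treat it as a sketch under those assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a call_rcu queue. NOT liburcu; single-threaded. */
struct toy_head {
	struct toy_head *next;
	void (*func)(struct toy_head *head);
};

static struct toy_head *q_first;
static struct toy_head **q_tail = &q_first;
static bool exiting;	/* hypothetical "application is exiting" flag */

static void toy_call_rcu(struct toy_head *head,
			 void (*func)(struct toy_head *head))
{
	head->next = NULL;
	head->func = func;
	*q_tail = head;
	q_tail = &head->next;
}

static size_t toy_pending(void)
{
	size_t n = 0;

	for (struct toy_head *h = q_first; h != NULL; h = h->next)
		n++;
	return n;
}

/* Models one rcu_barrier(): runs only the callbacks queued before the call. */
static void toy_barrier(void)
{
	size_t n = toy_pending();

	while (n-- > 0) {
		struct toy_head *h = q_first;

		assert(h != NULL);
		q_first = h->next;
		if (q_first == NULL)
			q_tail = &q_first;
		h->func(h);	/* may chain; the new entry waits for the next barrier */
	}
}

struct node {
	struct toy_head head;	/* first member, so the cast below is valid */
	int hops_left;		/* how many more times this object re-queues itself */
};

static void stage_cb(struct toy_head *head)
{
	struct node *n = (struct node *)head;

	if (n->hops_left > 0 && !exiting) {
		n->hops_left--;
		toy_call_rcu(head, stage_cb);	/* chain another callback */
		return;
	}
	/* Final stage, or exit detected: the object would be released here. */
}

/* Counts the toy_barrier() calls needed to drain one chain. */
static int barriers_needed(int hops, bool set_exit_flag)
{
	struct node n = { .hops_left = hops };
	int barriers = 0;

	exiting = false;
	toy_call_rcu(&n.head, stage_cb);
	exiting = set_exit_flag;	/* set the flag before the first barrier */
	while (toy_pending() > 0) {
		toy_barrier();
		barriers++;
	}
	return barriers;
}
```

In this model barriers_needed(2, false) takes three barrier calls (the initial callback plus two chained ones), while barriers_needed(2, true) takes one, because the callback sees the flag and finishes instead of chaining.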
> > What about the other RCU (and CDS) API calls?
> They can be called from a call_rcu callback unless stated otherwise. For instance, rcu_barrier() cannot be called from a call_rcu worker thread.
> > How does that interact with create_call_rcu_data()? I have <n> event loops and I am
> > initializing <n> 1:1 call_rcu helper threads as I need to do some per-thread initialization
> > as some of the destroy-like functions use random numbers (don't ask).
> As I recall, set_thread_call_rcu_data() will associate a call_rcu worker instance with the current thread. So all subsequent call_rcu() invocations from that thread will be queued into this per-thread call_rcu queue, and handled by the associated call_rcu worker thread.
> But I wonder why you inherently need this 1:1 mapping, rather than using the content of the structure containing the rcu_head to figure out which per-thread data should be used?
> If you manage to separate the context from the worker thread instances, then you could use per-cpu call_rcu worker threads, which will eventually scale even better when I integrate the liburcu call_rcu API with sys_rseq concurrency IDs.
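One way to read the suggestion above, as a rough sketch only: store a pointer to the per-event-loop context in the object that embeds the rcu_head, and have the callback recover it from the head pointer. Everything named here (struct loop_ctx, struct item, the owner field, ctx_random(), item_destroy_cb()) is hypothetical. struct rcu_head_standin and container_of_sketch() stand in for liburcu's struct rcu_head and caa_container_of() so that the fragment compiles without the library.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for liburcu's struct rcu_head, so this compiles on its own. */
struct rcu_head_standin {
	void *opaque;
};

/* Stand-in for caa_container_of(): member pointer to enclosing object. */
#define container_of_sketch(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical per-event-loop context, e.g. holding RNG state. */
struct loop_ctx {
	unsigned int rng_state;
	int destroyed;		/* counts objects torn down with this context */
};

/* Hypothetical RCU-protected object that remembers which loop owns it. */
struct item {
	int value;
	struct loop_ctx *owner;
	struct rcu_head_standin rcu;
};

/* Simple LCG step, only to show the per-context state being used. */
static unsigned int ctx_random(struct loop_ctx *ctx)
{
	ctx->rng_state = ctx->rng_state * 1103515245u + 12345u;
	return ctx->rng_state;
}

/*
 * Callback in the shape call_rcu expects: it receives only the head,
 * and finds the context through the enclosing object rather than
 * through the identity of the worker thread that happens to run it.
 */
static void item_destroy_cb(struct rcu_head_standin *head)
{
	struct item *it = container_of_sketch(head, struct item, rcu);
	struct loop_ctx *ctx = it->owner;

	assert(ctx != NULL);
	(void)ctx_random(ctx);	/* the "destroy needs random numbers" part */
	ctx->destroyed++;
	/* free(it) would go here if the item were heap-allocated. */
}
```

With this layout any worker thread can run the callback. The catch is that the context may then be touched by a thread other than its event loop, so in real code it would need its own synchronization; the sketch leaves that out.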
> > If it's legal to call_rcu() from call_rcu thread, which thread is going to be used?
> The call_rcu invoked from the call_rcu worker thread will queue the call_rcu callback onto the queue handled by that worker thread. It does so by setting
> URCU_TLS(thread_call_rcu_data) = crdp;
> early in call_rcu_thread(). So any chained call_rcu is handled by the same call_rcu worker thread doing the chaining, with the exception of teardown where the pending callbacks are moved to the default worker thread.
>  https://firstname.lastname@example.org/
> > Thank you,
> > Ondrej
> > --
> > Ondřej Surý (He/Him)
> > ondrej at sury.org
> Mathieu Desnoyers
> EfficiOS Inc.