[lttng-dev] Deadlock in call_rcu_thread when destroy rculfhash node with nested rculfhash
Mathieu Desnoyers
mathieu.desnoyers at efficios.com
Wed Jun 7 22:05:53 UTC 2017
----- On Jun 1, 2017, at 9:01 AM, Mathieu Desnoyers <mathieu.desnoyers at efficios.com> wrote:
> ----- On Oct 21, 2016, at 4:19 AM, Evgeniy Ivanov <i at eivanov.com> wrote:
>> On Wed, Oct 19, 2016 at 6:03 PM, Mathieu Desnoyers
>> <mathieu.desnoyers at efficios.com> wrote:
>>> This is because we use call_rcu internally to trigger the hash table
>>> resize.
>>> In cds_lfht_destroy, we start by waiting for in-flight resizes to complete.
>>> Unfortunately, this requires the call_rcu worker thread to make progress. If
>>> cds_lfht_destroy is called from the call_rcu worker thread itself, it will
>>> wait forever.
>>> One alternative would be to implement our own worker thread scheme
>>> for the RCU HT resize rather than using the call_rcu worker thread. This
>>> would simplify the requirements of cds_lfht_destroy a lot.
>>> Ideally I'd like to re-use the whole call_rcu work dispatch/worker handling
>>> scheme, just as a separate work queue.
>>> Thoughts?
>> Thank you for explaining. Sounds like a plan: in our prod environment there is
>> no issue with having an extra thread for table resizes, and nested tables are
>> an important feature.
> I finally managed to find some time to implement a solution; feedback
> would be welcome!
> Here are the RFC patches:
> https://lists.lttng.org/pipermail/lttng-dev/2017-May/027183.html
> https://lists.lttng.org/pipermail/lttng-dev/2017-May/027184.html
Just merged commits derived from those patches into liburcu master branch.
Thanks,
Mathieu
> Thanks,
> Mathieu
>>> Thanks,
>>> Mathieu
>>> ----- On Oct 19, 2016, at 6:03 AM, Evgeniy Ivanov <i at eivanov.com> wrote:
>>>> Sorry, found a partial answer in the docs, which state that cds_lfht_destroy
>>>> should not be called from a call_rcu thread context. Why does this limitation
>>>> exist?
>>>> On Wed, Oct 19, 2016 at 12:56 PM, Evgeniy Ivanov <i at eivanov.com> wrote:
>>>>> Hi,
>>>>> Each node of a top-level rculfhash has a nested rculfhash. Some thread clears
>>>>> the top-level map and then uses rcu_barrier() to wait until everything is
>>>>> destroyed (this is done to check for leaks). Recently it started to deadlock
>>>>> sometimes, with the following stacks:
>>>>> Thread1:
>>>>> __poll
>>>>> cds_lfht_destroy <---- nested map
>>>>> ...
>>>>> free_Node(rcu_head*) <----- node of top level map
>>>>> call_rcu_thread
>>>>> Thread2:
>>>>> syscall
>>>>> rcu_barrier_qsbr
>>>>> destroy_all
>>>>> main
>>>>> Did the call_rcu thread deadlock with the barrier thread? Or is it some kind
>>>>> of internal deadlock because of the nested maps?
>>>>> --
>>>>> Cheers,
>>>>> Evgeniy
>>>> --
>>>> Cheers,
>>>> Evgeniy
>>>> _______________________________________________
>>>> lttng-dev mailing list
>>>> lttng-dev at lists.lttng.org
>>>> https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev
>>> --
>>> Mathieu Desnoyers
>>> EfficiOS Inc.
>>> http://www.efficios.com
>> --
>> Cheers,
>> Evgeniy
> --
> Mathieu Desnoyers
> EfficiOS Inc.
> http://www.efficios.com
--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com