[lttng-dev] High memory consumption issue on RCU side

Evgeniy Ivanov lolkaantimat at gmail.com
Sun Sep 25 15:06:34 UTC 2016


Hi Mathieu,

On Sun, Sep 25, 2016 at 4:10 PM, Mathieu Desnoyers
<mathieu.desnoyers at efficios.com> wrote:
> Hi,
>
> Did you enable the CDS_LFHT_ACCOUNTING flag for your hash tables at
> creation, or only CDS_LFHT_AUTO_RESIZE ?

Only CDS_LFHT_AUTO_RESIZE. With CDS_LFHT_ACCOUNTING added, the memory
situation seems to become much better (the same effect as setting a
limit on the maximum number of buckets).
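
For reference, this is roughly how the tables are created now (a minimal
sketch; the sizes shown here are illustrative, not our real values):

    #include <urcu-qsbr.h>          /* QSBR flavor, as used here */
    #include <urcu/rculfhash.h>

    struct cds_lfht *ht;

    ht = cds_lfht_new(1024,         /* init_size (power of two) */
                      1024,         /* min_nr_alloc_buckets */
                      0,            /* max_nr_buckets, 0 = unlimited */
                      CDS_LFHT_AUTO_RESIZE | CDS_LFHT_ACCOUNTING,
                      NULL);        /* attributes for resize worker threads */
    if (!ht)
            abort();                /* allocation failure */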

> With only CDS_LFHT_AUTO_RESIZE, the algorithm used in check_resize()
> is to check whether the current chain is longer than CHAIN_LEN_RESIZE_THRESHOLD
> (which is currently 3). It effectively detects bucket collisions and
> resizes the hash table accordingly.
>
> If you have both CDS_LFHT_AUTO_RESIZE | CDS_LFHT_ACCOUNTING flags set,
> then it goes as follows: for a small table, with a size below
> (1UL << (COUNT_COMMIT_ORDER + split_count_order)), we use the
> bucket-chain-length algorithm. This is because the accounting uses
> split-counters and amortizes the cost of committing to the global
> counter, so it is not precise enough for small tables.
> Once we are beyond that threshold, we use the overall number of
> nodes in the hash table to calculate how we should resize it.
>
> The "resize_target" field of struct cds_lfht (in rculfhash-internal.h)
> is a good way to see the number of buckets that were requested at the
> last resize. This is not exposed in the public API though. You can
> also try compiling rculfhash with -DDEBUG, which will enable debugging
> printouts that tell you how the tables are resized. You can deduce the
> number of buckets from that information.
>
> So if you expect to have many collisions in your hash table,
> I recommend you activate the CDS_LFHT_ACCOUNTING flag.
>
> Hoping this clarifies things,

Thank you very much for explaining and for your help!


> Thanks,
>
> Mathieu
>
> ----- On Sep 24, 2016, at 2:40 PM, Evgeniy Ivanov lolkaantimat at gmail.com wrote:
>
>> All hash tables are created with 1024 initial buckets (no limit on the
>> max number of buckets). Only three tables can contain at most about
>> 5 000 000 nodes; the rest (I think about 5000 tables) contain at most
>> 1000-5000 nodes each. The big tables have a UUID key and CityHash, the
>> small tables have a complicated binary key with SuperFastHash. The
>> binary keys are the same between executions, but the UUIDs are generated
>> on the fly, and if there are collisions that might explain why the
>> memory footprint varies so much.
>>
>> I've set both min and max bucket limits, and now RSS looks constant
>> between executions. Thank you very much for pointing to this! Do I
>> understand correctly that, besides the load factor, rculfhash also
>> resizes depending on the maximum number of nodes in any bucket? Is
>> there any way to get the number of buckets allocated by a table (sorry
>> if I missed it when looking through the API)? This would help to
>> further troubleshoot the issue.
>>
>>
>>
>> On Sat, Sep 24, 2016 at 6:34 PM, Mathieu Desnoyers
>> <mathieu.desnoyers at efficios.com> wrote:
>>> ----- On Sep 24, 2016, at 11:22 AM, Paul E. McKenney paulmck at linux.vnet.ibm.com
>>> wrote:
>>>
>>>> On Sat, Sep 24, 2016 at 10:42:24AM +0300, Evgeniy Ivanov wrote:
>>>>> Hi Mathieu,
>>>>>
>>>>> On Sat, Sep 24, 2016 at 12:59 AM, Mathieu Desnoyers
>>>>> <mathieu.desnoyers at efficios.com> wrote:
>>>>> > ----- On Sep 22, 2016, at 3:14 PM, Evgeniy Ivanov lolkaantimat at gmail.com wrote:
>>>>> >
>>>>> >> Hi all,
>>>>> >>
>>>>> >> I'm investigating high memory usage of my program: RSS varies between
>>>>> >> executions in the range of 20-50 GB, though it should be deterministic.
>>>>> >> I've found that all the memory is allocated in this stack:
>>>>> >>
>>>>> >> Allocated 17673781248 bytes in 556 allocations
>>>>> >>        cds_lfht_alloc_bucket_table3     from liburcu-cds.so.2.0.0
>>>>> >>        _do_cds_lfht_resize      from liburcu-cds.so.2.0.0
>>>>> >>        do_resize_cb             from liburcu-cds.so.2.0.0
>>>>> >>        call_rcu_thread          from liburcu-qsbr.so.2.0.0
>>>>> >>        start_thread             from libpthread-2.12.so
>>>>> >>        clone                    from libc-2.12.so
>>>>> >>
>>>>> >> According to pstack, it should be in a quiescent state. The call_rcu thread waits on a syscall:
>>>>> >> syscall
>>>>> >> call_rcu_thread
>>>>> >> start_thread
>>>>> >> clone
>>>>> >>
>>>>> >> We use urcu-0.8.7, only rculfhash (QSBR). Is it some kind of leak in
>>>>> >> RCU, or is there any chance I'm misusing it? What would you recommend
>>>>> >> to troubleshoot the situation?
>>>>> >
>>>>> > urcu-qsbr is the fastest flavor of urcu, but it is rather tricky to use well.
>>>>> > Make sure that:
>>>>> >
>>>>> > - Each registered thread periodically reaches a quiescent state, by:
>>>>> >   - Invoking rcu_quiescent_state() periodically, and
>>>>> >   - Making sure to surround any blocking for a relatively large amount
>>>>> >     of time with rcu_thread_offline()/rcu_thread_online().
>>>>> >
>>>>> > In urcu-qsbr, the "default" state of threads is to be within an RCU
>>>>> > read-side critical section. Therefore, if you omit either of the two
>>>>> > pieces of advice above, you end up in a situation where grace periods
>>>>> > never complete, and therefore no call_rcu() callbacks can be processed.
>>>>> > This effectively acts like a big memory leak.
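>>>>> >
>>>>> > As a minimal sketch of that pattern (the worker body and the stop
>>>>> > condition are placeholders, not actual code from your program):
>>>>> >
>>>>> >     #include <urcu-qsbr.h>
>>>>> >
>>>>> >     static void *worker(void *arg)
>>>>> >     {
>>>>> >             rcu_register_thread();
>>>>> >             while (!stop_requested) {
>>>>> >                     do_work();              /* may read RCU-protected data */
>>>>> >                     rcu_quiescent_state();  /* report a quiescent state */
>>>>> >
>>>>> >                     rcu_thread_offline();   /* about to block for a while */
>>>>> >                     wait_for_more_work();   /* e.g. poll() or a cond wait */
>>>>> >                     rcu_thread_online();
>>>>> >             }
>>>>> >             rcu_unregister_thread();
>>>>> >             return NULL;
>>>>> >     }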
>>>>>
>>>>> That was my original assumption, but in the memory stacks I don't see
>>>>> such allocations for my data. Instead, huge allocations happen right in
>>>>> call_rcu_thread. The memory footprint of my app is about 20 GB, and
>>>>> erasing RCU data is a rare operation, so almost 20 GB in the RCU thread
>>>>> looks suspicious. I'll try not to erase any RCU-protected data and
>>>>> reproduce the issue (the complicated thing is that under the memory
>>>>> tracer it happens less often).
>>>>
>>>> Interesting.  Trying to figure out why your call_rcu_thread() would
>>>> ever allocate memory.
>>>>
>>>> Ah!  Do your RCU callbacks allocate memory?
>>>
>>> In this case, yes: rculfhash allocates memory within the call_rcu worker
>>> thread when a hash table resize is performed.
>>>
>>> Thanks,
>>>
>>> Mathieu
>>>
>>>>
>>>>                                                       Thanx, Paul
>>>>
>>>>> > Hoping this helps,
>>>>> >
>>>>> > Thanks,
>>>>> >
>>>>> > Mathieu
>>>>> >
>>>>> >
>>>>> > --
>>>>> > Mathieu Desnoyers
>>>>> > EfficiOS Inc.
>>>>> > http://www.efficios.com
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Cheers,
>>>>> Evgeniy
>>>
>>> --
>>> Mathieu Desnoyers
>>> EfficiOS Inc.
>>> http://www.efficios.com
>>
>>
>>
>> --
>> Cheers,
>> Evgeniy
>
> --
> Mathieu Desnoyers
> EfficiOS Inc.
> http://www.efficios.com



-- 
Cheers,
Evgeniy

