[lttng-dev] [liburcu] rculfhash autoresizing creates huge tables

Denis Yevgenyevich denis.yevgenyevich at gmail.com
Wed Mar 28 06:13:50 EDT 2018


> Actually, this is not the average chain length: the resize happens when the first
> chain going beyond the threshold is encountered. I would expect the average chain
> length to be somewhere below that threshold. It would be interesting if you could
> provide insight into the distribution of your hash table bucket chain lengths by
> adding instrumentation code.
Please have a look at the attached files. I printed out histograms
(<chain length: frequency>) right before and after each _do_cds_lfht_resize().
The average chain length, computed as items / buckets, exceeded 8. Each
millisecond I was adding a unique item to the table. The test was run twice
with different hash functions: siphash (64-bit) and lookup3 (32-bit).

Chains that long are ok, I think, until tables get huge and most node accesses
during a lookup become cache misses.

> > Perhaps you could make CHAIN_LEN_RESIZE_THRESHOLD a tunable?
> That would be a possibility, yes. Would you like to propose a patch ?
Perhaps later, too busy.

On 27 March 2018 at 19:17, Mathieu Desnoyers <mathieu.desnoyers at efficios.com> wrote:

> (Please keep the lttng-dev mailing list in CC.)
>
> ----- On Mar 27, 2018, at 12:07 PM, Denis Yevgenyevich <denis.yevgenyevich at gmail.com> wrote:
>
> >> When both flags are set, check_resize() is only for the bucket-local chain
> >> length algorithm that deals with very small tables. Then, when the table
> >> size goes above that threshold, split-counters are used to adjust the
> >> required table size. We don't use the split-counters approximation of the
> >> number of elements in the table for small tables because they are not
> >> accurate enough for small tables. It's a tradeoff between accuracy and
> >> accounting overhead here.
> > Thanks for the explanation.
>
> >> Notice that ht_count_add() and ht_count_del() both invoke
> >> cds_lfht_resize_lazy_count(). It's that function which is responsible for
> >> the split-counter based resize algorithm.
> >> If both (CDS_LFHT_ACCOUNTING | CDS_LFHT_AUTO_RESIZE) flags are set, the
> >> table will still grow after it has gone beyond this condition.
> > That's what I overlooked. I've just repeated my test, letting it run for a
> > longer time. After 4096 items (4-core cpu) check_resize() stopped growing
> > the table.
>
> 4096 buckets is indeed the threshold for a 4-core cpu system.
>
> > After 32768 items ht_count_add() finally grew the table to 32768. That is
> > when the check
> > if ((count >> CHAIN_LEN_RESIZE_THRESHOLD) < size)
> > went false. The average chain length might have been 8 by that time. Isn't
> > that too much?
>
> Actually, this is not the average chain length: the resize happens when the first
> chain going beyond the threshold is encountered. I would expect the average chain
> length to be somewhere below that threshold. It would be interesting if you could
> provide insight into the distribution of your hash table bucket chain lengths by
> adding instrumentation code.
>
> > Perhaps you could make CHAIN_LEN_RESIZE_THRESHOLD a tunable?
>
> That would be a possibility, yes. Would you like to propose a patch?
>
> > Anyway, I am glad now I understand how this stuff works. Thank you again!
>
> You're welcome!
>
> Thanks,
>
> Mathieu
>
> > On 27 March 2018 at 16:02, Mathieu Desnoyers <mathieu.desnoyers at efficios.com> wrote:
>
> >> ----- On Mar 27, 2018, at 5:55 AM, Denis Yevgenyevich <denis.yevgenyevich at gmail.com> wrote:
>
> >>>> The "accounting" option is meant to be enabled, otherwise you're using
> >>>> the hash table only with the bucket-length resize threshold, which has
> >>>> the imperfections you noticed.
> >>> Well, given that my software has to maintain precise counts of items
> >>> inserted into some of the hash tables, I'd like to be able to avoid any
> >>> performance overhead due to accounting.
>
> >>>> only for small tables. Beyond a certain size, the number of elements in
> >>>> the hash table is used as a criterion for resize.
> >>>> Why is it unexpected that the table stops growing beyond 4096 after 900
> >>>> insertions?
> >>> Ah, thanks. Turns out that the table stopped growing because of the check
> >>> if (count >= (1UL << (COUNT_COMMIT_ORDER + split_count_order)))
>
> >> If both (CDS_LFHT_ACCOUNTING | CDS_LFHT_AUTO_RESIZE) flags are set, the
> >> table will still grow after it has gone beyond this condition.
>
> >> When both flags are set, check_resize() is only for the bucket-local chain
> >> length algorithm that deals with very small tables. Then, when the table
> >> size goes above that threshold, split-counters are used to adjust the
> >> required table size. We don't use the split-counters approximation of the
> >> number of elements in the table for small tables because they are not
> >> accurate enough for small tables. It's a tradeoff between accuracy and
> >> accounting overhead here.
>
> >> Notice that ht_count_add() and ht_count_del() both invoke
> >> cds_lfht_resize_lazy_count(). It's that function which is responsible for
> >> the split-counter based resize algorithm.
>
> >>> I am a bit disappointed about that. Why can't we have tables that resize
> >>> 1. automatically
> >>> 2. from 10s to 100,000s of buckets
> >>> 3. without becoming too large too soon?
>
> >> The only thing here is that the bucket-length algorithm used for small
> >> tables can grow the table more quickly than you would expect if you have
> >> hash collisions. But it should not matter that much because it only
> >> applies to small tables. Do you experience "too aggressive" hash table
> >> growth even beyond the threshold selecting the split-counters resize
> >> algorithm?
>
> >> That threshold is calculated as:
>
> >> (1UL << (COUNT_COMMIT_ORDER + split_count_order))
>
> >> where COUNT_COMMIT_ORDER = 10
> >> and split_count_order = log2(nr_cpus ceiled to next power of 2)
>
> >> Assuming a 64-core machine:
>
> >> log2(64) = 6
> >> 2 ^ (10 + 6) = 65536
>
> >> Considering that you created your hash table with an initial size of 16
> >> buckets, specified a minimum number of buckets of 16, and a maximum number
> >> of buckets of 1 << 16 (65536), you'll always be using the "hash-chain
> >> length" algorithm for resize on a 64-core machine. The split-counter
> >> algorithm would never kick in.
>
> >> You should therefore try tweaking the @max_nr_buckets parameter to a
> >> larger value when calling cds_lfht_new().
>
> >> How many cores do you have ?
>
> >> Let me know how it goes,
>
> >> Thanks,
>
> >> Mathieu
>
> >>> On 26 March 2018 at 18:32, Mathieu Desnoyers <mathieu.desnoyers at efficios.com> wrote:
>
> >>>> ----- On Mar 26, 2018, at 11:01 AM, Denis Yevgenyevich <denis.yevgenyevich at gmail.com> wrote:
>
> >>>>> Hello,
>
> >>>>> I think rculfhash autoresizing (as enabled by CDS_LFHT_AUTO_RESIZE)
> >>>>> does not behave well. Each time _cds_lfht_add finds a hash bucket with
> >>>>> more than 2 items, it basically tries to quadruple the hash table
> >>>>> size. But even with a perfect hash function hash collisions are not
> >>>>> that rare, given a large enough table. That is, every now and then the
> >>>>> algorithm will come across a 3-long chain and will grow the table
> >>>>> unnecessarily.
>
> >>>>> I created a table with cds_lfht_new(16, 16, 1<<16,
> >>>>> CDS_LFHT_AUTO_RESIZE, NULL) and began adding an item every 10ms. Each
> >>>>> item had a unique key and the hash function was very good (full 64-bit
> >>>>> values of SipHash). Judging by messages I added to rculfhash.c, this
> >>>>> was happening:
> >>>>> after about 100 insertions, the table was resized to 512
> >>>>> after about 500 insertions, the table was resized to 4096
> >>>>> after about 1300 insertions, the table was resized to 16384 and
> >>>>> immediately to 32768
>
> >>>>> With CDS_LFHT_AUTO_RESIZE|CDS_LFHT_ACCOUNTING the behaviour was
> >>>>> different but still unacceptable: the table got resized to 4096 after
> >>>>> 900 insertions and then stopped growing!
>
> >>>> The "accounting" option is meant to be enabled, otherwise you're using
> >>>> the hash table only with the bucket-length resize threshold, which has
> >>>> the imperfections you noticed.
>
> >>>> With accounting enabled, the bucket length is used as an approximation
> >>>> for when a resize is needed only for small tables. Beyond a certain
> >>>> size, the number of elements in the hash table is used as a criterion
> >>>> for resize.
>
> >>>> Why is it unexpected that the table stops growing beyond 4096 after 900
> >>>> insertions? How many more insertions did you do beyond 900?
>
> >>>> Thanks,
>
> >>>> Mathieu
>
> >>>>> Could someone please implement saner resize conditions, or better yet,
> >>>>> add a customisable hook for decisions as to when and by how much a
> >>>>> table should grow?
>
> >>>>> Cheers, Denis.
>
> >>>>> _______________________________________________
> >>>>> lttng-dev mailing list
> >>>>> lttng-dev at lists.lttng.org
> >>>>> https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev
>
> >>>> --
> >>>> Mathieu Desnoyers
> >>>> EfficiOS Inc.
> >>>> http://www.efficios.com
>
> >> --
> >> Mathieu Desnoyers
> >> EfficiOS Inc.
> >> http://www.efficios.com
>
> --
> Mathieu Desnoyers
> EfficiOS Inc.
> http://www.efficios.com
>
-------------- next part --------------
;;; -----------------
 0:     2
 1:     3
 2:     3
 3:     4
 4:     2
 5:     2
;;; 39 items, 16 buckets, avg.len 2.44
;;; RCU GROWING 16 TO 64

 0:    34
 1:    19
 2:    11
;;; 41 items, 64 buckets, avg.len 0.64
;;; RCU RESIZED TO 64

;;; -----------------
 0:    15
 1:    16
 2:    14
 3:    13
 4:     4
 5:     2
;;; 109 items, 64 buckets, avg.len 1.70
;;; RCU GROWING 64 TO 256

 0:   415
 1:    85
 2:    10
 3:     2
;;; 111 items, 512 buckets, avg.len 0.22
;;; RCU RESIZED TO 512

;;; -----------------
 0:   295
 1:   155
 2:    49
 3:    10
 4:     1
 5:     0
 6:     2
;;; 299 items, 512 buckets, avg.len 0.58
;;; RCU GROWING 512 TO 4096

 0:  3807
 1:   279
 2:    10
;;; 299 items, 4096 buckets, avg.len 0.07
;;; RCU RESIZED TO 4096

;;; -----------------
 0:     4
 1:     9
 2:    28
 3:    85
 4:   192
 5:   297
 6:   415
 7:   544
 8:   528
 9:   596
10:   450
11:   322
12:   270
13:   161
14:    95
15:    49
16:    23
17:    11
18:    10
19:     5
20:     1
21:     0
22:     1
;;; 34771 items, 4096 buckets, avg.len 8.49
;;; RCU GROWING 4096 TO 32768

 0: 11338
 1: 12054
 2:  6363
 3:  2235
 4:   627
 5:   125
 6:    22
 7:     4
;;; 34778 items, 32768 buckets, avg.len 1.06
;;; RCU RESIZED TO 32768

;;; -----------------
 0:    11
 1:    81
 2:   357
 3:   854
 4:  1842
 5:  2968
 6:  3845
 7:  4568
 8:  4549
 9:  4211
10:  3350
11:  2337
12:  1648
13:   973
14:   584
15:   307
16:   154
17:    76
18:    30
19:    16
20:     4
21:     2
22:     0
23:     1
;;; 264060 items, 32768 buckets, avg.len 8.06
;;; RCU GROWING 32768 TO 262144

 0: 95656
 1: 96470
 2: 48596
 3: 16395
 4:  4057
 5:   808
 6:   143
 7:    16
 8:     3
;;; 264109 items, 262144 buckets, avg.len 1.01
;;; RCU RESIZED TO 262144
-------------- next part --------------
;;; -----------------
 0:     2
 1:     1
 2:     2
 3:     9
 4:     1
 5:     0
 6:     1
;;; 42 items, 16 buckets, avg.len 2.62
;;; RCU GROWING 16 TO 64

 0:    91
 1:    32
 2:     5
;;; 42 items, 128 buckets, avg.len 0.33
;;; RCU RESIZED TO 128

;;; -----------------
 0:    52
 1:    42
 2:    26
 3:     7
 4:     1
;;; 119 items, 128 buckets, avg.len 0.93
;;; RCU GROWING 128 TO 512

 0:   911
 1:   107
 2:     6
;;; 119 items, 1024 buckets, avg.len 0.12
;;; RCU RESIZED TO 1024

;;; -----------------
 0:   531
 1:   362
 2:   103
 3:    26
 4:     2
;;; 654 items, 1024 buckets, avg.len 0.64
;;; RCU GROWING 1024 TO 4096

 0:  3492
 1:   555
 2:    48
 3:     1
;;; 654 items, 4096 buckets, avg.len 0.16
;;; RCU RESIZED TO 4096

;;; -----------------
 0:     2
 1:     5
 2:    35
 3:    75
 4:   199
 5:   319
 6:   458
 7:   550
 8:   567
 9:   512
10:   428
11:   384
12:   229
13:   137
14:    89
15:    48
16:    28
17:    16
18:     8
19:     4
20:     2
21:     0
22:     0
23:     1
;;; 34435 items, 4096 buckets, avg.len 8.41
;;; RCU GROWING 4096 TO 32768

 0: 11457
 1: 11984
 2:  6410
 3:  2219
 4:   542
 5:   121
 6:    30
 7:     5
;;; 34449 items, 32768 buckets, avg.len 1.05
;;; RCU RESIZED TO 32768

;;; -----------------
 0:    10
 1:    80
 2:   336
 3:   905
 4:  1785
 5:  3019
 6:  3964
 7:  4587
 8:  4487
 9:  4093
10:  3324
11:  2470
12:  1468
13:  1008
14:   611
15:   309
16:   176
17:    77
18:    35
19:    17
20:     5
21:     1
22:     1
;;; 263868 items, 32768 buckets, avg.len 8.05
;;; RCU GROWING 32768 TO 262144

 0: 95735
 1: 96539
 2: 48544
 3: 16180
 4:  4167
 5:   807
 6:   148
 7:    20
 8:     4
;;; 263930 items, 262144 buckets, avg.len 1.01
;;; RCU RESIZED TO 262144

