cds_lfht_new() clarification - init vs min

Mathieu Desnoyers mathieu.desnoyers at efficios.com
Thu Jul 10 10:05:13 EDT 2025


On 2025-07-09 14:57, Ondřej Surý via lttng-dev wrote:
> Hi,
> 
> as the answer to this might be useful to more people, I am asking here:
> 
> The cds_lfht_new documentation specifies 3 sizes.
> 
>   * cds_lfht_new - allocate a hash table.
>   * @init_size: number of buckets to allocate initially. Must be power of two.
>   * @min_nr_alloc_buckets: the minimum number of allocated buckets.
>   *                        (must be power of two)
>   * @max_nr_buckets: the maximum number of hash table buckets allowed.
>   *                  (must be power of two, 0 is accepted, means
>   *                  "infinite")
> 
> The max number of buckets is obvious, but the interaction between init
> and min is confusing.
> 
> If I am reading the code right, then init_size < min_nr_alloc_buckets have no
> effect, the buckets table will be at least 1 << min_nr_alloc_buckets.

You are correct that the documentation of interaction between init_size
and min_nr_alloc_buckets should be improved. Especially the semantic
of "allocated" vs "initialized" buckets.

Re-reading the code, here is my understanding:

- The init_size parameter is responsible for initializing a certain
   number of buckets at hash table creation (during the call to
   cds_lfht_new()) and for setting an initial resize target to the
   init_size value. It allocates the buckets memory _and_ chains
   those buckets into the hash table.

- The min_nr_alloc_buckets only acts as a lower bound below which
   the hash table resize algorithm won't free the _memory_ used for
   buckets below that threshold and will keep the buckets allocated even
   when the hash table size shrinks. It does not affect the fact buckets
   are unlinked from the hash table when it shrinks below that threshold
   though, it only keeps their memory allocated.

   This parameter only applies when an explicit cds_lfht_resize,
   cds_lfht_resize_lazy_count are done, or when the hash table is
   created with the CDS_LFHT_AUTO_RESIZE flag.

So my understanding does not match yours. In case I missed something,
what is leading you conclude that the
"buckets table will be at least 1 << min_nr_alloc_buckets"
at hash table creation when init_size < min_nr_alloc_buckets ?

> 
> But what happens if init_size > min_nr_alloc_buckets? It feels like it will work
> as expected if you pre-populate the table, but if you use it "normally", e.g. there
> could be single add / del, the table will shrink immediately.

In that case, cds_lfht_new would allocate and initialize buckets based
on init_size, and set an initial resize_target value based on init_size.

For a hash table created with the CDS_LFHT_AUTO_RESIZE flag, indeed if
the init_size > min_nr_alloc_buckets, doing a add/del will indeed
trigger a hash table auto-resize, which will unlink and free buckets
between min_nr_alloc_buckets and init_size, and will just unlink buckets
(not free memory) below min_nr_alloc_buckets.

So it does not make much sense to have a large init_size value for an
auto-resize hash table, because it will very undo initialization
and allocation soon after creation when the first items are added
and removed.

For a hash table created without auto-resize, then the behavior depends
on the explicit resizes done with cds_lfht_resize,
cds_lfht_resize_lazy_grow, and cds_lfht_resize_lazy_count, so it makes
more sense to have a larger init_size value (even larger than
min_nr_alloc_buckets).

Does this explanation help ?

Thanks,

Mathieu

> 
> Do I understand the code correctly?
> 
> Ondrej
> --
> Ondřej Surý (He/Him)
> ondrej at sury.org
> 


-- 
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


More information about the lttng-dev mailing list