cds_lfht_new() clarification - init vs min

Thu Jul 10 10:40:29 EDT 2025

> On 10. 7. 2025, at 16:05, Mathieu Desnoyers <mathieu.desnoyers at efficios.com> wrote:
> 
> On 2025-07-09 14:57, Ondřej Surý via lttng-dev wrote:
>> Hi,
>> as the answer to this might be useful to more people, I am asking here:
>> The cds_lfht_new documentation specifies 3 sizes.
>>  * cds_lfht_new - allocate a hash table.
>>  * @init_size: number of buckets to allocate initially. Must be power of two.
>>  * @min_nr_alloc_buckets: the minimum number of allocated buckets.
>>  *                        (must be power of two)
>>  * @max_nr_buckets: the maximum number of hash table buckets allowed.
>>  *                  (must be power of two, 0 is accepted, means
>>  *                  "infinite")
>> The max number of buckets is obvious, but the interaction between init
>> and min is confusing.
>> If I am reading the code right, then init_size < min_nr_alloc_buckets have no
>> effect, the buckets table will be at least 1 << min_nr_alloc_buckets.
> 
> You are correct that the documentation of interaction between init_size
> and min_nr_alloc_buckets should be improved. Especially the semantic
> of "allocated" vs "initialized" buckets.
> 
> Re-reading the code, here is my understanding:
> 
> - The init_size parameter is responsible for initializing a certain
>  number of buckets at hash table creation (during the call to
>  cds_lfht_new()) and for setting an initial resize target to the
>  init_size value. It allocates the buckets memory _and_ chains
>  those buckets into the hash table.
> 
> - The min_nr_alloc_buckets only acts as a lower bound below which
>  the hash table resize algorithm won't free the _memory_ used for
>  buckets below that threshold and will keep the buckets allocated even
>  when the hash table size shrinks. It does not affect the fact buckets
>  are unlinked from the hash table when it shrinks below that threshold
>  though, it only keeps their memory allocated.
> 
>  This parameter only applies when an explicit cds_lfht_resize,
>  cds_lfht_resize_lazy_count are done, or when the hash table is
>  created with the CDS_LFHT_AUTO_RESIZE flag.
> 
> So my understanding does not match yours. In case I missed something,
> what is leading you conclude that the
> "buckets table will be at least 1 << min_nr_alloc_buckets"
> at hash table creation when init_size < min_nr_alloc_buckets ?

It's not so obvious from the mmap implementation, but the mm-order has:

static
void cds_lfht_alloc_bucket_table(struct cds_lfht *ht, unsigned long order)
{
        if (order == 0) {
                ht->tbl_order[0] = ht->alloc->calloc(ht->alloc->state,
                        ht->min_nr_alloc_buckets, sizeof(struct cds_lfht_node));
                urcu_posix_assert(ht->tbl_order[0]);
        } else if (order > ht->min_alloc_buckets_order) {
                ht->tbl_order[order] = ht->alloc->calloc(ht->alloc->state,
                        1UL << (order -1), sizeof(struct cds_lfht_node));
                urcu_posix_assert(ht->tbl_order[order]);
        }
        /* Nothing to do for 0 < order && order <= ht->min_alloc_buckets_order */
}

Similar code is in the mm-chunk.  The mmap allocates largest mmap and then it marks
the pages as needed, so it is bit hidden behind mmap semantics, but it also populates
min_nr_alloc_buckets:

                /* large table */
                ht->tbl_mmap = memory_map(ht->max_nr_buckets
                        * sizeof(*ht->tbl_mmap));
                memory_populate(ht->tbl_mmap,
                        ht->min_nr_alloc_buckets * sizeof(*ht->tbl_mmap));

and cds_lfht_alloc_bucket_table(..., 0) is called when initializing the from cds_lfht_new(),
so my assumption was that the buckets table is .tbl_<impl> and that's always at least
min_nr_alloc_buckets large.

The loop in cds_lfht_create_bucket() then does not change the underlying table size,
but it doesn't initialize buckets for orders > init size & < min_nr_alloc_size.

>> But what happens if init_size > min_nr_alloc_buckets? It feels like it will work
>> as expected if you pre-populate the table, but if you use it "normally", e.g. there
>> could be single add / del, the table will shrink immediately.
> 
> In that case, cds_lfht_new would allocate and initialize buckets based
> on init_size, and set an initial resize_target value based on init_size.
> 
> For a hash table created with the CDS_LFHT_AUTO_RESIZE flag, indeed if
> the init_size > min_nr_alloc_buckets, doing a add/del will indeed
> trigger a hash table auto-resize, which will unlink and free buckets
> between min_nr_alloc_buckets and init_size, and will just unlink buckets
> (not free memory) below min_nr_alloc_buckets.
> 
> So it does not make much sense to have a large init_size value for an
> auto-resize hash table, because it will very undo initialization
> and allocation soon after creation when the first items are added
> and removed.

However, it could make sense to have init_size smaller than min_nr_alloc_size
to make the bucket initialization deferred, but until we reach min_nr... the table
will never auto-shrink.

> For a hash table created without auto-resize, then the behavior depends
> on the explicit resizes done with cds_lfht_resize,
> cds_lfht_resize_lazy_grow, and cds_lfht_resize_lazy_count, so it makes
> more sense to have a larger init_size value (even larger than
> min_nr_alloc_buckets).
> 
> Does this explanation help ?

Absolutely, I was also thinking inside the CDS_LFHT_AUTO_RESIZE flag
because it is ubiquitous in our code and you made me realize that the resizes
could be done automatically.

Ondrej
--
Ondřej Surý (He/Him)
ondrej at sury.org