[ltt-dev] [PATCH 06/10 round10] add LFHT_MEMORY_CHUNK low level memory management

Wed Nov 23 03:11:46 EST 2011

On 11/22/2011 06:20 PM, Mathieu Desnoyers wrote:
> * Lai Jiangshan (laijs at cn.fujitsu.com) wrote:
>> The old memory mangement is named as LFHT_MEMORY_ORDER.
>>
>> Signed-off-by: Lai Jiangshan <laijs at cn.fujitsu.com>
>> ---
>>  rculfhash.c |  116 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++---
>>  1 files changed, 111 insertions(+), 5 deletions(-)
>>
>> diff --git a/rculfhash.c b/rculfhash.c
>> index b7be893..ae1177f 100644
>> --- a/rculfhash.c
>> +++ b/rculfhash.c
>> @@ -172,6 +172,10 @@
>>  #define dbg_printf(fmt, args...)
>>  #endif
>>  
>> +#if !defined(LFHT_MEMORY_ORDER) && !defined(LFHT_MEMORY_CHUNK)
>> +#define LFHT_MEMORY_ORDER
>> +#endif
>> +
>>  /*
>>   * Split-counters lazily update the global counter each 1024
>>   * addition/removal. It automatically keeps track of resize required.
>> @@ -195,6 +199,8 @@
>>  #define MAX_TABLE_ORDER			64
>>  #endif
>>  
>> +#define MAX_CHUNK_TABLE			(1UL << 10)
>> +
>>  /*
>>   * Minimum number of bucket nodes to touch per thread to parallelize grow/shrink.
>>   */
>> @@ -247,6 +253,7 @@ struct rcu_table {
>>  	unsigned long resize_target;
>>  	int resize_initiated;
>>  
>> +#ifdef LFHT_MEMORY_ORDER
>>  	/*
>>  	 * Contains the per order-index-level bucket node table. The size
>>  	 * of each bucket node table is half the number of hashes contained
>> @@ -255,6 +262,7 @@ struct rcu_table {
>>  	 * levels to improve cache locality for small index orders.
>>  	 */
>>  	struct cds_lfht_node *tbl[MAX_TABLE_ORDER];
>> +#endif
>>  };
>>  
>>  /*
>> @@ -289,6 +297,16 @@ struct cds_lfht {
>>  	pthread_attr_t *resize_attr;	/* Resize threads attributes */
>>  	long count;			/* global approximate item count */
>>  	struct ht_items_count *split_count;	/* split item count */
>> +
>> +#ifdef LFHT_MEMORY_CHUNK
>> +	/*
>> +	 * Contains the bucket node chunks. The size of each bucket node
>> +	 * chunk is ->min_alloc_size(avoid to alloc chunks with different
>> +	 * size). chunks improve cache locality for small index orders
>> +	 * and improve lookup_bucket() for the archs without hardware fls().
>> +	 */
>> +	struct cds_lfht_node *tbl[0];
>> +#endif
>>  };
>>  
>>  /*
>> @@ -753,6 +771,93 @@ unsigned long _uatomic_xchg_monotonic_increase(unsigned long *ptr,
>>  	return old2;
>>  }
>>  
>> +#ifdef LFHT_MEMORY_CHUNK
>> +static
>> +struct cds_lfht *_cds_lfht_alloc(unsigned long min_alloc_size,
>> +		unsigned long max_size)
>> +{
>> +	struct cds_lfht *ht;
>> +
>> +	min_alloc_size = max(min_alloc_size, max_size / MAX_CHUNK_TABLE);
>> +	ht = calloc(1, sizeof(struct cds_lfht) +
>> +			sizeof(ht->tbl[0]) * (max_size / min_alloc_size));
> 
> Hrm, this first level of indirection quickly becomes much larger than
> the order-based table. Have you measured gains with this approach ?
> 
> Can you detail the use-case for this approach ?

I guess some times max_bucket_size/normal_bucket_size is not very high,
(I also guess it is quit real/often in practice)
so the users can use bigger min_alloc_size(bigger chunk size) and smaller
chunk pointer table.

(see "History" of memory management)

> 
> Have you measured the overhead of software fls() compared to the
> overhead of increases cache footprint of the larger table for the
> chunk-based approach ?

Sorry, I also guessed. So I just said this management is simpler and better
when there is no hardware fls(). not "better performance", let the users
make the decision.

It seems it is very hard to measure.

"History" of memory management:

I was thinking what is not good when this rculfhash is merged into linux kernel.
the bigger order memory allocation will be an obstacle. we can not allocate
big memory in kernel, we have to allocate small pieces(pages), and we have two
choices:
1) (v)map the pages as continuous order memory.
2) use discontinuous pages as chunks.

2) ===========> I got this LFHT_MEMORY_CHUNK.

1) ===========> since we need to do mapping, why not map them all continuous,
		and map them when demanded, I got LFHT_MEMORY_RESERVED.
		(it introduces a new obstacle, we require big memory space,
		 but 64BITS can provide it easier.)

The paper of Ori Shalev and Nir Shavit are actually a special LFHT_MEMORY_RESERVED.