[ltt-dev] liburcu cache line size

Mathieu Desnoyers compudj at krystal.dyndns.org
Tue Aug 17 15:45:01 EDT 2010


* David Goulet (david.goulet at polymtl.ca) wrote:
>
>
> On 10-08-17 02:51 PM, Mathieu Desnoyers wrote:
>> * David Goulet (david.goulet at polymtl.ca) wrote:
>>> Hi,
>>>
>>> I have some doubt about the value of #define CACHE_LINE_SIZE
>>> (urcu/arch_x86.h) that is set to 128.
>>>
>>> After some research and looking at my computer, x86 processors seem
>>> to use 64-byte cache lines most of the time. On my i7 920, here is
>>> what I have:
>>>
>>> # getconf LEVEL1_DCACHE_LINESIZE
>>> 64
>>>
>>> # cat /sys/devices/system/cpu/cpu0/cache/index0/coherency_line_size
>>> 64
>>>
>>> Since the Intel NetBurst microarchitecture, the Intel manual also says
>>> 64 bytes, and that apparently has not changed for Nehalem.
>>>
>>> So, Mathieu, why 128 bytes? UST is using that value. If it is wrong
>>> for x86, it could increase cache pressure, since two lines are
>>> required for a structure smaller than 64 bytes.
>>
>> See Linux kernel source:
>>
>> arch/x86/Kconfig.cpu
>>
>> #
>> # Define implied options from the CPU selection here
>> config X86_INTERNODE_CACHE_SHIFT
>>          int
>>          default "12" if X86_VSMP
>>          default "7" if NUMA
>>          default X86_L1_CACHE_SHIFT
>>
>> and
>>
>> config X86_L1_CACHE_SHIFT
>>          int
>>          default "7" if MPENTIUM4 || MPSC
>>          default "6" if MK7 || MK8 || MPENTIUMM || MCORE2 || MATOM || MVIAC7 || X86_GENERIC || GENERIC_CPU
>>          default "4" if X86_ELAN || M486 || M386 || MGEODEGX1
>>          default "5" if MWINCHIP3D || MWINCHIPC6 || MCRUSOE || MEFFICEON || MCYRIXIII || MK6 || MPENTIUMIII || MPENTIUMII || M686 || M586MMX || M586TSC || M586 || MVIAC3_2 || MGEODE_LX
>>
>> So Pentium 4 seems to have 128-byte cache lines.
>>
>
> Yep, I saw that, and this is why I'm asking: only NUMA, P4 and vSMP
> machines use lines larger than 64 bytes. The rest use 64 bytes or less
> (X86 generic, Core 2 (Nehalem), Atom).
>
> So you are saying that you prefer to use 128 bytes, knowing that most
> x86 CPUs use 64 bytes or less?

Yes. The performance degradation caused by cache-line bouncing is _way_
worse than extra cache pressure.
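
To make the trade-off concrete, here is a minimal sketch. It is not the
actual liburcu code: it assumes GCC-style attributes, and struct
padded_counter and both helper names are made up for illustration. The
Kconfig values quoted above are shifts, so "7" means 1 << 7 = 128 bytes.
If we aligned on 64 bytes, two adjacent objects could land on the same
line on a machine where the effective line is 128 bytes, and that line
would bounce between CPUs. Aligning on 128 keeps them apart in both cases.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same value as the one discussed for urcu/arch_x86.h in this thread. */
#define CACHE_LINE_SIZE 128

/* Kconfig stores cache line sizes as shifts: bytes = 1 << shift. */
static inline size_t cache_shift_to_bytes(unsigned int shift)
{
	return (size_t)1 << shift;
}

/*
 * Hypothetical per-thread counter (not a liburcu type). The aligned
 * attribute (GCC/Clang extension) rounds both the alignment and the
 * size of the structure up to CACHE_LINE_SIZE.
 */
struct padded_counter {
	unsigned long count;
} __attribute__((aligned(CACHE_LINE_SIZE)));

_Static_assert(sizeof(struct padded_counter) == CACHE_LINE_SIZE,
	       "each counter should occupy exactly one 128-byte slot");

/* Returns 1 if a and b cannot share a cache line of 'line' bytes. */
static inline int on_distinct_lines(const void *a, const void *b, size_t line)
{
	return (uintptr_t)a / line != (uintptr_t)b / line;
}
```

The cost is that each such counter takes 128 bytes instead of 64, which
is the extra cache pressure David mentions.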

Mathieu

>
>> Hopefully the ScaleMP vSMP machines are rare enough (they would require
>> a 4k alignment).
>>
>> NUMA is not that rare, and requires 128-byte cache lines too.
>>
>> Can you send a patch for userspace RCU that documents this briefly in
>> urcu/arch_x86.h ? (just a summary of the info I pasted here would be
>> fine)
>>
>> Thanks,
>>
>> Mathieu
>>
>>
>>
>>
>>
>>>
>>> Thanks!
>>> --
>>> David Goulet
>>> LTTng project, DORSAL Lab.
>>>
>>> PGP/GPG : 1024D/16BD8563
>>> BE3C 672B 9331 9796 291A  14C6 4AF7 C14B 16BD 8563
>>>
>>> _______________________________________________
>>> ltt-dev mailing list
>>> ltt-dev at lists.casi.polymtl.ca
>>> http://lists.casi.polymtl.ca/cgi-bin/mailman/listinfo/ltt-dev
>>>
>>
>
> -- 
> David Goulet
> LTTng project, DORSAL Lab.
>
> PGP/GPG : 1024D/16BD8563
> BE3C 672B 9331 9796 291A  14C6 4AF7 C14B 16BD 8563
>

-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com



