[ltt-dev] liburcu cache line size

David Goulet david.goulet at polymtl.ca
Tue Aug 17 15:58:41 EDT 2010



On 10-08-17 03:45 PM, Mathieu Desnoyers wrote:
> * David Goulet (david.goulet at polymtl.ca) wrote:
>>
>>
>> On 10-08-17 02:51 PM, Mathieu Desnoyers wrote:
>>> * David Goulet (david.goulet at polymtl.ca) wrote:
>>>> Hi,
>>>>
>>>> I have some doubt about the value of #define CACHE_LINE_SIZE
>>>> (urcu/arch_x86.h) that is set to 128.
>>>>
>>>> After some research and a look at my own computer, x86 cache lines
>>>> seem to be 64 bytes most of the time. On my i7 920, here's what
>>>> I have:
>>>>
>>>> # getconf LEVEL1_DCACHE_LINESIZE
>>>> 64
>>>>
>>>> # cat /sys/devices/system/cpu/cpu0/cache/index0/coherency_line_size
>>>> 64
>>>>
>>>> Since the Intel NetBurst microarch., the Intel manual says 64 bytes also
>>>> and it has not changed apparently for Nehalem.
>>>>
>>>> So, Mathieu, why 128 bytes? UST is using that. If it's the wrong value
>>>> for x86, it could increase cache pressure, since two 64-byte lines
>>>> are required for any structure smaller than 64 bytes.
>>>
>>> See Linux kernel source:
>>>
>>> arch/x86/Kconfig.cpu
>>>
>>> #
>>> # Define implied options from the CPU selection here
>>> config X86_INTERNODE_CACHE_SHIFT
>>>           int
>>>           default "12" if X86_VSMP
>>>           default "7" if NUMA
>>>           default X86_L1_CACHE_SHIFT
>>>
>>> and
>>>
>>> config X86_L1_CACHE_SHIFT
>>>           int
>>>           default "7" if MPENTIUM4 || MPSC
>>>           default "6" if MK7 || MK8 || MPENTIUMM || MCORE2 || MATOM || MVIAC7 || X86_GENERIC || GENERIC_CPU
>>>           default "4" if X86_ELAN || M486 || M386 || MGEODEGX1
>>>           default "5" if MWINCHIP3D || MWINCHIPC6 || MCRUSOE || MEFFICEON || MCYRIXIII || MK6 || MPENTIUMIII || MPENTIUMII || M686 || M586MMX || M586TSC || M586 || MVIAC3_2 || MGEODE_LX
>>>
>>> So Pentium 4 seems to have 128-byte cache lines.
>>>
>>
>> Yep, I saw that, and this is why I'm asking: only NUMA, P4 and vSMP
>> machines are bigger than 64 bytes. The rest are 64 bytes or less
>> (X86 generic, Core 2 (Nehalem), Atom).
>>
>> So you are saying that you prefer to use 128 bytes, knowing that most
>> of x86 is lower than or equal to 64 bytes?
>
> Yes. The performance degradation caused by cache-line bouncing is _way_
> worse than extra cache pressure.
>

There is something I don't understand here. Correct me if (most likely) 
I am wrong.

How is cache line bouncing affected by the cache line size? If I
understand correctly, cache line bouncing is the problem where CPUs share
data and the line holding it has to move between their caches (say, from
CPU0 to CPU7). And, I surely agree, this is costly!

However, if the configured cache line size is bigger than the real one,
you just lose space... On an arch with 64-byte cache lines, each aligned
structure takes up two lines... How would lowering the value to 64 bytes
cause cache line bouncing?
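
To make the two cases concrete, here is a minimal sketch. It is not
liburcu code: the struct names and helper functions are hypothetical, and
the `aligned` attribute assumes GCC or Clang. It only does the layout
arithmetic for an array of small structures, assuming the array itself
starts on a hardware cache line boundary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical per-thread records, padded to 64 or 128 bytes by alignment. */
struct rec_64 {
	unsigned long count;
} __attribute__((aligned(64)));

struct rec_128 {
	unsigned long count;
} __attribute__((aligned(128)));

/* Hardware cache lines covered by one element starting on a line boundary. */
static size_t lines_per_elem(size_t elem_size, size_t hw_line)
{
	assert(hw_line != 0);
	return (elem_size + hw_line - 1) / hw_line;
}

/*
 * Nonzero if elements 0 and 1 of an array fall in the same hardware line.
 * Element 0 starts at offset 0 (line 0); element 1 starts at elem_size.
 */
static int neighbours_share_line(size_t elem_size, size_t hw_line)
{
	assert(hw_line != 0);
	return (elem_size / hw_line) == 0;
}

/* Print the layout of one element size for 64- and 128-byte hardware lines. */
static void print_layout(const char *name, size_t elem_size)
{
	static const size_t hw_lines[] = { 64, 128 };
	size_t i;

	for (i = 0; i < sizeof(hw_lines) / sizeof(hw_lines[0]); i++)
		printf("%s, %zu-byte lines: %zu line(s) per element, "
		       "neighbours share a line: %s\n",
		       name, hw_lines[i],
		       lines_per_elem(elem_size, hw_lines[i]),
		       neighbours_share_line(elem_size, hw_lines[i]) ? "yes" : "no");
}
```

Calling print_layout("rec_64", sizeof(struct rec_64)) and
print_layout("rec_128", sizeof(struct rec_128)) from a main() shows both
sides. With 128-byte alignment on 64-byte lines, each structure covers two
lines, which is the wasted space I mean. With 64-byte alignment on
128-byte lines (the 128-byte cases listed in Kconfig.cpu), two neighbouring
structures land in the same line, so writes from two CPUs to different
structures would contend for that one line.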

Thanks for your help on that!
David

> Mathieu
>
>>
>>> Hopefully the ScaleMP vSMP machines are rare enough (they would require
>>> a 4k alignment).
>>>
>>> NUMA is not that rare, and requires 128-byte cache lines too.
>>>
>>> Can you send a patch for userspace RCU that documents this briefly in
>>> urcu/arch_x86.h ? (just a summary of the info I pasted here would be
>>> fine)
>>>
>>> Thanks,
>>>
>>> Mathieu
>>>
>>>
>>>
>>>
>>>
>>>>
>>>> Thanks!
>>>> --
>>>> David Goulet
>>>> LTTng project, DORSAL Lab.
>>>>
>>>> PGP/GPG : 1024D/16BD8563
>>>> BE3C 672B 9331 9796 291A  14C6 4AF7 C14B 16BD 8563
>>>>
>>>>
>>>
>>
>

-- 
David Goulet
LTTng project, DORSAL Lab.

PGP/GPG : 1024D/16BD8563
BE3C 672B 9331 9796 291A  14C6 4AF7 C14B 16BD 8563



