[ltt-dev] liburcu cache line size

Mathieu Desnoyers compudj at krystal.dyndns.org
Tue Aug 17 15:54:59 EDT 2010


* Mathieu Desnoyers (compudj at krystal.dyndns.org) wrote:
> * David Goulet (david.goulet at polymtl.ca) wrote:
> >
> >
> > On 10-08-17 02:51 PM, Mathieu Desnoyers wrote:
> >> * David Goulet (david.goulet at polymtl.ca) wrote:
> >>> Hi,
> >>>
> >>> I have some doubt about the value of #define CACHE_LINE_SIZE
> >>> (urcu/arch_x86.h) that is set to 128.
> >>>
> >>> After some research and looking on my computer, the x86 architecture
> >>> seems to use 64-byte cache lines most of the time. On my i7 920, here's
> >>> what I have:
> >>>
> >>> # getconf LEVEL1_DCACHE_LINESIZE
> >>> 64
> >>>
> >>> # cat /sys/devices/system/cpu/cpu0/cache/index0/coherency_line_size
> >>> 64
> >>>
> >>> The Intel manual also says 64 bytes since the NetBurst
> >>> microarchitecture, and that apparently has not changed for Nehalem.
> >>>
> >>> So, Mathieu, why 128 bytes? UST is using that value; if it's wrong
> >>> for x86, it could increase cache pressure, since 2 lines are
> >>> required for structures smaller than 64 bytes.
> >>
> >> See Linux kernel source:
> >>
> >> arch/x86/Kconfig.cpu
> >>
> >> #
> >> # Define implied options from the CPU selection here
> >> config X86_INTERNODE_CACHE_SHIFT
> >>          int
> >>          default "12" if X86_VSMP
> >>          default "7" if NUMA
> >>          default X86_L1_CACHE_SHIFT
> >>
> >> and
> >>
> >> config X86_L1_CACHE_SHIFT
> >>          int
> >>          default "7" if MPENTIUM4 || MPSC
> >>          default "6" if MK7 || MK8 || MPENTIUMM || MCORE2 || MATOM || MVIAC7 || X86_GENERIC || GENERIC_CPU
> >>          default "4" if X86_ELAN || M486 || M386 || MGEODEGX1
> >>          default "5" if MWINCHIP3D || MWINCHIPC6 || MCRUSOE || MEFFICEON || MCYRIXIII || MK6 || MPENTIUMIII || MPENTIUMII || M686 || M586MMX || M586TSC || M586 || MVIAC3_2 || MGEODE_LX
> >>
> >> So Pentium 4 seems to have 128-byte cache lines (shift 7, 1 << 7 = 128).
> >>
> >
> > Yep, I saw that, and this is why I'm asking: only NUMA, P4 and vSMP
> > machines are larger than 64 bytes. The rest are 64 bytes (X86 generic,
> > Core 2 (Nehalem), Atom).
> >
> > So you are saying that you prefer to use 128 bytes, knowing that most
> > x86 CPUs have cache lines of 64 bytes or less?
> 
> Yes. The performance degradation caused by cache-line bouncing is _way_
> worse than extra cache pressure.

Oh, and by the way, given that these are arrays made of one variable per
CPU, the extra space allocated will not consume extra cache lines on any
of the CPUs. We're just wasting a bit of memory here, not adding to cache
pressure.
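
To illustrate the point above, here is a minimal sketch of a per-CPU
array padded to the cache line size. It is not the actual liburcu code:
struct percpu_count, EXAMPLE_NR_CPUS and the example_count_*() helpers
are hypothetical names made up for this example, and the aligned
attribute assumes GCC or Clang. Only CACHE_LINE_SIZE = 128 comes from
this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 128 is the x86 value discussed in this thread. */
#define CACHE_LINE_SIZE 128

/* Hypothetical CPU count, for illustration only. */
#define EXAMPLE_NR_CPUS 4

/*
 * One counter per CPU. The aligned attribute (GCC/Clang extension)
 * rounds sizeof() up to CACHE_LINE_SIZE, so two array elements never
 * share a cache line.
 */
struct percpu_count {
	unsigned long count;
} __attribute__((aligned(CACHE_LINE_SIZE)));

static struct percpu_count example_counts[EXAMPLE_NR_CPUS];

/*
 * Each CPU only writes its own slot, so writers do not bounce lines
 * between CPUs. The increment is a plain (non-atomic) one: the sketch
 * assumes a slot is only ever written by its owning CPU.
 */
static inline void example_count_inc(int cpu)
{
	assert(cpu >= 0 && cpu < EXAMPLE_NR_CPUS);
	example_counts[cpu].count++;
}

/* A reader sums all slots; the padding bytes are never accessed. */
static inline unsigned long example_count_sum(void)
{
	unsigned long sum = 0;

	for (int cpu = 0; cpu < EXAMPLE_NR_CPUS; cpu++)
		sum += example_counts[cpu].count;
	return sum;
}

/* Memory cost: one padded slot per CPU instead of sizeof(unsigned long). */
_Static_assert(sizeof(struct percpu_count) == CACHE_LINE_SIZE,
	       "each per-CPU slot should occupy exactly one padded slot");
```

In this sketch each CPU still touches only the bytes of its own counter,
so the padding costs EXAMPLE_NR_CPUS * CACHE_LINE_SIZE bytes of memory
but the padding itself is never pulled into any cache.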

Mathieu

> 
> Mathieu
> 
> >
> >> Hopefully the ScaleMP vSMP machines are rare enough (they would
> >> require a 4k alignment).
> >>
> >> NUMA is not that rare, and requires 128-byte cache lines too.
> >>
> >> Can you send a patch for userspace RCU that documents this briefly in
> >> urcu/arch_x86.h ? (just a summary of the info I pasted here would be
> >> fine)
> >>
> >> Thanks,
> >>
> >> Mathieu
> >>
> >>
> >>
> >>
> >>
> >>>
> >>> Thanks!
> >>> --
> >>> David Goulet
> >>> LTTng project, DORSAL Lab.
> >>>
> >>> PGP/GPG : 1024D/16BD8563
> >>> BE3C 672B 9331 9796 291A  14C6 4AF7 C14B 16BD 8563
> >>>
> >>> _______________________________________________
> >>> ltt-dev mailing list
> >>> ltt-dev at lists.casi.polymtl.ca
> >>> http://lists.casi.polymtl.ca/cgi-bin/mailman/listinfo/ltt-dev
> >>>
> >>
> >
> 

-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com



