[lttng-dev] [URCU PATCH 3/3] call_rcu: use wfcqueue, eliminate false-sharing

Mathieu Desnoyers mathieu.desnoyers at efficios.com
Mon Oct 8 10:49:16 EDT 2012


* Lai Jiangshan (laijs at cn.fujitsu.com) wrote:
> On 10/02/2012 10:16 PM, Mathieu Desnoyers wrote:
> > Eliminate false-sharing between call_rcu (enqueuer) and worker threads
> > on the queue head and tail.
> > 
> > Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers at efficios.com>
> > ---
> > diff --git a/tests/Makefile.am b/tests/Makefile.am
> > index 81718bb..c92bbe6 100644
> > --- a/tests/Makefile.am
> > +++ b/tests/Makefile.am
> > @@ -30,14 +30,14 @@ if COMPAT_FUTEX
> >  COMPAT+=$(top_srcdir)/compat_futex.c
> >  endif
> >  
> > -URCU=$(top_srcdir)/urcu.c $(top_srcdir)/urcu-pointer.c $(top_srcdir)/wfqueue.c $(COMPAT)
> > -URCU_QSBR=$(top_srcdir)/urcu-qsbr.c $(top_srcdir)/urcu-pointer.c $(top_srcdir)/wfqueue.c $(COMPAT)
> > +URCU=$(top_srcdir)/urcu.c $(top_srcdir)/urcu-pointer.c $(top_srcdir)/wfcqueue.c $(COMPAT)
> > +URCU_QSBR=$(top_srcdir)/urcu-qsbr.c $(top_srcdir)/urcu-pointer.c $(top_srcdir)/wfcqueue.c $(COMPAT)
> >  # URCU_MB uses urcu.c but -DRCU_MB must be defined
> > -URCU_MB=$(top_srcdir)/urcu.c $(top_srcdir)/urcu-pointer.c $(top_srcdir)/wfqueue.c $(COMPAT)
> > +URCU_MB=$(top_srcdir)/urcu.c $(top_srcdir)/urcu-pointer.c $(top_srcdir)/wfcqueue.c $(COMPAT)
> >  # URCU_SIGNAL uses urcu.c but -DRCU_SIGNAL must be defined
> > -URCU_SIGNAL=$(top_srcdir)/urcu.c $(top_srcdir)/urcu-pointer.c $(top_srcdir)/wfqueue.c $(COMPAT)
> > -URCU_BP=$(top_srcdir)/urcu-bp.c $(top_srcdir)/urcu-pointer.c $(top_srcdir)/wfqueue.c $(COMPAT)
> > -URCU_DEFER=$(top_srcdir)/urcu.c $(top_srcdir)/urcu-pointer.c $(top_srcdir)/wfqueue.c $(COMPAT)
> > +URCU_SIGNAL=$(top_srcdir)/urcu.c $(top_srcdir)/urcu-pointer.c $(top_srcdir)/wfcqueue.c $(COMPAT)
> > +URCU_BP=$(top_srcdir)/urcu-bp.c $(top_srcdir)/urcu-pointer.c $(top_srcdir)/wfcqueue.c $(COMPAT)
> > +URCU_DEFER=$(top_srcdir)/urcu.c $(top_srcdir)/urcu-pointer.c $(top_srcdir)/wfcqueue.c $(COMPAT)
> >  
> >  URCU_COMMON_LIB=$(top_builddir)/liburcu-common.la
> >  URCU_LIB=$(top_builddir)/liburcu.la
> > diff --git a/urcu-call-rcu-impl.h b/urcu-call-rcu-impl.h
> > index 13b24ff..cf65992 100644
> > --- a/urcu-call-rcu-impl.h
> > +++ b/urcu-call-rcu-impl.h
> > @@ -21,6 +21,7 @@
> >   */
> >  
> >  #define _GNU_SOURCE
> > +#define _LGPL_SOURCE
> >  #include <stdio.h>
> >  #include <pthread.h>
> >  #include <signal.h>
> > @@ -35,7 +36,7 @@
> >  #include <sched.h>
> >  
> >  #include "config.h"
> > -#include "urcu/wfqueue.h"
> > +#include "urcu/wfcqueue.h"
> >  #include "urcu-call-rcu.h"
> >  #include "urcu-pointer.h"
> >  #include "urcu/list.h"
> > @@ -46,7 +47,14 @@
> >  /* Data structure that identifies a call_rcu thread. */
> >  
> >  struct call_rcu_data {
> > -	struct cds_wfq_queue cbs;
> > +	/*
> > +	 * Align the tail on cache line size to eliminate false-sharing
> > +	 * with head.
> > +	 */
> > +	struct cds_wfcq_tail __attribute__((aligned(CAA_CACHE_LINE_SIZE))) cbs_tail;
> > +	/* Alignment on cache line size will add padding here */
> > +
> > +	struct cds_wfcq_head cbs_head;
> 
> 
> wrong here. In this code, cbs_tail and cbs_head are in the same cache line.
> 
> ---
> 
> struct call_rcu_data {
> 	struct cds_wfcq_tail cbs_tail;
> 	struct cds_wfcq_head __attribute__((aligned(CAA_CACHE_LINE_SIZE))) cbs_head;
> 	/* other fields, can move some fields up to use the room between tail and head */
> };
> 
> # cat test.c
> 
> struct a {
> 	int __attribute__((aligned(64))) i;
> 	int j;
> };
> struct b {
> 	int i;
> 	int __attribute__((aligned(64))) j;
> };
> 
> void main(void)
> {
> 	printf("%d,%d\n", sizeof(struct a), sizeof(struct b));
> }
> 
> # ./a.out
> 64,128
> 

Good point! While we are there, I notice that the "qlen" count, kept for
debugging, is causing false-sharing too. I wonder if we should split
this counter in two counters: nr_enqueue and nr_dequeue, which would sit
in two different cache lines ? It's mainly Paul who cares about this
counter. Thoughts ?

Here is the fix to the problem you noticed above:

commit b9f893b69fbc31baea418794938f4eb74cc4923a
Author: Mathieu Desnoyers <mathieu.desnoyers at efficios.com>
Date:   Mon Oct 8 10:44:38 2012 -0400

    Fix urcu-call-rcu-impl.h: false-sharing
    
    > >  struct call_rcu_data {
    > > -   struct cds_wfq_queue cbs;
    > > +   /*
    > > +    * Align the tail on cache line size to eliminate false-sharing
    > > +    * with head.
    > > +    */
    > > +   struct cds_wfcq_tail __attribute__((aligned(CAA_CACHE_LINE_SIZE))) cbs_tail;
    > > +   /* Alignment on cache line size will add padding here */
    > > +
    > > +   struct cds_wfcq_head cbs_head;
    >
    >
    > wrong here. In this code, cbs_tail and cbs_head are in the same cache line.
    
    Reported-by: Lai Jiangshan <laijs at cn.fujitsu.com>
    Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers at efficios.com>

Thanks!

Mathieu

-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com



More information about the lttng-dev mailing list