[ltt-dev] [URCU PATCH v2 2/2] caa: do not generate code for smp_rmb/smp_wmb on x86_64, smp_rmb on i686

Paolo Bonzini pbonzini at redhat.com
Tue Sep 6 02:48:32 EDT 2011


Usually we can assume that no accesses to write-combining memory occur,
and also that there are no non-temporal loads/stores (people would presumably
write those with assembly or intrinsics and insert the appropriate
lfence/sfence manually).  Under these assumptions, rmb and wmb are no-ops
on x86.  Define cmm_smp_rmb and cmm_smp_wmb to be the "common" operations,
while leaving cmm_rmb and cmm_wmb in place for more sophisticated uses.

Signed-off-by: Paolo Bonzini <pbonzini at redhat.com>
---
 urcu/arch/x86.h |   23 ++++++++++++++++++++---
 1 files changed, 20 insertions(+), 3 deletions(-)

diff --git a/urcu/arch/x86.h b/urcu/arch/x86.h
index 9e5411f..c399c25 100644
--- a/urcu/arch/x86.h
+++ b/urcu/arch/x86.h
@@ -33,18 +33,35 @@ extern "C" {
 
 #ifdef CONFIG_RCU_HAVE_FENCE
 #define cmm_mb()    asm volatile("mfence":::"memory")
+
+/*
+ * Define cmm_rmb/cmm_wmb to "strict" barriers that may be needed when
+ * using SSE or working with I/O areas.  cmm_smp_rmb/cmm_smp_wmb are
+ * only compiler barriers, which is enough for general use.
+ */
 #define cmm_rmb()   asm volatile("lfence":::"memory")
 #define cmm_wmb()   asm volatile("sfence"::: "memory")
-#else
+
 /*
- * Some non-Intel clones support out of order store. cmm_wmb() ceases to be a
- * nop for these.
+ * IDT WinChip supports weak store ordering, and the kernel may enable it
+ * under our feet; cmm_smp_wmb() ceases to be a nop for these processors.
+ * However, this doesn't happen on any processor that has *fence instructions.
  */
+#define cmm_smp_wmb() cmm_barrier()
+#else
 #define cmm_mb()    asm volatile("lock; addl $0,0(%%esp)":::"memory")
 #define cmm_rmb()   asm volatile("lock; addl $0,0(%%esp)":::"memory")
 #define cmm_wmb()   asm volatile("lock; addl $0,0(%%esp)"::: "memory")
 #endif
 
+/*
+ * An empty cmm_smp_rmb() may not be enough on old PentiumPro multiprocessor
+ * systems, due to an erratum, but the Linux kernel says that "Even distro
+ * kernels should think twice before enabling this".  Hence never generate
+ * code for it, even on machines that have no *fence instructions.
+ */
+#define cmm_smp_rmb() cmm_barrier()
+
 #define caa_cpu_relax()	asm volatile("rep; nop" : : : "memory");
 
 #define rdtscll(val)							  \
-- 
1.7.6