[lttng-dev] Xeon Phi memory barriers
Simon Marchi
simon.marchi at polymtl.ca
Fri Dec 6 12:11:21 EST 2013
ping.
On 19 November 2013 10:26, Simon Marchi <simon.marchi at polymtl.ca> wrote:
> Hello there,
>
> liburcu does not build on the Intel Xeon Phi, because the chip is
> recognized as x86_64, but lacks the {s,l,m}fence instructions found on
> usual x86_64 processors. The following is taken from the Xeon Phi dev
> guide:
>
> The Intel® Xeon Phi™ coprocessor memory model is the same as that of
> the Intel® Pentium processor. The reads and writes always appear in
> programmed order at the system bus (or the ring interconnect in the
> case of the Intel® Xeon Phi™ coprocessor); the exception being that
> read misses are permitted to go ahead of buffered writes on the system
> bus when all the buffered writes are cached hits and are, therefore,
> not directed to the same address being accessed by the read miss.
>
> As a consequence of its stricter memory ordering model, the Intel®
> Xeon Phi™ coprocessor does not support the SFENCE, LFENCE, and MFENCE
> instructions that provide a more efficient way of controlling memory
> ordering on other Intel processors.
>
> While reads and writes from an Intel® Xeon Phi™ coprocessor appear in
> program order on the system bus, the compiler can still reorder
> unrelated memory operations while maintaining program order on a
> single Intel® Xeon Phi™ coprocessor (hardware thread). If software
> running on an Intel® Xeon Phi™ coprocessor is dependent on the order
> of memory operations on another Intel® Xeon Phi™ coprocessor then a
> serializing instruction (e.g., CPUID, instruction with a LOCK prefix)
> between the memory operations is required to guarantee completion of
> all memory accesses issued prior to the serializing instruction before
> any subsequent memory operations are started.
>
> (end of quote)
>
> From what I understand, it is safe to leave out any run-time memory
> barriers, but we still need barriers that prevent the compiler from
> reordering (using __asm__ __volatile__ ("":::"memory")). In
> urcu/arch/x86.h, I see that when CONFIG_RCU_HAVE_FENCE is false,
> memory barriers result in both compile-time and run-time memory
> barriers: __asm__ __volatile__ ("lock; addl $0,0(%%esp)":::"memory").
> I guess this would work for the Phi, but the lock-prefixed instruction
> does not seem necessary.
>
> So, should we just set CONFIG_RCU_HAVE_FENCE to false when compiling
> for the Phi and go on with our lives, or should we add a specific
> config for this case?
>
> Simon
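
For illustration, the two barrier flavours discussed in the quoted message
can be sketched as below. This is only a sketch under stated assumptions:
every sketch_* name is hypothetical and is not a liburcu identifier, the
%rsp variant and the __sync_synchronize() fallback are additions that let
the snippet build on 64-bit x86 and on other targets, and the
publish/consume pair is a single-threaded demonstration of where each
barrier would sit, not a proof of correctness on the Phi.

```c
/* Hypothetical names for illustration only; liburcu's real definitions
 * live in urcu/arch/x86.h. Requires GCC-style inline assembly. */

/* Compile-time barrier: emits no instruction, but prevents the compiler
 * from moving memory accesses across this point. */
#define sketch_compiler_barrier() \
	__asm__ __volatile__ ("" : : : "memory")

/* Compile-time and run-time barrier that avoids the fence instructions:
 * a lock-prefixed no-op add on the top of the stack. */
#if defined(__x86_64__)
#define sketch_locked_barrier() \
	__asm__ __volatile__ ("lock; addl $0,0(%%rsp)" : : : "memory")
#elif defined(__i386__)
#define sketch_locked_barrier() \
	__asm__ __volatile__ ("lock; addl $0,0(%%esp)" : : : "memory")
#else
/* Assumption: fall back to the compiler builtin on non-x86 targets. */
#define sketch_locked_barrier() __sync_synchronize()
#endif

static int sketch_data;
static int sketch_ready;

/* Writer side: store the payload, then set the flag, in that order. */
static void sketch_publish(int value)
{
	sketch_data = value;
	sketch_locked_barrier();
	sketch_ready = 1;
}

/* Reader side: returns 1 and fills *out once the flag has been seen.
 * A real concurrent reader would also need volatile or atomic accesses. */
static int sketch_consume(int *out)
{
	if (!sketch_ready)
		return 0;
	sketch_compiler_barrier();
	*out = sketch_data;
	return 1;
}
```

Whether the compiler-only variant is sufficient on the Phi is exactly the
open question in the message above; the sketch only shows what each
variant would look like.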