[lttng-dev] liburcu: LTO breaking rcu_dereference on arm64 and possibly other architectures ?

Mathieu Desnoyers mathieu.desnoyers at efficios.com
Fri Apr 16 15:30:53 EDT 2021

----- On Apr 16, 2021, at 3:02 PM, paulmck paulmck at kernel.org wrote:
> If it can be done reasonably, I suggest also having some way for the
> person building userspace RCU to say "I know what I am doing, so do
> it with volatile rather than memory_order_consume."

Like so ?

#define CMM_ACCESS_ONCE(x) (*(__volatile__  __typeof__(x) *)&(x))
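
To illustrate what the macro above does, here is a minimal sketch. MY_ACCESS_ONCE, shared_flag, and the helper functions are hypothetical names for this example only, not part of liburcu. It assumes a compiler with the GNU __typeof__ and __volatile__ extensions (GCC or Clang). The cast makes each access go through a volatile-qualified lvalue, so the compiler has to emit a real load or store and cannot cache or merge it. It provides no hardware memory ordering by itself.

```c
#include <assert.h>

/* Hypothetical stand-in with the same shape as CMM_ACCESS_ONCE above. */
#define MY_ACCESS_ONCE(x) (*(__volatile__ __typeof__(x) *)&(x))

/* Hypothetical shared variable, for illustration only. */
static int shared_flag;

/* Store through a volatile-qualified lvalue: the compiler must emit it. */
static void set_flag(int v)
{
	MY_ACCESS_ONCE(shared_flag) = v;
}

/* Load through a volatile-qualified lvalue: re-read from memory each call. */
static int get_flag(void)
{
	return MY_ACCESS_ONCE(shared_flag);
}

/* Single-threaded round trip, only to show the macro used as an lvalue. */
static void flag_roundtrip_check(void)
{
	set_flag(7);
	assert(get_flag() == 7);
}
```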

/*
 * By defining URCU_DEREFERENCE_USE_VOLATILE, the user requires use of
 * volatile access to implement rcu_dereference rather than
 * memory_order_consume load from the C11/C++11 standards.
 * This may improve performance on weakly-ordered architectures where
 * the compiler implements memory_order_consume as a
 * memory_order_acquire, which is stricter than required by the
 * standard.
 * Note that using volatile accesses for rcu_dereference may cause
 * LTO to generate incorrectly ordered code starting from C11/C++11.
 */

#ifdef URCU_DEREFERENCE_USE_VOLATILE
# define rcu_dereference(x)     CMM_LOAD_SHARED(x)
#else
# if defined (__cplusplus)
#  if __cplusplus >= 201103L
#   include <atomic>
#   define rcu_dereference(x)   ((std::atomic<__typeof__(x)>)(x)).load(std::memory_order_consume)
#  else
#   define rcu_dereference(x)   CMM_LOAD_SHARED(x)
#  endif
# else
#  if (defined(__STDC_VERSION__) && __STDC_VERSION__ >= 201112L)
#   include <stdatomic.h>
#   define rcu_dereference(x)   atomic_load_explicit(&(x), memory_order_consume)
#  else
#   define rcu_dereference(x)   CMM_LOAD_SHARED(x)
#  endif
# endif
#endif
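
As a rough, self-contained illustration of the same build-time selection (not the liburcu implementation), the sketch below picks between a C11 memory_order_consume load and a volatile load. Every name in it (DEMO_USE_VOLATILE, DEMO_PTR, demo_dereference, demo_publish, struct node, and the helpers) is hypothetical. It declares the pointer _Atomic on the C11 path, which is an assumption of this sketch. The volatile path assumes GCC or Clang for __typeof__, __volatile__, and the __atomic_thread_fence builtin.

```c
#include <assert.h>
#include <stddef.h>

#if !defined(DEMO_USE_VOLATILE) && defined(__STDC_VERSION__) && \
	__STDC_VERSION__ >= 201112L && !defined(__STDC_NO_ATOMICS__)
/* C11 path: consume load on the reader side, release store to publish. */
# include <stdatomic.h>
# define DEMO_PTR(type)		_Atomic(type)
# define demo_dereference(p)	atomic_load_explicit(&(p), memory_order_consume)
# define demo_publish(p, v)	atomic_store_explicit(&(p), (v), memory_order_release)
#else
/* "I know what I am doing" path: plain volatile load on the reader side. */
# define DEMO_PTR(type)		type
# define demo_dereference(p)	(*(__volatile__ __typeof__(p) *)&(p))
# define demo_publish(p, v)					\
	do {							\
		/* Order initialization before the store. */	\
		__atomic_thread_fence(__ATOMIC_RELEASE);	\
		(*(__volatile__ __typeof__(p) *)&(p)) = (v);	\
	} while (0)
#endif

/* Hypothetical RCU-protected object. */
struct node {
	int value;
};

static DEMO_PTR(struct node *) head;	/* NULL until published */
static struct node first = { 42 };

/* Writer side: make the initialized node visible to readers. */
static void publish_first(void)
{
	demo_publish(head, &first);
}

/* Reader side: load the pointer once, then access through it. */
static int read_value(void)
{
	struct node *n = demo_dereference(head);

	return n ? n->value : -1;
}

/* Single-threaded check that the selected path round-trips the pointer. */
static void publish_and_check(void)
{
	publish_first();
	assert(read_value() == 42);
}
```

Building with -DDEMO_USE_VOLATILE selects the volatile path. Without it, a C11 compiler with atomics support takes the memory_order_consume path.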



Mathieu Desnoyers
EfficiOS Inc.

