[lttng-dev] [PATCH 00/11] Add support for TSAN to liburcu
Dmitry Vyukov
dvyukov at google.com
Tue May 16 04:18:21 EDT 2023
On Mon, 15 May 2023 at 22:18, Olivier Dion <odion at efficios.com> wrote:
>
> This patch set adds support for TSAN in liburcu.
Hi Olivier,
Where can I see the actual changes? Preferably side-by-side diffs with
full context. Are they (or can they be) uploaded somewhere on
github/gerrit?
Are there any remaining open questions?
Is there CI coverage with -fsanitize=thread?
> * Here are the major changes
>
> - Usage of compiler atomic builtins is added to the uatomic API. This is
> required for TSAN to understand atomic memory accesses. If the compiler
> supports such builtins, they are used by default. User can opt-out and use
> the legacy implementation of the uatomic API by using the
> `--disable-atomic-builtins' configuration option.
>
> - The CMM memory model is introduced but yet formalized. It tries to be as
> close as possible to the C11 memory model while offering primitives such as
> cmm_smp_wmb(), cmm_smp_rmb() and cmm_mb() that can't be expressed in it.
> For example, cmm_mb() can be used for ordering memory accesses to MMIO
> devices, which is out of the scope of the C11 memory model.
>
> - The CMM annotation layer is a new public API that is highly experimental and
> not guaranteed to be stable at this stage. It serves the dual purpose of
> verifying local (intra-thread) relaxed atomic accesses ordering with a
> memory barrier and global (inter-thread) relaxed atomic accesses with a
> shared state. The second purpose is necessary for TSAN to understand memory
> accesses ordering since it does not fully support thread fence yet.
>
> * CMM annotation example
>
> Consider the following pseudo-code of writer side in synchronize_rcu(). An
> acquire group is defined on the stack of the writer. Annotations are made
> onto the group to ensure ordering of relaxed memory accesses in reader_state()
> before the memory barrier at the end of synchronize_rcu(). It also helps TSAN
> to understand that the relaxed accesses in reader_state() act like acquire
> accesses because of the memory barrier in synchronize_rcu().
>
> In other words, the purpose of this annotation is to convert a group of
> load-acquire memory operations into load-relaxed memory operations followed by
> a single memory barrier. This highly benefits weakly ordered architectures by
> having a constant number of memory barriers instead of being linearly
> proportional to the number of loads. This does not benefit TSO
> architectures.
>
> ```
> enum urcu_state reader_state(unsigned long *ctr, cmm_annotate_t *acquire_group)
> {
> unsigned long v;
>
> v = uatomic_load(ctr, CMM_RELAXED);
> cmm_annotate_group_mem_acquire(acquire_group, ctr);
> // ...
> }
>
> void wait_for_readers(..., cmm_annotate_group *acquire_group)
> {
> // ...
> switch (reader_state(..., acquire_group)) {
> // ...
> }
> // ...
> }
>
> void synchronize_rcu()
> {
> cmm_annotate_define(acquire_group);
> // ...
> wait_for_readers(..., &acquire_group);
> // ...
> cmm_annotate_group_mb_acquire(&acquire_group);
> cmm_smp_mb();
> }
> ```
>
> * Known limitation
>
> The only known limitation is with the urcu-signal flavor. Indeed, TSAN
> hijacks calls to sigaction(2) and installs its own signal handler that will
> deliver the signals to the urcu handler at synchronization points. This is
> known to deadlock the urcu-signal flavor in at least one case. See commit log
> of `urcu/annotate: Add CMM annotation' for a minimal reproducer outside of
> liburcu.
>
> Therefore, we have the intention of deprecating the urcu-signal flavor in the
> future, starting by disabling it by default.
>
> Olivier Dion (11):
> configure: Add --disable-atomic-builtins option
> urcu/uatomic: Use atomic builtins if configured
> urcu/compiler: Use atomic builtins if configured
> urcu/arch/generic: Use atomic builtins if configured
> urcu/system: Use atomic builtins if configured
> urcu/uatomic: Add CMM memory model
> urcu-wait: Fix wait state load/store
> tests: Use uatomic for accessing global states
> benchmark: Use uatomic for accessing global states
> tests/unit/test_build: Quiet unused return value
> urcu/annotate: Add CMM annotation
>
> README.md | 11 ++
> configure.ac | 26 ++++
> include/Makefile.am | 4 +
> include/urcu/annotate.h | 174 ++++++++++++++++++++++++
> include/urcu/arch/generic.h | 37 +++++
> include/urcu/compiler.h | 20 ++-
> include/urcu/static/pointer.h | 40 ++----
> include/urcu/static/urcu-bp.h | 12 +-
> include/urcu/static/urcu-common.h | 8 +-
> include/urcu/static/urcu-mb.h | 11 +-
> include/urcu/static/urcu-memb.h | 26 +++-
> include/urcu/static/urcu-qsbr.h | 29 ++--
> include/urcu/system.h | 21 +++
> include/urcu/uatomic.h | 25 +++-
> include/urcu/uatomic/builtins-generic.h | 124 +++++++++++++++++
> include/urcu/uatomic/builtins-x86.h | 124 +++++++++++++++++
> include/urcu/uatomic/builtins.h | 83 +++++++++++
> include/urcu/uatomic/generic.h | 128 +++++++++++++++++
> src/rculfhash.c | 92 ++++++++-----
> src/urcu-bp.c | 17 ++-
> src/urcu-pointer.c | 9 +-
> src/urcu-qsbr.c | 31 +++--
> src/urcu-wait.h | 15 +-
> src/urcu.c | 24 ++--
> tests/benchmark/Makefile.am | 91 +++++++------
> tests/benchmark/common-states.c | 1 +
> tests/benchmark/common-states.h | 51 +++++++
> tests/benchmark/test_mutex.c | 32 +----
> tests/benchmark/test_perthreadlock.c | 32 +----
> tests/benchmark/test_rwlock.c | 32 +----
> tests/benchmark/test_urcu.c | 33 +----
> tests/benchmark/test_urcu_assign.c | 33 +----
> tests/benchmark/test_urcu_bp.c | 33 +----
> tests/benchmark/test_urcu_defer.c | 33 +----
> tests/benchmark/test_urcu_gc.c | 34 +----
> tests/benchmark/test_urcu_hash.c | 6 +-
> tests/benchmark/test_urcu_hash.h | 15 --
> tests/benchmark/test_urcu_hash_rw.c | 10 +-
> tests/benchmark/test_urcu_hash_unique.c | 10 +-
> tests/benchmark/test_urcu_lfq.c | 20 +--
> tests/benchmark/test_urcu_lfs.c | 20 +--
> tests/benchmark/test_urcu_lfs_rcu.c | 20 +--
> tests/benchmark/test_urcu_qsbr.c | 33 +----
> tests/benchmark/test_urcu_qsbr_gc.c | 34 +----
> tests/benchmark/test_urcu_wfcq.c | 22 ++-
> tests/benchmark/test_urcu_wfq.c | 20 +--
> tests/benchmark/test_urcu_wfs.c | 22 ++-
> tests/common/api.h | 12 +-
> tests/regression/rcutorture.h | 102 ++++++++++----
> tests/unit/test_build.c | 8 +-
> 50 files changed, 1227 insertions(+), 623 deletions(-)
> create mode 100644 include/urcu/annotate.h
> create mode 100644 include/urcu/uatomic/builtins-generic.h
> create mode 100644 include/urcu/uatomic/builtins-x86.h
> create mode 100644 include/urcu/uatomic/builtins.h
> create mode 100644 tests/benchmark/common-states.c
> create mode 100644 tests/benchmark/common-states.h
>
> --
> 2.39.2
>
More information about the lttng-dev
mailing list