[lttng-dev] ThreadSanitizer: data race between urcu_mb_synchronize_rcu and urcu_adaptative_wake_up
Mathieu Desnoyers
mathieu.desnoyers at efficios.com
Wed Mar 22 10:57:58 EDT 2023
On 2023-03-22 07:01, Ondřej Surý via lttng-dev wrote:
>> On 22. 3. 2023, at 9:02, Ondřej Surý via lttng-dev <lttng-dev at lists.lttng.org> wrote:
>>
>> That's rather odd, because the "Write" happens on a stack-local variable,
>> while the "Previous write" happens after a futex call, which led me to conclude
>> that ThreadSanitizer doesn't intercept futex. We can, however, annotate the futexes:
>>
>> https://groups.google.com/g/thread-sanitizer/c/T0G_NyyZ3s4
>
> FTR, neither annotating the futex with __tsan_acquire(addr) and __tsan_release(addr)
> nor falling back to compat_futex_async() under ThreadSanitizer has helped.
>
> It seems to me that TSAN still doesn't understand the synchronization between
> RCU read-side critical sections and call_rcu()/synchronize_rcu(), as I am also
> getting the following reports:
>
> Write of size 8 at 0x7b54000009c0 by thread T102:
> #0 __tsan_memset <null> (badcache_test+0x49257d) (BuildId: a7c1595d61e3ee411276cf89a536a8daefa959a3)
> #1 mem_put /home/ondrej/Projects/bind9/lib/isc/mem.c:324:3 (libisc-9.19.12-dev.so+0x7d136) (BuildId: a33cd26e483b73684928b4782627f1278c001605)
> #2 isc__mem_put /home/ondrej/Projects/bind9/lib/isc/mem.c:684:2 (libisc-9.19.12-dev.so+0x7e0c3) (BuildId: a33cd26e483b73684928b4782627f1278c001605)
> #3 bcentry_destroy_rcu /home/ondrej/Projects/bind9/lib/dns/badcache.c:163:2 (libdns-9.19.12-dev.so+0x4e071) (BuildId: 8a550b795003cd1075ff29590734c806d84e76e6)
> #4 call_rcu_thread /home/ondrej/Projects/userspace-rcu/src/../src/urcu-call-rcu-impl.h:389:5 (liburcu-mb.so.8+0x9d6b) (BuildId: d4f5ea9d96625c7b7d2b2efb590b208f7b83cb6f)
>
> Previous atomic write of size 8 at 0x7b54000009c0 by main thread (mutexes: write M0):
> #0 ___cds_wfcq_append /home/ondrej/Projects/userspace-rcu/src/../include/urcu/static/wfcqueue.h:202:2 (liburcu-mb.so.8+0xa8ae) (BuildId: d4f5ea9d96625c7b7d2b2efb590b208f7b83cb6f)
> #1 _cds_wfcq_enqueue /home/ondrej/Projects/userspace-rcu/src/../include/urcu/static/wfcqueue.h:223:9 (liburcu-mb.so.8+0xac09) (BuildId: d4f5ea9d96625c7b7d2b2efb590b208f7b83cb6f)
> #2 _call_rcu /home/ondrej/Projects/userspace-rcu/src/../src/urcu-call-rcu-impl.h:719:2 (liburcu-mb.so.8+0x604f) (BuildId: d4f5ea9d96625c7b7d2b2efb590b208f7b83cb6f)
> #3 urcu_mb_barrier /home/ondrej/Projects/userspace-rcu/src/../src/urcu-call-rcu-impl.h:932:3 (liburcu-mb.so.8+0x4d1b) (BuildId: d4f5ea9d96625c7b7d2b2efb590b208f7b83cb6f)
> #4 badcache_flush /home/ondrej/Projects/bind9/lib/dns/badcache.c:329:2 (libdns-9.19.12-dev.so+0x4d8b3) (BuildId: 8a550b795003cd1075ff29590734c806d84e76e6)
> [...]
>
> E.g. ThreadSanitizer reports a race between the place where bcentry->rcu_head is added to the
> call_rcu() queue and the point where the call_rcu callbacks are invoked. Annotating the bcentry
> with acquire/release here fixes this particular report, but it does not feel right to me to add
> annotations at this level.
>
> The code is not very complicated there:
>
> static void
> bcentry_destroy_rcu(struct rcu_head *rcu_head) {
> 	dns_bcentry_t *bad = caa_container_of(rcu_head, dns_bcentry_t,
> 					      rcu_head);
> 	/* __tsan_release(bad); <-- this helps */
> 	dns_badcache_t *bc = bad->bc;
>
> 	isc_mem_put(bc->mctx, bad, sizeof(*bad));
>
> 	dns_badcache_detach(&bc);
> }
>
> static void
> bcentry_evict(struct cds_lfht *ht, dns_bcentry_t *bad) {
> 	/* There can be multiple deleters now */
> 	if (cds_lfht_del(ht, &bad->ht_node) == 0) {
> 		/* __tsan_acquire(bad); <-- this helps */
> 		call_rcu(&bad->rcu_head, bcentry_destroy_rcu);
> 	}
> }
>
> static void
> badcache_flush(dns_badcache_t *bc, struct cds_lfht *ht) {
> 	struct cds_lfht *oldht = rcu_xchg_pointer(&bc->ht, ht);
>
> 	synchronize_rcu();
>
> 	rcu_read_lock();
> 	dns_bcentry_t *bad = NULL;
> 	struct cds_lfht_iter iter;
> 	cds_lfht_for_each_entry (oldht, &iter, bad, ht_node) {
> 		bcentry_evict(oldht, bad);
> 	}
> 	rcu_read_unlock();
> 	rcu_barrier();
> 	RUNTIME_CHECK(cds_lfht_destroy(oldht, NULL) == 0);
> }
>
> Any ideas?
I suspect what happens here is that TSAN considers the following store
within ___cds_wfcq_append() racy:

	/*
	 * Implicit memory barrier after uatomic_xchg() orders store to
	 * q->tail before store to old_tail->next.
	 *
	 * At this point, dequeuers see a NULL tail->p->next, which
	 * indicates that the queue is being appended to. The following
	 * store will append "node" to the queue from a dequeuer
	 * perspective.
	 */
	CMM_STORE_SHARED(old_tail->next, new_head);
This pairs with:

	/*
	 * Waiting for enqueuer to complete enqueue and return the next node.
	 */
	static inline struct cds_wfcq_node *
	___cds_wfcq_node_sync_next(struct cds_wfcq_node *node, int blocking)
	{
		struct cds_wfcq_node *next;
		int attempt = 0;

		/*
		 * Adaptative busy-looping waiting for enqueuer to complete enqueue.
		 */
		while ((next = CMM_LOAD_SHARED(node->next)) == NULL) {
			if (___cds_wfcq_busy_wait(&attempt, blocking))
				return CDS_WFCQ_WOULDBLOCK;
		}
		return next;
	}
So the release semantic is provided by the implicit SEQ_CST barrier in
___cds_wfcq_append():

	old_tail = uatomic_xchg(&tail->p, new_tail);	(release)

and the acquire semantic is provided by the implicit SEQ_CST barrier in
___cds_wfcq_splice():

	/*
	 * Memory barrier implied before uatomic_xchg() orders store to
	 * src_q->head before store to src_q->tail. This is required by
	 * concurrent enqueue on src_q, which exchanges the tail before
	 * updating the previous tail's next pointer.
	 */
	tail = uatomic_xchg(&src_q_tail->p, &src_q_head->node);

Notice how the release/acquire semantic is provided by tail->p, which is
atomically modified _before_ we set the node->next pointer.
With this information, is there a specific annotation that would make sense?
Thanks,
Mathieu
>
> Ondrej
> --
> Ondřej Surý (He/Him)
> ondrej at sury.org
>
> _______________________________________________
> lttng-dev mailing list
> lttng-dev at lists.lttng.org
> https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev
--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com