[lttng-dev] rculfstack bug
Mathieu Desnoyers
mathieu.desnoyers at efficios.com
Wed Oct 10 13:53:04 EDT 2012
* Mathieu Desnoyers (mathieu.desnoyers at efficios.com) wrote:
> * Paul E. McKenney (paulmck at linux.vnet.ibm.com) wrote:
> > On Wed, Oct 10, 2012 at 07:42:15AM -0400, Mathieu Desnoyers wrote:
> > > * Lai Jiangshan (laijs at cn.fujitsu.com) wrote:
> > > > test code:
> > > > ./tests/test_urcu_lfs 100 10 10
> > > >
> > > > bug reproduction rate > 60%
> > > >
> > > > {{{
> > > > I didn't see any bug with "./tests/test_urcu_lfs 10 10 10" or "./tests/test_urcu_lfs 100 100 10"
> > > > But I only tested it about 5 times
> > > > }}}
> > > >
> > > > 4cores*1threads: Intel(R) Core(TM) i5 CPU 760
> > > > RCU_MB (no time to test the other RCU flavors)
> > > > test commit: 768fba83676f49eb73fd1d8ad452016a84c5ec2a
> > > >
> > > > I didn't see any bug when "./tests/test_urcu_mb 10 100 10"
> > > >
> > > > Sorry, I tried, but I have not been able to find the root cause so far.
> > >
> > > I think I managed to narrow down the issue:
> > >
> > > 1) the master branch does not reproduce it, but commit
> > > 768fba83676f49eb73fd1d8ad452016a84c5ec2a reproduces it about 50% of the
> > > time.
> > >
> > > 2) the main change between 768fba83676f49eb73fd1d8ad452016a84c5ec2a and
> > > current master (f94061a3df4c9eab9ac869a19e4228de54771fcb) is call_rcu
> > > moving to wfcqueue.
> > >
> > > 3) the bug always arises, for me, at the end of the 10 seconds.
> > > However, this might simply be because most of the memory
> > > gets freed at the end of program execution.
> > >
> > > 4) I've been able to get a backtrace, and it looks like we have some
> > > call_rcu callback-invocation threads still working while
> > > call_rcu_data_free() is invoked. In the backtrace, call_rcu_data_free()
> > > is waiting for the next thread to stop, and during that time,
> > > two callback-invocation threads are invoking callbacks (and one of
> > > them triggers the segfault).
> >
> > Do any of the callbacks reference __thread variables from some other
> > thread? If so, those threads must refrain from exiting until after
> > such callbacks complete.
>
> The callback is a simple caa_container_of + free, usual stuff, nothing
> fancy.
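
For readers unfamiliar with that callback shape, here is a minimal sketch of "container_of + free". The types and the macro are stand-ins written for this illustration (struct rcu_head_sketch, container_of_sketch, struct test_node and freed_count are all hypothetical names); the real code uses liburcu's struct rcu_head and caa_container_of(), and the test's actual node type is not shown in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for liburcu's struct rcu_head (assumption: only the
 * callback pointer matters for this illustration). */
struct rcu_head_sketch {
	void (*func)(struct rcu_head_sketch *head);
};

/* Stand-in for caa_container_of(): recover the enclosing object
 * from a pointer to one of its members. */
#define container_of_sketch(ptr, type, member) \
	((type *) ((char *) (ptr) - offsetof(type, member)))

/* Hypothetical element type with an embedded callback head. */
struct test_node {
	int value;
	struct rcu_head_sketch head;
};

/* Only here so the sketch can be observed; not part of the pattern. */
int freed_count;

/* The callback: map the head back to its node, then free the node. */
void free_node_cb(struct rcu_head_sketch *head)
{
	struct test_node *node;

	assert(head != NULL);
	node = container_of_sketch(head, struct test_node, head);
	free(node);
	freed_count++;
}
```

Such a callback touches only the node it was handed, so it does not depend on any other thread's __thread variables.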
Here is the fix: the bug was in call_rcu. Master does not need it,
because we fixed it as a side effect of moving to wfcqueue.

We were erroneously writing to the head field of the default
call_rcu_data rather than to its tail.
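
The splice that the patch below repairs has two steps: exchange the queue's tail with the end of the chain being appended, then store the first node of the chain into the slot the old tail pointed to. Here is a minimal sketch of that idea using C11 atomics. It is a simplified model, not the liburcu code: struct node, struct queue, node_slot and the functions are hypothetical names, and the head is treated as a plain slot instead of using a dummy node.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a queue node. */
struct node {
	_Atomic(struct node *) next;
};

/* A slot that holds a node pointer: either the head or some node's next. */
typedef _Atomic(struct node *) node_slot;

/* Hypothetical stand-in for the callback queue. */
struct queue {
	node_slot head;            /* first node, or NULL when empty */
	_Atomic(node_slot *) tail; /* slot that the next append must fill */
};

void node_init(struct node *n)
{
	atomic_init(&n->next, NULL);
}

void queue_init(struct queue *q)
{
	atomic_init(&q->head, NULL);
	atomic_init(&q->tail, &q->head);
}

/* Append a chain starting at first and ending at the node that owns last_slot. */
void queue_splice(struct queue *q, struct node *first, node_slot *last_slot)
{
	node_slot *prev_slot;

	assert(first != NULL && last_slot != NULL);
	/* Step 1: exchange on the queue's tail, never on the queue pointer. */
	prev_slot = atomic_exchange(&q->tail, last_slot);
	/* Step 2: link the chain in by filling the previous end slot. */
	atomic_store(prev_slot, first);
}

/* Walk the queue and count nodes (single-threaded helper for the sketch). */
size_t queue_length(struct queue *q)
{
	size_t n = 0;
	struct node *p;

	for (p = atomic_load(&q->head); p != NULL; p = atomic_load(&p->next))
		n++;
	return n;
}
```

In this model the pre-fix code corresponds to doing the exchange in step 1 on the address of the global queue pointer instead of on the tail slot, so the appended callbacks were never linked behind the existing ones.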

I wonder whether we should simply do a new release with call_rcu using
wfcqueue and tell people to upgrade, or create a stable branch with
this fix.

Thoughts?

Thanks,
Mathieu
---
diff --git a/urcu-call-rcu-impl.h b/urcu-call-rcu-impl.h
index 13b24ff..b205229 100644
--- a/urcu-call-rcu-impl.h
+++ b/urcu-call-rcu-impl.h
@@ -647,8 +647,9 @@ void call_rcu_data_free(struct call_rcu_data *crdp)
 		/* Create default call rcu data if need be */
 		(void) get_default_call_rcu_data();
 		cbs_endprev = (struct cds_wfq_node **)
-			uatomic_xchg(&default_call_rcu_data, cbs_tail);
-		*cbs_endprev = cbs;
+			uatomic_xchg(&default_call_rcu_data->cbs.tail,
+				cbs_tail);
+		_CMM_STORE_SHARED(*cbs_endprev, cbs);
 		uatomic_add(&default_call_rcu_data->qlen,
 			    uatomic_read(&crdp->qlen));
 		wake_call_rcu_thread(default_call_rcu_data);
>
> Thanks,
>
> Mathieu
>
> >
> > Thanx, Paul
> >
> > > So I expect that commit
> > >
> > > commit 5161f31e09ce33dd79afad8d08a2372fbf1c4fbe
> > > Author: Mathieu Desnoyers <mathieu.desnoyers at efficios.com>
> > > Date: Tue Sep 25 10:50:49 2012 -0500
> > >
> > > call_rcu: use wfcqueue, eliminate false-sharing
> > >
> > > Eliminate false-sharing between call_rcu (enqueuer) and worker threads
> > > on the queue head and tail.
> > >
> > > Acked-by: Paul E. McKenney <paulmck at linux.vnet.ibm.com>
> > > Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers at efficios.com>
> > >
> > > could have managed to fix the issue, or changed the timing enough that it
> > > no longer reproduces. I'll continue investigating.
> > >
> > > Thanks,
> > >
> > > Mathieu
> > >
> > >
> > > >
> > > > *** glibc detected *** /home/laijs/work/userspace-rcu/tests/.libs/lt-test_urcu_lfs: double free or corruption (out): 0x00007f20955dfbb0 ***
> > > > ======= Backtrace: =========
> > > > /lib64/libc.so.6[0x37ee676d63]
> > > > /home/laijs/work/userspace-rcu/tests/.libs/lt-test_urcu_lfs[0x4024f5]
> > > > /lib64/libpthread.so.0[0x37eda06ccb]
> > > > /lib64/libc.so.6(clone+0x6d)[0x37ee6e0c2d]
> > > > ======= Memory map: ========
> > > > 00400000-00405000 r-xp 00000000 08:08 6031723 /home/laijs/work/userspace-rcu/tests/.libs/lt-test_urcu_lfs
> > > > 00605000-00606000 rw-p 00005000 08:08 6031723 /home/laijs/work/userspace-rcu/tests/.libs/lt-test_urcu_lfs
> > > > 00606000-00616000 rw-p 00000000 00:00 0
> > > > 00e9c000-03482000 rw-p 00000000 00:00 0 [heap]
> > > > 37ed600000-37ed61f000 r-xp 00000000 08:01 1507421 /lib64/ld-2.13.so
> > > > 37ed81e000-37ed81f000 r--p 0001e000 08:01 1507421 /lib64/ld-2.13.so
> > > > 37ed81f000-37ed820000 rw-p 0001f000 08:01 1507421 /lib64/ld-2.13.so
> > > > 37ed820000-37ed821000 rw-p 00000000 00:00 0
> > > > 37eda00000-37eda17000 r-xp 00000000 08:01 1507427 /lib64/libpthread-2.13.so
> > > > 37eda17000-37edc16000 ---p 00017000 08:01 1507427 /lib64/libpthread-2.13.so
> > > > 37edc16000-37edc17000 r--p 00016000 08:01 1507427 /lib64/libpthread-2.13.so
> > > > 37edc17000-37edc18000 rw-p 00017000 08:01 1507427 /lib64/libpthread-2.13.so
> > > > 37edc18000-37edc1c000 rw-p 00000000 00:00 0
> > > > 37ee600000-37ee791000 r-xp 00000000 08:01 1507423 /lib64/libc-2.13.so
> > > > 37ee791000-37ee991000 ---p 00191000 08:01 1507423 /lib64/libc-2.13.so
> > > > 37ee991000-37ee995000 r--p 00191000 08:01 1507423 /lib64/libc-2.13.so
> > > > 37ee995000-37ee996000 rw-p 00195000 08:01 1507423 /lib64/libc-2.13.so
> > > > 37ee996000-37ee99c000 rw-p 00000000 00:00 0
> > > > 37f0e00000-37f0e15000 r-xp 00000000 08:01 1507437 /lib64/libgcc_s-4.5.1-20100924.so.1
> > > > 37f0e15000-37f1014000 ---p 00015000 08:01 1507437 /lib64/libgcc_s-4.5.1-20100924.so.1
> > > > 37f1014000-37f1015000 rw-p 00014000 08:01 1507437 /lib64/libgcc_s-4.5.1-20100924.so.1
> > > > 7f1ee4000000-7f1ee4029000 rw-p 00000000 00:00 0
> > > > 7f1ee4029000-7f1ee8000000 ---p 00000000 00:00 0
> > > > 7f1eec000000-7f1eee039000 rw-p 00000000 00:00 0
> > > > 7f1eee039000-7f1ef0000000 ---p 00000000 00:00 0
> > > > 7f1ef4000000-7f1ef4029000 rw-p 00000000 00:00 0
> > > > 7f1ef4029000-7f1ef8000000 ---p 00000000 00:00 0
> > > > 7f1efc000000-7f1efc029000 rw-p 00000000 00:00 0
> > > > 7f1efc029000-7f1f00000000 ---p 00000000 00:00 0
> > > > 7f1f04000000-7f1f060b8000 rw-p 00000000 00:00 0
> > > > 7f1f060b8000-7f1f08000000 ---p 00000000 00:00 0
> > > > 7f1f0c000000-7f1f0c029000 rw-p 00000000 00:00 0
> > > > 7f1f0c029000-7f1f10000000 ---p 00000000 00:00 0
> > > > 7f1f14000000-7f1f14029000 rw-p 00000000 00:00 0
> > > > 7f1f14029000-7f1f18000000 ---p 00000000 00:00 0
> > > > 7f1f1c000000-7f1f1c029000 rw-p 00000000 00:00 0
> > > > 7f1f1c029000-7f1f20000000 ---p 00000000 00:00 0
> > > > 7f1f24000000-7f1f24029000 rw-p 00000000 00:00 0
> > > > 7f1f24029000-7f1f28000000 ---p 00000000 00:00 0
> > > > 7f1f2c000000-7f1f2c029000 rw-p 00000000 00:00 0
> > > > 7f1f2c029000-7f1f30000000 ---p 00000000 00:00 0
> > > > 7f1f34000000-7f1f34029000 rw-p 00000000 00:00 0
> > > > 7f1f34029000-7f1f38000000 ---p 00000000 00:00 0
> > > > 7f1f3c000000-7f1f3c029000 rw-p 00000000 00:00 0
> > > > 7f1f3c029000-7f1f40000000 ---p 00000000 00:00 0
> > > > 7f1f44000000-7f1f44029000 rw-p 00000000 00:00 0
> > > > 7f1f44029000-7f1f48000000 ---p 00000000 00:00 0
> > > > 7f1f4c000000-7f1f4c029000 rw-p 00000000 00:00 0
> > > > 7f1f4c029000-7f1f50000000 ---p 00000000 00:00 0
> > > > 7f1f54000000-7f1f54029000 rw-p 00000000 00:00 0
> > > > 7f1f54029000-7f1f58000000 ---p 00000000 00:00 0
> > > > 7f1f5c000000-7f1f5c029000 rw-p 00000000 00:00 0
> > > > 7f1f5c029000-7f1f60000000 ---p 00000000 00:00 0
> > > > 7f1f64000000-7f1f64029000 rw-p 00000000 00:00 0
> > > > 7f1f64029000-7f1f68000000 ---p 00000000 00:00 0
> > > > 7f1f6c000000-7f1f6c029000 rw-p 00000000 00:00 0
> > > > 7f1f6c029000-7f1f70000000 ---p 00000000 00:00 0
> > > > 7f1f74000000-7f1f74029000 rw-p 00000000 00:00 0
> > > > 7f1f74029000-7f1f78000000 ---p 00000000 00:00 0
> > > > 7f1f7c000000-7f1f7c029000 rw-p 00000000 00:00 0
> > > > 7f1f7c029000-7f1f80000000 ---p 00000000 00:00 0
> > > > 7f1f84000000-7f1f84029000 rw-p 00000000 00:00 0
> > > > 7f1f84029000-7f1f88000000 ---p 00000000 00:00 0
> > > > 7f1f8c000000-7f1f8c029000 rw-p 00000000 00:00 0
> > > > 7f1f8c029000-7f1f90000000 ---p 00000000 00:00 0
> > > > 7f1f94000000-7f1f94029000 rw-p 00000000 00:00 0
> > > > 7f1f94029000-7f1f98000000 ---p 00000000 00:00 0
> > > > 7f1f9c000000-7f1f9c029000 rw-p 00000000 00:00 0
> > > > 7f1f9c029000-7f1fa0000000 ---p 00000000 00:00 0
> > > > 7f1fa4000000-7f1fa60ac000 rw-p 00000000 00:00 0
> > > > 7f1fa60ac000-7f1fa8000000 ---p 00000000 00:00 0
> > > > 7f1fac000000-7f1fac029000 rw-p 00000000 00:00 0
> > > > 7f1fac029000-7f1fb0000000 ---p 00000000 00:00 0
> > > > 7f1fb4000000-7f1fb4029000 rw-p 00000000 00:00 0
> > > > 7f1fb4029000-7f1fb8000000 ---p 00000000 00:00 0
> > > > 7f1fbc000000-7f1fbc029000 rw-p 00000000 00:00 0
> > > > 7f1fbc029000-7f1fc0000000 ---p 00000000 00:00 0
> > > > 7f1fc4000000-7f1fc4029000 rw-p 00000000 00:00 0
> > > > 7f1fc4029000-7f1fc8000000 ---p 00000000 00:00 0
> > > > 7f1fcc000000-7f1fce0a1000 rw-p 00000000 00:00 0
> > > > 7f1fce0a1000-7f1fd0000000 ---p 00000000 00:00 0
> > > > 7f1fd4000000-7f1fd4029000 rw-p 00000000 00:00 0
> > > > 7f1fd4029000-7f1fd8000000 ---p 00000000 00:00 0
> > > > 7f1fdc000000-7f1fde06b000 rw-p 00000000 00:00 0
> > > > 7f1fde06b000-7f1fe0000000 ---p 00000000 00:00 0
> > > > 7f1fe4000000-7f1fe4029000 rw-p 00000000 00:00 0
> > > > 7f1fe4029000-7f1fe8000000 ---p 00000000 00:00 0
> > > > 7f1fec000000-7f1fede38000 rw-p 00000000 00:00 0
> > > > 7f1fede38000-7f1ff0000000 ---p 00000000 00:00 0
> > > > 7f1ff4000000-7f1ff4029000 rw-p 00000000 00:00 0
> > > > 7f1ff4029000-7f1ff8000000 ---p 00000000 00:00 0
> > > > 7f1ffc000000-7f1ffc029000 rw-p 00000000 00:00 0
> > > > 7f1ffc029000-7f2000000000 ---p 00000000 00:00 0
> > > > 7f2004000000-7f20060c6000 rw-p 00000000 00:00 0
> > > > 7f20060c6000-7f2008000000 ---p 00000000 00:00 0
> > > > 7f200c000000-7f200c029000 rw-p 00000000 00:00 0
> > > > 7f200c029000-7f2010000000 ---p 00000000 00:00 0
> > > > 7f2014000000-7f2014029000 rw-p 00000000 00:00 0
> > > > 7f2014029000-7f2018000000 ---p 00000000 00:00 0
> > > > 7f201c000000-7f201c029000 rw-p 00000000 00:00 0
> > > > 7f201c029000-7f2020000000 ---p 00000000 00:00 0
> > > > 7f2024000000-7f2024029000 rw-p 00000000 00:00 0
> > > > 7f2024029000-7f2028000000 ---p 00000000 00:00 0
> > > > 7f202c000000-7f202c029000 rw-p 00000000 00:00 0
> > > > 7f202c029000-7f2030000000 ---p 00000000 00:00 0
> > > > 7f2034000000-7f2034029000 rw-p 00000000 00:00 0
> > > > 7f2034029000-7f2038000000 ---p 00000000 00:00 0
> > > > 7f203c000000-7f203c029000 rw-p 00000000 00:00 0
> > > > 7f203c029000-7f2040000000 ---p 00000000 00:00 0
> > > > 7f2044000000-7f2044029000 rw-p 00000000 00:00 0
> > > > 7f2044029000-7f2048000000 ---p 00000000 00:00 0
> > > > 7f204c000000-7f204c029000 rw-p 00000000 00:00 0
> > > > 7f204c029000-7f2050000000 ---p 00000000 00:00 0
> > > > 7f2054000000-7f2054029000 rw-p 00000000 00:00 0
> > > > 7f2054029000-7f2058000000 ---p 00000000 00:00 0
> > > > 7f205c000000-7f205c029000 rw-p 00000000 00:00 0
> > > > 7f205c029000-7f2060000000 ---p 00000000 00:00 0
> > > > 7f2064000000-7f2064029000 rw-p 00000000 00:00 0
> > > > 7f2064029000-7f2068000000 ---p 00000000 00:00 0
> > > > 7f206c000000-7f206c029000 rw-p 00000000 00:00 0
> > > > 7f206c029000-7f2070000000 ---p 00000000 00:00 0
> > > > 7f2074000000-7f2074029000 rw-p 00000000 00:00 0
> > > > 7f2074029000-7f2078000000 ---p 00000000 00:00 0
> > > > 7f207c000000-7f207e0bc000 rw-p 00000000 00:00 0
> > > > 7f207e0bc000-7f2080000000 ---p 00000000 00:00 0
> > > > 7f2084000000-7f2084029000 rw-p 00000000 00:00 0
> > > > 7f2084029000-7f2088000000 ---p 00000000 00:00 0
> > > > 7f208c000000-7f208c029000 rw-p 00000000 00:00 0
> > > > 7f208c029000-7f2090000000 ---p 00000000 00:00 0
> > > > 7f2094000000-7f20960c6000 rw-p 00000000 00:00 0
> > > > 7f20960c6000-7f2098000000 ---p 00000000 00:00 0
> > > > 7f209c000000-7f209c029000 rw-p 00000000 00:00 0
> > > > 7f209c029000-7f20a0000000 ---p 00000000 00:00 0
> > > > 7f20a4000000-7f20a4029000 rw-p 00000000 00:00 0
> > > > 7f20a4029000-7f20a8000000 ---p 00000000 00:00 0
> > > > 7f20ac000000-7f20ac029000 rw-p 00000000 00:00 0
> > > > 7f20ac029000-7f20b0000000 ---p 00000000 00:00 0
> > > > 7f20b4000000-7f20b4029000 rw-p 00000000 00:00 0
> > > > 7f20b4029000-7f20b8000000 ---p 00000000 00:00 0
> > > > 7f20bc000000-7f20bc029000 rw-p 00000000 00:00 0
> > > > 7f20bc029000-7f20c0000000 ---p 00000000 00:00 0
> > > > 7f20c4000000-7f20c4029000 rw-p 00000000 00:00 0
> > > > 7f20c4029000-7f20c8000000 ---p 00000000 00:00 0
> > > > 7f20c8ffa000-7f20c8ffb000 ---p 00000000 00:00 0
> > > > 7f20c8ffb000-7f20c97fb000 rw-p 00000000 00:00 0 [stack:10274]
> > > > 7f20c97fb000-7f20c97fc000 ---p 00000000 00:00 0
> > > > 7f20c97fc000-7f20c9ffc000 rw-p 00000000 00:00 0
> > > > 7f20c9ffc000-7f20c9ffd000 ---p 00000000 00:00 0
> > > > 7f20c9ffd000-7f20ca7fd000 rw-p 00000000 00:00 0
> > > > 7f20ca7fd000-7f20ca7fe000 ---p 00000000 00:00 0
> > > > 7f20ca7fe000-7f20caffe000 rw-p 00000000 00:00 0
> > > > 7f20cc000000-7f20cc029000 rw-p 00000000 00:00 0
> > > > 7f20cc029000-7f20d0000000 ---p 00000000 00:00 0
> > > > 7f20d4000000-7f20d4029000 rw-p 00000000 00:00 0
> > > > 7f20d4029000-7f20d8000000 ---p 00000000 00:00 0
> > > > 7f20dc000000-7f20dc029000 rw-p 00000000 00:00 0
> > > > 7f20dc029000-7f20e0000000 ---p 00000000 00:00 0
> > > > 7f210d9dd000-7f210d9de000 ---p 00000000 00:00 0
> > > > 7f210d9de000-7f210e1de000 rw-p 00000000 00:00 0 [stack:10160]
> > > > 7f210e1de000-7f210e1df000 ---p 00000000 00:00 0
> > > > 7f210e1df000-7f210e9df000 rw-p 00000000 00:00 0 [stack:10159]
> > > > 7f210e9df000-7f210e9e0000 ---p 00000000 00:00 0
> > > > 7f210e9e0000-7f210f1e0000 rw-p 00000000 00:00 0
> > > > 7f210f1e0000-7f210f1e1000 ---p 00000000 00:00 0
> > > > 7f210f1e1000-7f210f9e4000 rw-p 00000000 00:00 0
> > > > 7f210fa00000-7f210fa01000 rw-p 00000000 00:00 0
> > > > 7f210fa01000-7f210fa02000 r-xp 00000000 08:08 6029369 /home/laijs/work/userspace-rcu/.libs/liburcu-common.so.1.0.0
> > > > 7f210fa02000-7f210fc02000 ---p 00001000 08:08 6029369 /home/laijs/work/userspace-rcu/.libs/liburcu-common.so.1.0.0
> > > > 7f210fc02000-7f210fc03000 rw-p 00001000 08:08 6029369 /home/laijs/work/userspace-rcu/.libs/liburcu-common.so.1.0.0
> > > > 7f210fc03000-7f210fc04000 rw-p 00000000 00:00 0
> > > > 7f210fc04000-7f210fc0a000 r-xp 00000000 08:08 6029586 /home/laijs/work/userspace-rcu/.libs/liburcu-cds.so.1.0.0
> > > > 7f210fc0a000-7f210fe09000 ---p 00006000 08:08 6029586 /home/laijs/work/userspace-rcu/.libs/liburcu-cds.so.1.0.0
> > > > 7f210fe09000-7f210fe0a000 rw-p 00005000 08:08 6029586 /home/laijs/work/userspace-rcu/.libs/liburcu-cds.so.1.0.0
> > > > 7f210fe0a000-7f210fe0b000 rw-p 00000000 00:00 0
> > > > 7fff7c648000-7fff7c669000 rw-p 00000000 00:00 0 [stack]
> > > > 7fff7c715000-7fff7c716000 r-xp 00000000 00:00 0 [vdso]
> > > > ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]
> > > >
> > > > _______________________________________________
> > > > lttng-dev mailing list
> > > > lttng-dev at lists.lttng.org
> > > > http://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev
> > >
> > > --
> > > Mathieu Desnoyers
> > > Operating System Efficiency R&D Consultant
> > > EfficiOS Inc.
> > > http://www.efficios.com
> > >
> >
--
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com