[lttng-dev] Question about lock in synchronize_rcu implementation of URCU
Boqun Feng
boqun.feng at gmail.com
Thu Apr 28 12:44:01 UTC 2016
Hi Paul and Yuxin,
On Wed, Apr 27, 2016 at 09:23:27PM -0700, Paul E. McKenney wrote:
> Try building without it and see what happens when you run the tests.
>
I've run a 'regtest' with the following modification, out of curiosity:
diff --git a/urcu.c b/urcu.c
index a5568bdbd075..9dc3c9feae56 100644
--- a/urcu.c
+++ b/urcu.c
@@ -398,8 +398,6 @@ void synchronize_rcu(void)
/* We won't need to wake ourself up */
urcu_wait_set_state(&wait, URCU_WAIT_RUNNING);
- mutex_lock(&rcu_gp_lock);
-
/*
* Move all waiters into our local queue.
*/
@@ -480,7 +478,6 @@ void synchronize_rcu(void)
smp_mb_master();
out:
mutex_unlock(&rcu_registry_lock);
- mutex_unlock(&rcu_gp_lock);
/*
* Wakeup waiters only after we have completed the grace period
And guess what, the result of the test was:
Test Summary Report
-------------------
./run-urcu-tests.sh 1 (Wstat: 0 Tests: 979 Failed: 18)
  Failed tests:  30, 45, 60, 75, 90, 103, 105, 120, 135
                 150, 165, 180, 195, 210, 225, 240, 253
                 255
Files=2, Tests=996, 1039 wallclock secs ( 0.55 usr 0.04 sys + 8981.02 cusr 597.64 csys = 9579.25 CPU)
Result: FAIL
And test case 30, for example, is something like:
tests/benchmark/test_urcu_mb <nreaders> <nwriters> 1 -d 0 -b 32768
and it failed because:
lt-test_urcu_mb: test_urcu.c:183: thr_reader: Assertion `*local_ptr == 8' failed.
zsh: abort (core dumped) ~/userspace-rcu/tests/benchmark/test_urcu_mb 4 4 1 -d 0 -b 32768
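For context, the reader and writer loops in that benchmark boil down to roughly
the following. This is only a simplified sketch of tests/benchmark/test_urcu.c
(thread registration, the run loop and the pointer poisoning are omitted, and
reader_iteration/writer_iteration are just names I made up for the loop bodies):

#include <assert.h>
#include <stdlib.h>
#include <urcu.h>	/* the failing binary is the mb flavor */

static int *test_rcu_pointer;

static void reader_iteration(void)	/* body of thr_reader's loop */
{
	int *local_ptr;

	rcu_read_lock();
	local_ptr = rcu_dereference(test_rcu_pointer);
	if (local_ptr)
		assert(*local_ptr == 8);	/* the assertion that fires above */
	rcu_read_unlock();
}

static void writer_iteration(void)	/* body of thr_writer's loop */
{
	int *new, *old;

	new = malloc(sizeof(int));
	*new = 8;
	old = rcu_xchg_pointer(&test_rcu_pointer, new);
	synchronize_rcu();	/* must not return while a reader still holds 'old' */
	if (old)
		free(old);
}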
So I think what was going on here was:
CPU 0 (reader)                  CPU 1 (writer)                      CPU 2 (writer)
==============                  ==============                      ==============
rcu_read_lock();
local_ptr = rcu_dereference(
    test_rcu_pointer);
// local_ptr == old
                                                                    new = malloc(sizeof(int));
                                                                    *new = 8;
                                                                    old = rcu_xchg_pointer(
                                                                        &test_rcu_pointer, new);
                                synchronize_rcu():
                                  urcu_wait_add(); // return 0
                                  urcu_move_waiters();
                                  // @gp_waiters is empty, so
                                  // the next urcu_wait_add()
                                  // will return 0

                                                                    synchronize_rcu():
                                                                      urcu_wait_add(); // return 0

                                  mutex_lock(&rcu_registry_lock);
                                  wait_for_readers();
                                  // moves the registered reader
                                  // off @registry, releases
                                  // rcu_registry_lock and
                                  // waits via poll()

                                                                      mutex_lock(&rcu_registry_lock);
                                                                      wait_for_readers();
                                                                      // @registry is empty!
                                                                      // we are so lucky
                                                                      return;
                                                                    if (old)
                                                                      free(old);
rcu_read_unlock();
assert(*local_ptr == 8);
// but local_ptr (i.e. old)
// is already freed
So the point is that two writers can call synchronize_rcu() at the same time
without either of them returning early (both enter the slow path and try to
perform a grace period themselves), so the rcu_gp_lock is necessary to
serialize them in this case.
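For reference, this is roughly the shape of the slow path with the lock in
place. It is a condensed paraphrase from my reading of urcu.c, not the exact
code: the wait-node and waiter-queue declarations, the two-phase wait and the
memory barriers are elided, so it will not compile as-is.

void synchronize_rcu(void)
{
	if (urcu_wait_add(&gp_waiters, &wait) != 0) {
		/* Not first in queue: the GP performer will wake us up. */
		urcu_adaptative_busy_wait(&wait);
		return;
	}
	/* First in queue: we will perform the grace period ourselves. */
	urcu_wait_set_state(&wait, URCU_WAIT_RUNNING);

	mutex_lock(&rcu_gp_lock);	/* serializes concurrent GP performers */

	/* Move all waiters into our local queue. */
	urcu_move_waiters(&waiters, &gp_waiters);

	mutex_lock(&rcu_registry_lock);
	if (cds_list_empty(&registry))
		goto out;		/* CPU 2 takes this path in the scenario above */

	/*
	 * wait_for_readers() temporarily moves readers off &registry onto
	 * local lists and drops rcu_registry_lock while polling; that is
	 * why an unserialized second GP performer can find &registry empty.
	 */
	/* ... two-phase wait_for_readers() + smp_mb_master() elided ... */
out:
	mutex_unlock(&rcu_registry_lock);
	mutex_unlock(&rcu_gp_lock);

	/* Wake up the waiters we queued behind this grace period. */
	urcu_wake_all_waiters(&waiters);
}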
(Cc Mathieu)
But this is only my understanding and I'm learning the URCU code too ;-)
Regards,
Boqun
> Might well be that it is unnecessary, but I will defer to Mathieu
> on that point.
>
> Thanx, Paul
>
> On Wed, Apr 27, 2016 at 10:18:04PM -0400, Yuxin Ren wrote:
> > As they don't concurrently perform grace periods, why do we use the rcu_gp_lock?
> >
> > Thank you.
> > Yuxin
> >
> > On Wed, Apr 27, 2016 at 10:08 PM, Paul E. McKenney
> > <paulmck at linux.vnet.ibm.com> wrote:
> > > On Wed, Apr 27, 2016 at 09:34:16PM -0400, Yuxin Ren wrote:
> > >> Hi,
> > >>
> > >> I am learning the URCU code.
> > >>
> > >> Why do we need rcu_gp_lock in synchronize_rcu?
> > >> https://github.com/urcu/userspace-rcu/blob/master/urcu.c#L401
> > >>
> > >> In the comment, it says this lock ensures mutual exclusion between
> > >> threads calling synchronize_rcu().
> > >> But only the first thread added to waiter queue can proceed to detect
> > >> grace period.
> > >> How can multiple threads concurrently perform the grace period?
> > >
> > > They don't concurrently perform grace periods, and it would be wasteful
> > > for them to do so. Instead, the first one performs the grace period,
> > > and all that were waiting at the time it started get the benefit of that
> > > same grace period.
> > >
> > > Any that arrived after the first thread started performing the first
> > > grace period are served by whichever of them performs the second
> > > grace period.
> > >
> > > Thanx, Paul
> > >
> >
>