[lttng-dev] ***UNCHECKED*** Re: urcu workqueue thread uses 99% of cpu while workqueue is empty

Mathieu Desnoyers mathieu.desnoyers at efficios.com
Tue Jun 21 09:12:38 EDT 2022


----- On Jun 20, 2022, at 11:52 PM, Minlan Wang wangminlan at szsandstone.com wrote:

> Hi, Mathieu,
> On Fri, Jun 17, 2022 at 09:37:23AM -0400, Mathieu Desnoyers wrote:
>> Can you try the following patch on top of v0.12.1 + your debug patch ?
>> 
>> diff --git a/src/workqueue.c b/src/workqueue.c
>> index eed1e21e..19fc89c8 100644
>> --- a/src/workqueue.c
>> +++ b/src/workqueue.c
>> @@ -142,6 +142,7 @@ static void futex_wait(int32_t *futex)
>>  
>>  		return;
>>  	}
>> +wait:
>>  	while ((ret = futex_async(futex, FUTEX_WAIT, -1, NULL, NULL, 0))) {
>>  		err = errno;
>>  		fprintf(stderr, "%lu: wq %p: %s failed, errno %d\n",
>> @@ -161,7 +162,16 @@ static void futex_wait(int32_t *futex)
>>  
>>  	fprintf(stderr, "%lu: wq %p: %s wait return %d\n",
>>  		pthread_self(), caa_container_of(futex, struct urcu_workqueue, futex), __func__,
>>  		ret);
>> -
>> +	/*
>> +	 * Prior queued wakeups can cause futex wait to return even
>> +	 * though the futex value is still -1. If this happens, wait
>> +	 * again.
>> +	 */
>> +	if (uatomic_read(futex) == -1) {
>> +		fprintf(stderr, "%lu: wq %p: %s futex == -1, wait again\n",
>> +			pthread_self(), caa_container_of(futex, struct urcu_workqueue, futex),
>> __func__);
>> +		goto wait;
>> +	}
>>  }
>>  
>>  static void futex_wake_up(int32_t *futex)
>> 
> I tried this patch on top of my debug patch, the issue disappeared.
> The log shows that "futex_wait futex == -1, wait again" did happen several
> times.
> I put the complete log in the attachment.

This means we appear to be on the right track. Good!

> 
>> 
>> We should not rely too much on the ordering of fprintf stderr output between
>> threads,
>> because those can be interleaved in any order (each fprintf to stderr is
>> unbuffered).
>> If we want more precision in the order of events I would recommend using
>> LTTng-UST for
>> fine-grained user-space tracing. This might explain why the futex has an
>> unexpected value
>> at line 2099.
> I tried this patch on top of previous 2 patches to use lttng_ust_tracepoint in
> workqueue.c, but always failed in compile example code with this error:
> ---
> make[5]: Entering directory
> `/home/mlwang/debug/userspace-rcu/doc/examples/rculfqueue'
>  CC       cds_lfq_enqueue.o
>  CCLD     cds_lfq_enqueue
> ../../../src/.libs//liburcu-cds.so: undefined reference to
> `lttng_ust_tracepoint_provider_workqueue'
> ---
> the patch is as following:
[...]
> --- a/src/workqueue.h
> +++ b/src/workqueue.h
> @@ -28,6 +28,7 @@
> #include <pthread.h>
> 
> #include <urcu/wfcqueue.h>
> +#include "workqueue_tp.h"
> 
> #ifdef __cplusplus
> extern "C" {
> diff --git a/src/workqueue_tp.c b/src/workqueue_tp.c
> new file mode 100644
> index 0000000..a5f9586
> --- /dev/null
> +++ b/src/workqueue_tp.c
> @@ -0,0 +1,4 @@
> +#define LTTNG_UST_TRACEPOINT_CREATE_PROBES
> +#define LTTNG_UST_TRACEPOINT_DEFINE
> +
> +#include "workqueue_tp.h"
> diff --git a/src/workqueue_tp.h b/src/workqueue_tp.h
> new file mode 100644
> index 0000000..e2bce31
> --- /dev/null
> +++ b/src/workqueue_tp.h
> @@ -0,0 +1,82 @@
> +#undef	LTTNG_UST_TRACEPOINT_PROVIDER
> +#define	LTTNG_UST_TRACEPOINT_PROVIVER workqueue

If you change "PROVIVER" into "PROVIDER" here, does it fix your issue ?

If that's not it, it should be something relatively simple with the build dependencies. Let me know
if it still does not work.

[...]

> --
> 1.8.3.1
> 
> Is "perf probe" accurate in tracing the futex event here?  Maybe i can try that.

I would prefer to use lttng-ust here to keep the footprint of the tracer minimal, so we do not
change the behavior too much by calling into the kernel.

Then we'll be able to correlate traces from lttng-ust and lttng-modules (kernel tracer) and
figure out what is happening both in user-space and kernel space, especially around futex and
scheduler activity within the kernel.

Thanks,

Mathieu

> 
> B.R
> Minlan

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com


More information about the lttng-dev mailing list