[lttng-dev] Re: urcu workqueue thread uses 99% of cpu while workqueue is empty

Mathieu Desnoyers mathieu.desnoyers at efficios.com
Wed Jun 22 16:52:00 EDT 2022


----- On Jun 22, 2022, at 4:28 PM, Mathieu Desnoyers mathieu.desnoyers at efficios.com wrote:

> ----- On Jun 22, 2022, at 9:19 AM, Mathieu Desnoyers
> mathieu.desnoyers at efficios.com wrote:
> 
>> ----- On Jun 22, 2022, at 3:45 AM, Minlan Wang wangminlan at szsandstone.com wrote:
>> 
>> [...]
>> 
>> Hi Minlan,
>> 
>>> --
>>> 1.8.3.1
>>> 
>>> 
>>> And the lttng session is configured to trace these events:
>>> kernel: syscall futex
>> 
>> On the kernel side, in addition to the syscall futex, I would really like to see
>> what happens in the scheduler, mainly the wait/wakeup tracepoints. This can be
>> added by using:
>> 
>> lttng enable-event -k 'sched_*'
>> 
>> This should help us confirm whether we indeed have a situation where queued
>> wake ups happen to wake up a wait happening only later, which is unexpected by
>> the current liburcu userspace code.
>> 
>> [...]
>> 
>>> ---
>>> 
>>> The babeltrace output of this session is pretty big, 6 MB in size, so I put it
>>> in the attachment trace_0622.tar.bz2.
>>> Let me know if your mailbox can't handle such a big attachment.
>> 
>> It would be even better if you can share the binary trace, because then it's
>> easy to load it in Trace Compass, cut away time ranges that don't matter, and
>> do lots of other useful stuff.
> 
> I just found the relevant snippet of documentation in futex(2):
> 
>       FUTEX_WAIT
>              Returns 0 if the caller was woken up.  Note that a  wake-up  can
>              also  be caused by common futex usage patterns in unrelated code
>              that happened to have previously used the  futex  word's  memory
>              location  (e.g., typical futex-based implementations of Pthreads
>              mutexes can cause this under some conditions).  Therefore,
>              callers should always conservatively assume that a return value of 0
>              can mean a spurious wake-up, and  use  the  futex  word's  value
>              (i.e.,  the user-space synchronization scheme) to decide whether
>              to continue to block or not.
> 
> I'm pretty sure this is what is happening here.

Here is the series of patches for review on gerrit:

remote:   https://review.lttng.org/c/userspace-rcu/+/8441 Fix: workqueue: futex wait: handle spurious futex wakeups [NEW]
remote:   https://review.lttng.org/c/userspace-rcu/+/8442 Fix: urcu: futex wait: handle spurious futex wakeups [NEW]
remote:   https://review.lttng.org/c/userspace-rcu/+/8443 Fix: call_rcu: futex wait: handle spurious futex wakeups [NEW]
remote:   https://review.lttng.org/c/userspace-rcu/+/8444 Fix: urcu-wait: futex wait: handle spurious futex wakeups [NEW]
remote:   https://review.lttng.org/c/userspace-rcu/+/8445 Fix: defer_rcu: futex wait: handle spurious futex wakeups [NEW]
remote:   https://review.lttng.org/c/userspace-rcu/+/8446 Fix: urcu-qsbr: futex wait: handle spurious futex wakeups [NEW]
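
The pattern the man page recommends (treat a 0 return from FUTEX_WAIT as
possibly spurious and let the futex word decide) can be sketched roughly as
below. This is only an illustration, not the code from the patches above: the
helper names (sys_futex, wait_until_woken, wake_waiter) and the -1 = "waiting" /
0 = "awake" encoding of the futex word are assumptions made for this example.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <linux/futex.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: thin wrapper around the raw futex system call. */
static long sys_futex(int32_t *uaddr, int op, int32_t val)
{
	return syscall(SYS_futex, uaddr, op, val, NULL, NULL, 0);
}

/*
 * Hypothetical waiter. Assumed convention for this sketch: the waiter has
 * already stored -1 in *futex_word to announce that it is going to sleep,
 * and the waker stores 0 before issuing FUTEX_WAKE.
 */
static void wait_until_woken(int32_t *futex_word)
{
	assert(futex_word != NULL);

	/* The futex word, not the syscall return value, decides when to stop. */
	while (__atomic_load_n(futex_word, __ATOMIC_SEQ_CST) == -1) {
		if (sys_futex(futex_word, FUTEX_WAIT, -1) == 0)
			continue;	/* Possibly spurious wake-up: re-check the word. */
		switch (errno) {
		case EAGAIN:	/* Word was no longer -1 when the kernel checked. */
		case EINTR:	/* Interrupted by a signal. */
			continue;	/* Re-check the word in both cases. */
		default:
			abort();	/* Unexpected error in this sketch. */
		}
	}
}

/* Hypothetical waker: publish the new state first, then wake one waiter. */
static void wake_waiter(int32_t *futex_word)
{
	assert(futex_word != NULL);

	__atomic_store_n(futex_word, 0, __ATOMIC_SEQ_CST);
	if (sys_futex(futex_word, FUTEX_WAKE, 1) < 0)
		abort();
}
```

With a loop like this, a stray wake-up only costs one extra iteration: the
waiter sees the word is still -1 and goes back to sleep instead of proceeding
as if it had been woken on purpose.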

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

