[lttng-dev] urcu workqueue thread uses 99% of cpu while workqueue is empty
Mathieu Desnoyers
mathieu.desnoyers at efficios.com
Tue Jun 14 09:40:04 EDT 2022
----- On Jun 14, 2022, at 9:39 AM, Mathieu Desnoyers mathieu.desnoyers at efficios.com wrote:
> ----- On Jun 13, 2022, at 11:55 PM, Minlan Wang wangminlan at szsandstone.com
> wrote:
>
>> Hi, Mathieu,
>
> Hi Minlan,
>
> Thanks for the detailed bug report. Can I ask more precisely which commit ID
> of the userspace-rcu stable-2.12 branch you are using ? Typically a
I meant the "stable-0.12" branch here.
Thanks,
Mathieu
> "userspace-rcu-latest-0.12.tar.bz2"
> gets generated from a git tree at a given point in time, but it does not give
> me enough details to know which commit it refers to.
>
> Thanks,
>
> Mathieu
>
>> We are running CentOS 8.2 on an Intel(R) Xeon(R) CPU E5-2630 v4,
>> and using the workqueue interfaces in src/workqueue.h from
>> userspace-rcu-latest-0.12.tar.bz2.
>> Recently, we found that the workqueue thread drives CPU usage up to 99%.
>> After some debugging, we found that the futex in struct urcu_workqueue had
>> reached a very large negative value, e.g., -12484, while qlen, cbs_tail, and
>> cbs_head suggest that the workqueue is empty.
>> We added a watchpoint on workqueue->futex in workqueue_thread(), and got this
>> log when workqueue->futex first reached -2:
>> ...
>> Old value = -1
>> New value = 0
>> 0x00007ffff37c1d6d in futex_wake_up (futex=0x55555f74aa40) at workqueue.c:160
>> 160 in workqueue.c
>> #0 0x00007ffff37c1d6d in futex_wake_up (futex=0x55555f74aa40) at
>> workqueue.c:160
>> #1 0x00007ffff37c2737 in wake_worker_thread (workqueue=0x55555f74aa00) at
>> workqueue.c:324
>> #2 0x00007ffff37c29fb in urcu_workqueue_queue_work (workqueue=0x55555f74aa00,
>> work=0x555566e05e00, func=0x7ffff7523c90 <write_dirty_finish>) at
>> workqueue.c:367
>> #3 0x00007ffff752c520 in aio_complete_cb (ctx=<optimized out>,
>> iocb=<optimized out>, res=<optimized out>, res2=<optimized out>) at
>> bio/aio_bio_adapter.c:152
>> #4 0x00007ffff752c696 in poll_io_complete (arg=0x555562e4f4a0) at
>> bio/aio_bio_adapter.c:289
>> #5 0x00007ffff72e6ea5 in start_thread () from /usr/lib64/libpthread.so.0
>> #6 0x00007ffff415d96d in clone () from /usr/lib64/libc.so.6
>> [Switching to Thread 0x7fffde3f3700 (LWP 821768)]
>> Hardware watchpoint 4: -location workqueue->futex
>>
>> Old value = 0
>> New value = -1
>> 0x00007ffff37c2473 in __uatomic_dec (len=4, addr=0x55555f74aa40) at
>> ../include/urcu/uatomic.h:490
>> 490 ../include/urcu/uatomic.h: No such file or directory.
>> #0 0x00007ffff37c2473 in __uatomic_dec (len=4, addr=0x55555f74aa40) at
>> ../include/urcu/uatomic.h:490
>> #1 workqueue_thread (arg=0x55555f74aa00) at workqueue.c:250
>> #2 0x00007ffff72e6ea5 in start_thread () from /usr/lib64/libpthread.so.0
>> #3 0x00007ffff415d96d in clone () from /usr/lib64/libc.so.6
>> Hardware watchpoint 4: -location workqueue->futex
>>
>> Old value = -1
>> New value = -2
>> 0x00007ffff37c2473 in __uatomic_dec (len=4, addr=0x55555f74aa40) at
>> ../include/urcu/uatomic.h:490
>> 490 in ../include/urcu/uatomic.h
>> #0 0x00007ffff37c2473 in __uatomic_dec (len=4, addr=0x55555f74aa40) at
>> ../include/urcu/uatomic.h:490
>> #1 workqueue_thread (arg=0x55555f74aa00) at workqueue.c:250
>> #2 0x00007ffff72e6ea5 in start_thread () from /usr/lib64/libpthread.so.0
>> #3 0x00007ffff415d96d in clone () from /usr/lib64/libc.so.6
>> Hardware watchpoint 4: -location workqueue->futex
>>
>> Old value = -2
>> New value = -3
>> 0x00007ffff37c2473 in __uatomic_dec (len=4, addr=0x55555f74aa40) at
>> ../include/urcu/uatomic.h:490
>> 490 in ../include/urcu/uatomic.h
>> #0 0x00007ffff37c2473 in __uatomic_dec (len=4, addr=0x55555f74aa40) at
>> ../include/urcu/uatomic.h:490
>> #1 workqueue_thread (arg=0x55555f74aa00) at workqueue.c:250
>> #2 0x00007ffff72e6ea5 in start_thread () from /usr/lib64/libpthread.so.0
>> #3 0x00007ffff415d96d in clone () from /usr/lib64/libc.so.6
>> Hardware watchpoint 4: -location workqueue->futex
>> ...
>>
>> After this, things went wild: workqueue->futex kept dropping to larger
>> negative values, and the workqueue thread ate up the CPU it was running on.
>> This ends only when workqueue->futex gets back to 0.
>>
>> Do you have any idea why this is happening, and how to fix it?
>>
>> B.R
>> Minlan Wang
>
> --
> Mathieu Desnoyers
> EfficiOS Inc.
> http://www.efficios.com
--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
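
For readers following the watchpoint log quoted above, here is a minimal
sketch of how a counter of this kind can run away. It is not the code from
workqueue.c. It assumes a protocol in which the worker decrements the futex
word before sleeping, sleeps only while the word equals -1, and a waker
resets the word to 0 only when it sees -1. All names (model_wq,
model_would_block, model_wake, model_worker_iteration, model_spin_count) are
hypothetical, and no futex system call is made.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the futex word in struct urcu_workqueue. */
struct model_wq {
	_Atomic int32_t futex;
};

/*
 * Assumption: the worker only sleeps while the word is exactly -1.
 * Returns true if the caller would have gone to sleep.
 */
bool model_would_block(struct model_wq *wq)
{
	return atomic_load(&wq->futex) == -1;
}

/*
 * Assumption: a waker resets the word to 0 only if it observes -1.
 * Returns true if the reset (and hence a wake-up) happened.
 */
bool model_wake(struct model_wq *wq)
{
	int32_t expected = -1;

	return atomic_compare_exchange_strong(&wq->futex, &expected, 0);
}

/*
 * One pass of the modeled worker on an empty queue: decrement the
 * word, then report whether the worker would have slept.
 */
bool model_worker_iteration(struct model_wq *wq)
{
	atomic_fetch_sub(&wq->futex, 1);
	return model_would_block(wq);
}

/*
 * Run the modeled worker for a number of passes, with no waker ever
 * resetting the word in between (this is the assumed failure scenario).
 * Returns how many passes did not sleep; stores the final word value.
 */
int model_spin_count(int32_t start, int iterations, int32_t *final)
{
	struct model_wq wq;
	int spins = 0;

	assert(final != NULL);
	atomic_init(&wq.futex, start);
	for (int i = 0; i < iterations; i++) {
		if (!model_worker_iteration(&wq))
			spins++;
	}
	*final = atomic_load(&wq.futex);
	return spins;
}
```

Under these assumptions, model_spin_count(0, 3, &final) leaves final at -3
with two passes that did not sleep, which is the same 0, -1, -2, -3 sequence
as in the log. Once the word is below -1, model_wake() no longer resets it,
so in this model nothing stops the worker from looping. Why the real worker
would resume with the word still at -1 is the open question in this thread;
the sketch does not answer it.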