[lttng-dev] [RFC PATCH v2 10/13] Teardown apps_notify_thread before thread_manage_apps

Jérémie Galarneau jeremie.galarneau at efficios.com
Wed Dec 13 17:58:28 UTC 2017


On 18 September 2017 at 18:52, Jonathan Rajotte
<jonathan.rajotte-julien at efficios.com> wrote:
> RFC:
> This is necessary since we use the ust_app_ht_by_notify_sock to
> perform the cleanup operation. Since both ust_app_unregister and
> ust_app_ust_app_notify_unregister perform a remove of the app on the
> ust_ust_app_by_notify_sock but only ust_app_notify_unregister actually

There are some extra ust_ and ust_app_ on the last two lines ;)

> end up performing a close(call_rcu) on the socket.

a close(call_rcu) ?

>

It took me a while to wrap my head around what the problem is given this
description. Please, correct me if I'm wrong.

Two threads are performing a clean-up step based on the content of the
ust_app_ht_by_notify_sock hash table.

The apps_notify_thread, on shutdown:
  - invokes notify_sock_unregister_all()
    - iterates through the ust_app_ht_by_notify_sock hash table
      - calls ust_app_notify_sock_unregister() on every application
        notification socket
        - removes the app from the ust_app_ht_by_notify_sock hash table
        - closes the notify socket through a deferred call_rcu()


The apps_thread, on shutdown:
  - iterates through the ust_app_ht hash table
    - calls ust_app_unregister() on every application notification
      socket
      - This function's header comment mentions "The socket is already
        closed at this point so no close to sock.", by which the author
        most likely meant that "there is no need to close the socket".

        This is no longer true with your patch (#09). This thread
        only reacts to errors on the application sockets and then
 tears
        down

      - Flushes that app's metadata and buffers
      - Removes the app from the ust_app_ht_by_sock hash table
      - Removes the app from the ust_app_ht_by_notify_sock hash table
      - Removes the app from the ust_app_ht hash table


Given this, it already seems weird that both threads try to remove
the entry from the ust_app_ht_by_notify_sock hash table.

In the case where we are not shutting down,


> Other way to do fix this problem?
>
> Could we simply not remove it on a ust_app_unregister? And always defer
> to the apps_notify_thread for cleanup?
>
> Update the value in the hash table to -1 and emit a close and remove
> from the hash table if the value is -1?
>
> We could also keep a local list of fd in apps_notify_thread and use it for
> cleanup instead of relying on ust_ust_app_by_notify_sock.
>
> I'm not sure what is the best/elegant solution here. I am not a fan of
> the current solution but it working.
>
> Obviously this commit will be reworded and modified accordingly.
>
> Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien at efficios.com>
> ---
>  src/bin/lttng-sessiond/main.c | 55 ++++++++++++++++++++++++-------------------
>  1 file changed, 31 insertions(+), 24 deletions(-)
>
> diff --git a/src/bin/lttng-sessiond/main.c b/src/bin/lttng-sessiond/main.c
> index 4a2a661f..216d0da6 100644
> --- a/src/bin/lttng-sessiond/main.c
> +++ b/src/bin/lttng-sessiond/main.c
> @@ -6209,16 +6209,6 @@ int main(int argc, char **argv)
>         }
>         notification_thread_running = true;
>
> -       /* Create thread to manage application notify socket */
> -       ret = pthread_create(&apps_notify_thread, default_pthread_attr(),
> -                       ust_thread_manage_notify, (void *) NULL);
> -       if (ret) {
> -               errno = ret;
> -               PERROR("pthread_create notify");
> -               retval = -1;
> -               stop_threads();
> -               goto exit_apps_notify;
> -       }
>
>         /* Create thread to manage application socket */
>         ret = pthread_create(&apps_thread, default_pthread_attr(),
> @@ -6231,6 +6221,17 @@ int main(int argc, char **argv)
>                 goto exit_apps;
>         }
>
> +       /* Create thread to manage application notify socket */
> +       ret = pthread_create(&apps_notify_thread, default_pthread_attr(),
> +                       ust_thread_manage_notify, (void *) NULL);
> +       if (ret) {
> +               errno = ret;
> +               PERROR("pthread_create notify");
> +               retval = -1;
> +               stop_threads();
> +               goto exit_apps_notify;
> +       }
> +
>         /* Create thread to dispatch registration */
>         ret = pthread_create(&dispatch_thread, default_pthread_attr(),
>                         thread_dispatch_ust_registration, (void *) NULL);
> @@ -6358,20 +6359,6 @@ exit_reg_apps:
>         }
>
>  exit_dispatch:
> -       /* Instruct the apps thread to quit */
> -       ret = notify_thread_pipe(thread_apps_teardown_trigger_pipe[1]);
> -       if (ret < 0) {
> -               ERR("write error on thread quit pipe");
> -       }
> -
> -       ret = pthread_join(apps_thread, &status);
> -       if (ret) {
> -               errno = ret;
> -               PERROR("pthread_join apps");
> -               retval = -1;
> -       }
> -
> -exit_apps:
>         /* Instruct the apps_notify thread to quit */
>         ret = notify_thread_pipe(thread_apps_notify_teardown_trigger_pipe[1]);
>         if (ret < 0) {
> @@ -6386,6 +6373,26 @@ exit_apps:
>         }
>
>  exit_apps_notify:
> +       /*
> +        * The barrier ensure that all previous resources, notify sockets in
> +        * particular, are freed/closed.
> +        */
> +       rcu_barrier();
> +
> +       /* Instruct the apps thread to quit */
> +       ret = notify_thread_pipe(thread_apps_teardown_trigger_pipe[1]);
> +       if (ret < 0) {
> +               ERR("write error on thread quit pipe");
> +       }
> +
> +       ret = pthread_join(apps_thread, &status);
> +       if (ret) {
> +               errno = ret;
> +               PERROR("pthread_join apps");
> +               retval = -1;
> +       }
> +
> +exit_apps:
>  exit_notification:
>  exit_health:
>  exit_init_data:
> --
> 2.11.0
>



-- 
Jérémie Galarneau
EfficiOS Inc.
http://www.efficios.com


More information about the lttng-dev mailing list