[ltt-dev] [UST PATCH] Make libustctl list only online pids
Mathieu Desnoyers
compudj at krystal.dyndns.org
Mon May 16 11:01:28 EDT 2011
* Nils Carlson (nils.carlson at ericsson.com) wrote:
> On 05/16/2011 04:23 PM, Mathieu Desnoyers wrote:
>> * Nils Carlson (nils.carlson at ericsson.com) wrote:
>>> Previously libustctl would list all pids. This patch changes this
>>> so only online pids are listed. This is done by appending on each
>>> socket name the mtime from the proc/<pid> directory. This way a
>>> socket can be checked to see if the appended mtime matches the
>>> mtime of the proc dir for the current pid, thus allowing us to
>>> distinguish between old and new socket files.
>>
>> Hi Nils,
>>
>> I'm OK with the approach. I have a few cosmetic comments below,
>>
>>>
>>> Signed-off-by: Nils Carlson<nils.carlson at ericsson.com>
>>> ---
>>> libust/tracectl.c | 9 +++-
>>> libustcomm/ustcomm.c | 150 ++++++++++++++++++++++++++++++++++++++++++++-----
>>> libustcomm/ustcomm.h | 7 ++
>>> libustctl/libustctl.c | 18 +-----
>>> 4 files changed, 153 insertions(+), 31 deletions(-)
>>>
>>> diff --git a/libust/tracectl.c b/libust/tracectl.c
>>> index bc0a07c..fb9130c 100644
>>> --- a/libust/tracectl.c
>>> +++ b/libust/tracectl.c
>>> @@ -1228,12 +1228,19 @@ static struct ustcomm_sock * init_app_socket(int epoll_fd)
>>> char *dir_name, *sock_name;
>>> int result;
>>> struct ustcomm_sock *sock = NULL;
>>> + time_t mtime;
>>>
>>> dir_name = ustcomm_user_sock_dir();
>>> if (!dir_name)
>>> return NULL;
>>>
>>> - result = asprintf(&sock_name, "%s/%d", dir_name, (int)getpid());
>>> + mtime = ustcomm_pid_st_mtime(getpid());
>>> + if (!mtime) {
>>> + goto free_dir_name;
>>> + }
>>> +
>>> + result = asprintf(&sock_name, "%s/%d.%ld", dir_name,
>>> + (int)getpid(), (long)mtime);
>>> if (result< 0) {
>>> ERR("string overflow allocating socket name, "
>>> "UST thread bailing");
>>> diff --git a/libustcomm/ustcomm.c b/libustcomm/ustcomm.c
>>> index ed6d8f1..17bd9d9 100644
>>> --- a/libustcomm/ustcomm.c
>>> +++ b/libustcomm/ustcomm.c
>>> @@ -22,6 +22,7 @@
>>> #include<sys/types.h>
>>> #include<signal.h>
>>> #include<errno.h>
>>> +#include<limits.h>
>>> #include<sys/socket.h>
>>> #include<sys/un.h>
>>> #include<unistd.h>
>>> @@ -550,6 +551,125 @@ char *ustcomm_user_sock_dir(void)
>>> return sock_dir;
>>> }
>>>
>>> +static int time_and_pid_from_socket_name(char *sock_name, unsigned long *time,
>>> + pid_t *pid)
>>> +{
>>> + char *saveptr, *pid_m_time_str;
>>> + char *sock_basename = strdup(basename(sock_name));
>>> +
>>> + if (!sock_basename) {
>>> + return -1;
>>> + }
>>> +
>>> + /* This is the pid */
>>> + pid_m_time_str = strtok_r(sock_basename, ".",&saveptr);
>>> + if (!pid_m_time_str) {
>>> + goto out_err;
>>> + }
>>> +
>>> + errno = 0;
>>> + *pid = (pid_t)strtoul(pid_m_time_str, NULL, 10);
>>> + if (errno) {
>>
>> Errno is a TLS variable, while the value returned by strtoul is directly
>> accessible from registers. So I would tend to think that using errno all
>> the time will bloat our code, causing TLS accesses when not necessary.
>> And I fear that on systems lacking TLS support (such as Android), they
>> might simply use a shared variable modified concurrently across threads,
>> which could be racy. So I would prefer to use the return value of the
>> functions to trigger the error, and use the errno value only to print
>> the error message in more details. Otherwise, I'm worried that we might
>> end up either missing errors or triggering on a false-positive from the
>> wrong thread.
> Please read the man page for strtoul. Functions such as this one don't
> return anything to indicate errors, errno is the only way to check.
It returns ULONG_MAX, which is, I have to admit, a valid value, so
cannot be used to check for errors. Good point!
>
>>
>>> + goto out_err;
>>> + }
>>> +
>>> + /* This should be the time-stamp */
>>> + pid_m_time_str = strtok_r(NULL, ".",&saveptr);
>>> + if (!pid_m_time_str) {
>>> + goto out_err;
>>> + }
>>> +
>>> + errno = 0;
>>> + *time = strtoul(pid_m_time_str, NULL, 10);
>>> + if (errno) {
>>> + goto out_err;
>>> + }
>>> +
>>> + return 0;
>>> +
>>> +out_err:
>>> + free(sock_basename);
>>> + return -1;
>>> +}
>>> +
>>> +time_t ustcomm_pid_st_mtime(pid_t pid)
>>> +{
>>> + struct stat proc_stat;
>>> + char proc_name[PATH_MAX + 1];
>>> +
>>> + if (snprintf(proc_name, PATH_MAX, "/proc/%ld", (long)pid)< 0) {
>>> + return 0;
>>> + }
>>> +
>>> + if (stat(proc_name,&proc_stat)) {
>>> + return 0;
>>> + }
>>> +
>>> + return proc_stat.st_mtime;
>>> +}
>>> +
>>> +int ustcomm_is_socket_live(char *sock_name, pid_t *read_pid)
>>> +{
>>> + time_t time_from_pid;
>>> + unsigned long time_from_sock;
>>> + pid_t pid;
>>> +
>>> + if (time_and_pid_from_socket_name(sock_name,&time_from_sock,&pid)) {
>>> + return 0;
>>> + }
>>> +
>>> + if (read_pid) {
>>> + *read_pid = pid;
>>> + }
>>> +
>>> + time_from_pid = ustcomm_pid_st_mtime(pid);
>>> + if (!time_from_pid) {
>>> + return 0;
>>> + }
>>> +
>>> + if (((unsigned long)time_from_pid) == time_from_sock) {
>>
>> Please use: (unsigned long) time_from_pid (with the space) (typical in
>> kernel coding style, which we try to follow).
>
> Done, though as usual this isn't consistent with already existing code.
Yeah.. I'll probably do a cleanup at some point this summer.
>>
>>> + return 1;
>>> + }
>>> +
>>> + return 0;
>>> +}
>>> +
>>> +static int ustcomm_get_sock_name(char *dir_name, pid_t pid, char *sock_name)
>>> +{
>>> + struct dirent *dirent;
>>> + char sock_path_base[101];
>>> + int len;
>>> + DIR *dir = opendir(dir_name);
>>> +
>>> + snprintf(sock_path_base, 100, "%ld.", (long)pid);
>>> + len = strlen(sock_path_base);
>>> +
>>> + while ((dirent = readdir(dir))) {
>>> + if (!strcmp(dirent->d_name, ".") ||
>>> + !strcmp(dirent->d_name, "..") ||
>>> + !strcmp(dirent->d_name, "ust-consumer") ||
>>> + dirent->d_type == DT_DIR ||
>>> + strncmp(dirent->d_name, sock_path_base, len)) {
>>> + continue;
>>> + }
>>> +
>>> + if (ustcomm_is_socket_live(dirent->d_name, NULL)) {
>>> + printf("found %s\n", dirent->d_name);
>>
>> Should this message be only printed in verbose mode ?
>
> Darn, debug! :-) Will fix the rest.
Thanks!
Mathieu
>
> /Nils
>
--
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com
More information about the lttng-dev
mailing list