[ltt-dev] [UST PATCH] Make libustctl list only online pids

Mathieu Desnoyers compudj at krystal.dyndns.org
Mon May 16 11:01:28 EDT 2011


* Nils Carlson (nils.carlson at ericsson.com) wrote:
> On 05/16/2011 04:23 PM, Mathieu Desnoyers wrote:
>> * Nils Carlson (nils.carlson at ericsson.com) wrote:
>>> Previously libustctl would list all pids. This patch changes this
>>> so only online pids are listed. This is done by appending on each
>>> socket name the mtime from the proc/<pid>  directory. This way a
>>> socket can be checked to see if the appended mtime matches the
>>> mtime of the proc dir for the current pid, thus allowing us to
>>> distinguish between old and new socket files.
>>
>> Hi Nils,
>>
>> I'm OK with the approach. I have a few cosmetic comments below,
>>
>>>
>>> Signed-off-by: Nils Carlson<nils.carlson at ericsson.com>
>>> ---
>>>   libust/tracectl.c     |    9 +++-
>>>   libustcomm/ustcomm.c  |  150 ++++++++++++++++++++++++++++++++++++++++++++-----
>>>   libustcomm/ustcomm.h  |    7 ++
>>>   libustctl/libustctl.c |   18 +-----
>>>   4 files changed, 153 insertions(+), 31 deletions(-)
>>>
>>> diff --git a/libust/tracectl.c b/libust/tracectl.c
>>> index bc0a07c..fb9130c 100644
>>> --- a/libust/tracectl.c
>>> +++ b/libust/tracectl.c
>>> @@ -1228,12 +1228,19 @@ static struct ustcomm_sock * init_app_socket(int epoll_fd)
>>>   	char *dir_name, *sock_name;
>>>   	int result;
>>>   	struct ustcomm_sock *sock = NULL;
>>> +	time_t mtime;
>>>
>>>   	dir_name = ustcomm_user_sock_dir();
>>>   	if (!dir_name)
>>>   		return NULL;
>>>
>>> -	result = asprintf(&sock_name, "%s/%d", dir_name, (int)getpid());
>>> +	mtime = ustcomm_pid_st_mtime(getpid());
>>> +	if (!mtime) {
>>> +		goto free_dir_name;
>>> +	}
>>> +
>>> +	result = asprintf(&sock_name, "%s/%d.%ld", dir_name,
>>> +			  (int)getpid(), (long)mtime);
>>>   	if (result<  0) {
>>>   		ERR("string overflow allocating socket name, "
>>>   		    "UST thread bailing");
>>> diff --git a/libustcomm/ustcomm.c b/libustcomm/ustcomm.c
>>> index ed6d8f1..17bd9d9 100644
>>> --- a/libustcomm/ustcomm.c
>>> +++ b/libustcomm/ustcomm.c
>>> @@ -22,6 +22,7 @@
>>>   #include<sys/types.h>
>>>   #include<signal.h>
>>>   #include<errno.h>
>>> +#include<limits.h>
>>>   #include<sys/socket.h>
>>>   #include<sys/un.h>
>>>   #include<unistd.h>
>>> @@ -550,6 +551,125 @@ char *ustcomm_user_sock_dir(void)
>>>   	return sock_dir;
>>>   }
>>>
>>> +static int time_and_pid_from_socket_name(char *sock_name, unsigned long *time,
>>> +					 pid_t *pid)
>>> +{
>>> +	char *saveptr, *pid_m_time_str;
>>> +	char *sock_basename = strdup(basename(sock_name));
>>> +
>>> +	if (!sock_basename) {
>>> +		return -1;
>>> +	}
>>> +
>>> +	/* This is the pid */
>>> +	pid_m_time_str = strtok_r(sock_basename, ".",&saveptr);
>>> +	if (!pid_m_time_str) {
>>> +		goto out_err;
>>> +	}
>>> +
>>> +	errno = 0;
>>> +	*pid = (pid_t)strtoul(pid_m_time_str, NULL, 10);
>>> +	if (errno) {
>>
>> Errno is a TLS variable, while the value returned by strtoul is directly
>> accessible from registers. So I would tend to think that using errno all
>> the time will bloat our code, causing TLS accesses when not necessary.
>> And I fear that on systems lacking TLS support (such as Android), they
>> might simply use a shared variable modified concurrently across threads,
>> which could be racy. So I would prefer to use the return value of the
>> functions to trigger the error, and use the errno value only to print
>> the error message in more details. Otherwise, I'm worried that we might
>> end up either missing errors or triggering on a false-positive from the
>> wrong thread.
> Please read the man page for strtoul. Functions such as this one don't  
> return anything to indicate errors, errno is the only way to check.

It returns ULONG_MAX, which is, I have to admit, a valid value, so
cannot be used to check for errors. Good point!

>
>>
>>> +		goto out_err;
>>> +	}
>>> +
>>> +	/* This should be the time-stamp */
>>> +	pid_m_time_str = strtok_r(NULL, ".",&saveptr);
>>> +	if (!pid_m_time_str) {
>>> +		goto out_err;
>>> +	}
>>> +
>>> +	errno = 0;
>>> +	*time = strtoul(pid_m_time_str, NULL, 10);
>>> +	if (errno) {
>>> +		goto out_err;
>>> +	}
>>> +
>>> +	return 0;
>>> +
>>> +out_err:
>>> +	free(sock_basename);
>>> +	return -1;
>>> +}
>>> +
>>> +time_t ustcomm_pid_st_mtime(pid_t pid)
>>> +{
>>> +	struct stat proc_stat;
>>> +	char proc_name[PATH_MAX + 1];
>>> +
>>> +	if (snprintf(proc_name, PATH_MAX, "/proc/%ld", (long)pid)<  0) {
>>> +		return 0;
>>> +	}
>>> +
>>> +	if (stat(proc_name,&proc_stat)) {
>>> +		return 0;
>>> +	}
>>> +
>>> +	return proc_stat.st_mtime;
>>> +}
>>> +
>>> +int ustcomm_is_socket_live(char *sock_name, pid_t *read_pid)
>>> +{
>>> +	time_t time_from_pid;
>>> +	unsigned long time_from_sock;
>>> +	pid_t pid;
>>> +
>>> +	if (time_and_pid_from_socket_name(sock_name,&time_from_sock,&pid)) {
>>> +		return 0;
>>> +	}
>>> +
>>> +	if (read_pid) {
>>> +		*read_pid = pid;
>>> +	}
>>> +
>>> +	time_from_pid = ustcomm_pid_st_mtime(pid);
>>> +	if (!time_from_pid) {
>>> +		return 0;
>>> +	}
>>> +
>>> +	if (((unsigned long)time_from_pid) == time_from_sock) {
>>
>> Please use: (unsigned long) time_from_pid  (with the space) (typical in
>> kernel coding style, which we try to follow).
>
> Done, though as usual this isn't consistent with already existing code.

Yeah.. I'll probably do a cleanup at some point this summer.

>>
>>> +		return 1;
>>> +	}
>>> +
>>> +	return 0;
>>> +}
>>> +
>>> +static int ustcomm_get_sock_name(char *dir_name, pid_t pid, char *sock_name)
>>> +{
>>> +	struct dirent *dirent;
>>> +	char sock_path_base[101];
>>> +	int len;
>>> +	DIR *dir = opendir(dir_name);
>>> +
>>> +	snprintf(sock_path_base, 100, "%ld.", (long)pid);
>>> +	len = strlen(sock_path_base);
>>> +
>>> +	while ((dirent = readdir(dir))) {
>>> +		if (!strcmp(dirent->d_name, ".") ||
>>> +		    !strcmp(dirent->d_name, "..") ||
>>> +		    !strcmp(dirent->d_name, "ust-consumer") ||
>>> +		    dirent->d_type == DT_DIR ||
>>> +		    strncmp(dirent->d_name, sock_path_base, len)) {
>>> +			continue;
>>> +		}
>>> +
>>> +		if (ustcomm_is_socket_live(dirent->d_name, NULL)) {
>>> +			printf("found %s\n", dirent->d_name);
>>
>> Should this message be only printed in verbose mode ?
>
> Darn, debug! :-) Will fix the rest.

Thanks!

Mathieu

>
> /Nils
>

-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com




More information about the lttng-dev mailing list