[ltt-dev] [UST PATCH] remove duplicate return

Thu Sep 9 13:05:48 EDT 2010

On 10-09-09 12:38 PM, Mathieu Desnoyers wrote:
> * Pierre-Marc Fournier (pierre-marc.fournier at polymtl.ca) wrote:
>> On 09/07/2010 11:43 AM, Mathieu Desnoyers wrote:
>>> * Pierre-Marc Fournier (pierre-marc.fournier at polymtl.ca) wrote:
>>>> On 09/06/2010 11:29 AM, Mathieu Desnoyers wrote:
>>>>> * Mathieu Desnoyers (compudj at krystal.dyndns.org) wrote:
>>>>>> * Pierre-Marc Fournier (pierre-marc.fournier at polymtl.ca) wrote:
>>>>>>> I disagree with you Mathieu. These retvals are the same as i/o
>>>>>>> syscalls (read/write/send/recv/...)and therefore should in my opinion
>>>>>>> remain as is.
>>>>>>
>>>>>> Well, this function is not technically the same as i/o syscalls at all.
>>>>>> It uses I/O syscalls, but it is not an I/O syscall per se, so the return
>>>>>> value transformation to a more standard pattern (neg err val, 0 ok)
>>>>>> should happen right in this function rather than to let all callers
>>>>>> handle this. I/O syscalls use positive return values to indicate the
>>>>>> number of bytes read/written/etc. Here, this function arbitrarily choose
>>>>>> 1 to indicate that "something has been sent" without caring about the
>>>>>> amount of data moved at all.
>>>>>>
>>>>>> So as it doesn't need the whole positive range to spell out the amount
>>>>>> of data moved, it doesn't need to do the same special-cases that the I/O
>>>>>> syscalls are doing. It adds a lot of error values management oddness
>>>>>> without adding anything.
>>>>>>
>>>>>> So even though I agree with you that this function is close to the I/O
>>>>>> system calls because it calls it, it is very far from the I/O syscalls
>>>>>> semantically (we don't care about the number of bytes written), and even
>>>>>> though we might be tempted to use the same error values as system calls
>>>>>> for them, the fact that we just don't care about the number of bytes
>>>>>> written combined with the fact that standardizing error value across the
>>>>>> code makes it much easier to follow and to write just call for this
>>>>>> change.
>>>>>
>>>>> By the way, looking at include/share.h:patient_write(), in the case
>>>>> where write returns 0, I think we should consider this as a success and
>>>>> loop again to retry write rather than consider that an error occurred.
>>>>> The same apply to patient_send(). See the manpages for details:
>>>>>
>>>>> write(2):
>>>>>
>>>>> RETURN VALUE
>>>>>           On  success,  the  number  of bytes written is returned (zero indicates
>>>>>           nothing was written).  On error, -1  is  returned,  and  errno  is  set
>>>>>           appropriately.
>>>>>
>>>>>           If  count  is  zero  and  fd refers to a regular file, then write() may
>>>>>           return a failure status if one of the errors below is detected.  If  no
>>>>>           errors  are  detected,  0  will  be  returned without causing any other
>>>>>           effect.  If count is zero and fd refers to a file other than a  regular
>>>>>           file, the results are not specified.
>>>>>
>>>>> send(2):
>>>>> RETURN VALUE
>>>>>           On success, these calls return  the  number  of  characters sent.   On
>>>>>           error, -1 is returned, and errno is set appropriately.
>>>>>
>>>>
>>>> No. 0 indicates end of stream
>>>
>>> By end of stream, you mean this sequence ?
>>>
>>> server:
>>>     (block SIGPIPE)
>>>     sockwrite = socket(AF_UNIX, SOCK_STREAM, 0);
>>>     addr.sun_family = AF_UNIX;
>>>     strcpy(addr.sun_path, "./mysocket");
>>>
>>>     /* wait for client to create the unix socket */
>>>     sleep(2);
>>>
>>>     connect(sockwrite, (struct sockaddr *)&addr, sizeof(addr));)
>>>
>>>     /* let's do one write which will be consumed by the client */
>>>     write(sockwrite, "blah", sizeof("blah"));
>>>
>>>     /* wait for client to close FD */
>>>     sleep(2);
>>>
>>>     ret = write(sockwrite, "blah", sizeof("blah"));
>>>     ( or even with ret = send(sockwrite, "blah", sizeof("blah"), MSG_NOSIGNAL); )
>>>     printf("write ret %ld errno %ld\n", ret, errno);
>>>
>>>
>>> client:
>>>     (block SIGPIPE)
>>>     sockserv = socket(AF_UNIX, SOCK_STREAM, 0);
>>>     addr.sun_family = AF_UNIX;
>>>     strcpy(addr.sun_path, "./mysocket");
>>>     bind(sockserv, (struct sockaddr *)&addr, sizeof(addr));
>>>     listen(sockserv, 1);
>>>     sockread = accept(sockserv, NULL, NULL);
>>>
>>>     recv(sockread, buf, sizeof(buf), 0);
>>>
>>>     close(sockread);
>>>
>>> The server printf returns:
>>>
>>>    "write ret -1 errno 32"
>>>
>>> So the return value is -1, with an error number: EPIPE: Broken pipe
>>>
>>>> and looping on it will result in an
>>>> infinite loop. You need to refer to the specific backend driver to
>>>> understand these specific semantics. The read/write manpages are
>>>> notorious for their non-clarity about this.
>>>
>>> This is what I just did by creating my small test program. Unless I am
>>> misunderstanding your definition of "end of stream", your assumptions
>>> about the way the unix sockets work seem incorrect.
>>
>> Well, just that one assumption. But I accept your explanation and my
>> foolishness. ;-) patient_write() and patient_send() need to be fixed as
>> you describe.
>
> OK, is Nils or David willing to take the ball ?
>

It's a time factor on my part. I don't think this is quite urgent so 
I'll try to make it as soon as I can. I know that Nils is quite busy on 
Trace events so I can do that to offload him.

About that, while doing this, I propose we just pass over UST code to 
correct and standardize the error handling and integrate your 
proposition below systematically. I came across a lot of code in ustctl, 
ustcmd and ustcomm that was not quite following the error "convention" 
of UST. Also, as I mention in an early post, the UST "standard" error 
(usterr.h) are positive value so we might also refactor that.

However, this might modify a lot of code so mistakes or missing part can 
easily be forgotten. It will be very important that a careful review be 
done. I'm willing to do that in a near future but, as I said, time 
factor on my part will delay this patch set.

Thanks
David

>>
>>>
>>> Note that as the manpage says, if the file descriptor refers to a
>>> regular file, then write() can return 0, with an error number set, which
>>> could indicate that an error occured. If the file descriptor refers to a
>>> file other than a regular file, the results are unspecified. So I think
>>> the sane way to handle this would be to check if errno is 0 when write
>>> returns 0. If errno != 0, then we have an error, but if errno == 0, I'd
>>> consider that 0 bytes were written and retry.
>>>
>>>
>>> By the way, you forgot to reply to my first message above which covers
>>> the topic of semantics, so I'm re-pasting it here.
>>>
>>> (about ustcomm_send_request())
>>>
>>> Well, this function is not technically the same as i/o syscalls at all.
>>> It uses I/O syscalls, but it is not an I/O syscall per se, so the return
>>> value transformation to a more standard pattern (neg err val, 0 ok)
>>> should happen right in this function rather than to let all callers
>>> handle this. I/O syscalls use positive return values to indicate the
>>> number of bytes read/written/etc. Here, this function arbitrarily choose
>>> 1 to indicate that "something has been sent" without caring about the
>>> amount of data moved at all.
>>
>> Let me copy the description of the return values here.
>>
>> /*
>>   * Return value:
>>   *   0: Success, but no reply because recv() returned 0
>>   *   1: Success
>>   *   -1: Error
>>   *
>>   * On error, the error message is printed, except on
>>   * ECONNRESET, which is normal when the application dies.
>>   */
>>
>>>
>>> So as it doesn't need the whole positive range to spell out the amount
>>> of data moved, it doesn't need to do the same special-cases that the I/O
>>> syscalls are doing. It adds a lot of error values management oddness
>>> without adding anything.
>>
>> There are 3 cases the caller must consider when calling ustcomm_send_msg().
>> - An error occurred, there is a problem and it is necessary to complain (-1)
>> - The remote connection was closed so we were not able to send the
>> message (0)
>> - Everything worked fine (1)
>>
>> Callers need to know if the connection was closed in order not to
>> complain but at the same time take consequent measures.
>>
>>>
>>> So even though I agree with you that this function is close to the I/O
>>> system calls because it calls it, it is very far from the I/O syscalls
>>> semantically (we don't care about the number of bytes written), and even
>>> though we might be tempted to use the same error values as system calls
>>> for them, the fact that we just don't care about the number of bytes
>>> written combined with the fact that standardizing error value across the
>>> code makes it much easier to follow and to write just call for this
>>> change.
>>
>> If you want to propose different semantics that still allow the caller
>> to know which of the 3 cases happened, I'm all open.
>
> Yep, this is what I am proposing:
>
> return 0: everything is OK
> return -EPIPE: remote connection was closed, so we were unable to send
>                 the message.
> return -EWHATEVER (other than -EPIPE): An error occured.
>
> So the caller can deal with -EPIPE as it wants.
>
> How does that sound ?
>
> Thanks,
>
> Mathieu
>
>>
>> Thanks
>>
>> pmf
>>
>

-- 
David Goulet
LTTng project, DORSAL Lab.

PGP/GPG : 1024D/16BD8563
BE3C 672B 9331 9796 291A  14C6 4AF7 C14B 16BD 8563