[ltt-dev] [UST PATCH] remove duplicate return
Pierre-Marc Fournier
pierre-marc.fournier at polymtl.ca
Thu Sep 9 03:10:27 EDT 2010
On 09/07/2010 11:43 AM, Mathieu Desnoyers wrote:
> * Pierre-Marc Fournier (pierre-marc.fournier at polymtl.ca) wrote:
>> On 09/06/2010 11:29 AM, Mathieu Desnoyers wrote:
>>> * Mathieu Desnoyers (compudj at krystal.dyndns.org) wrote:
>>>> * Pierre-Marc Fournier (pierre-marc.fournier at polymtl.ca) wrote:
>>>>> I disagree with you Mathieu. These retvals are the same as i/o
>>>>> syscalls (read/write/send/recv/...)and therefore should in my opinion
>>>>> remain as is.
>>>>
>>>> Well, this function is not technically the same as i/o syscalls at all.
>>>> It uses I/O syscalls, but it is not an I/O syscall per se, so the return
>>>> value transformation to a more standard pattern (neg err val, 0 ok)
>>>> should happen right in this function rather than to let all callers
>>>> handle this. I/O syscalls use positive return values to indicate the
>>>> number of bytes read/written/etc. Here, this function arbitrarily choose
>>>> 1 to indicate that "something has been sent" without caring about the
>>>> amount of data moved at all.
>>>>
>>>> So as it doesn't need the whole positive range to spell out the amount
>>>> of data moved, it doesn't need to do the same special-cases that the I/O
>>>> syscalls are doing. It adds a lot of error values management oddness
>>>> without adding anything.
>>>>
>>>> So even though I agree with you that this function is close to the I/O
>>>> system calls because it calls it, it is very far from the I/O syscalls
>>>> semantically (we don't care about the number of bytes written), and even
>>>> though we might be tempted to use the same error values as system calls
>>>> for them, the fact that we just don't care about the number of bytes
>>>> written combined with the fact that standardizing error value across the
>>>> code makes it much easier to follow and to write just call for this
>>>> change.
>>>
>>> By the way, looking at include/share.h:patient_write(), in the case
>>> where write returns 0, I think we should consider this as a success and
>>> loop again to retry write rather than consider that an error occurred.
>>> The same apply to patient_send(). See the manpages for details:
>>>
>>> write(2):
>>>
>>> RETURN VALUE
>>> On success, the number of bytes written is returned (zero indicates
>>> nothing was written). On error, -1 is returned, and errno is set
>>> appropriately.
>>>
>>> If count is zero and fd refers to a regular file, then write() may
>>> return a failure status if one of the errors below is detected. If no
>>> errors are detected, 0 will be returned without causing any other
>>> effect. If count is zero and fd refers to a file other than a regular
>>> file, the results are not specified.
>>>
>>> send(2):
>>> RETURN VALUE
>>> On success, these calls return the number of characters sent. On
>>> error, -1 is returned, and errno is set appropriately.
>>>
>>
>> No. 0 indicates end of stream
>
> By end of stream, you mean this sequence ?
>
> server:
> (block SIGPIPE)
> sockwrite = socket(AF_UNIX, SOCK_STREAM, 0);
> addr.sun_family = AF_UNIX;
> strcpy(addr.sun_path, "./mysocket");
>
> /* wait for client to create the unix socket */
> sleep(2);
>
> connect(sockwrite, (struct sockaddr *)&addr, sizeof(addr));)
>
> /* let's do one write which will be consumed by the client */
> write(sockwrite, "blah", sizeof("blah"));
>
> /* wait for client to close FD */
> sleep(2);
>
> ret = write(sockwrite, "blah", sizeof("blah"));
> ( or even with ret = send(sockwrite, "blah", sizeof("blah"), MSG_NOSIGNAL); )
> printf("write ret %ld errno %ld\n", ret, errno);
>
>
> client:
> (block SIGPIPE)
> sockserv = socket(AF_UNIX, SOCK_STREAM, 0);
> addr.sun_family = AF_UNIX;
> strcpy(addr.sun_path, "./mysocket");
> bind(sockserv, (struct sockaddr *)&addr, sizeof(addr));
> listen(sockserv, 1);
> sockread = accept(sockserv, NULL, NULL);
>
> recv(sockread, buf, sizeof(buf), 0);
>
> close(sockread);
>
> The server printf returns:
>
> "write ret -1 errno 32"
>
> So the return value is -1, with an error number: EPIPE: Broken pipe
>
>> and looping on it will result in an
>> infinite loop. You need to refer to the specific backend driver to
>> understand these specific semantics. The read/write manpages are
>> notorious for their non-clarity about this.
>
> This is what I just did by creating my small test program. Unless I am
> misunderstanding your definition of "end of stream", your assumptions
> about the way the unix sockets work seem incorrect.
Well, just that one assumption. But I accept your explanation and my
foolishness. ;-) patient_write() and patient_send() need to be fixed as
you describe.
>
> Note that as the manpage says, if the file descriptor refers to a
> regular file, then write() can return 0, with an error number set, which
> could indicate that an error occured. If the file descriptor refers to a
> file other than a regular file, the results are unspecified. So I think
> the sane way to handle this would be to check if errno is 0 when write
> returns 0. If errno != 0, then we have an error, but if errno == 0, I'd
> consider that 0 bytes were written and retry.
>
>
> By the way, you forgot to reply to my first message above which covers
> the topic of semantics, so I'm re-pasting it here.
>
> (about ustcomm_send_request())
>
> Well, this function is not technically the same as i/o syscalls at all.
> It uses I/O syscalls, but it is not an I/O syscall per se, so the return
> value transformation to a more standard pattern (neg err val, 0 ok)
> should happen right in this function rather than to let all callers
> handle this. I/O syscalls use positive return values to indicate the
> number of bytes read/written/etc. Here, this function arbitrarily choose
> 1 to indicate that "something has been sent" without caring about the
> amount of data moved at all.
Let me copy the description of the return values here.
/*
* Return value:
* 0: Success, but no reply because recv() returned 0
* 1: Success
* -1: Error
*
* On error, the error message is printed, except on
* ECONNRESET, which is normal when the application dies.
*/
>
> So as it doesn't need the whole positive range to spell out the amount
> of data moved, it doesn't need to do the same special-cases that the I/O
> syscalls are doing. It adds a lot of error values management oddness
> without adding anything.
There are 3 cases the caller must consider when calling ustcomm_send_msg().
- An error occurred, there is a problem and it is necessary to complain (-1)
- The remote connection was closed so we were not able to send the
message (0)
- Everything worked fine (1)
Callers need to know if the connection was closed in order not to
complain but at the same time take consequent measures.
>
> So even though I agree with you that this function is close to the I/O
> system calls because it calls it, it is very far from the I/O syscalls
> semantically (we don't care about the number of bytes written), and even
> though we might be tempted to use the same error values as system calls
> for them, the fact that we just don't care about the number of bytes
> written combined with the fact that standardizing error value across the
> code makes it much easier to follow and to write just call for this
> change.
If you want to propose different semantics that still allow the caller
to know which of the 3 cases happened, I'm all open.
Thanks
pmf
More information about the lttng-dev
mailing list