[ltt-dev] [UST PATCH] remove duplicate return
David Goulet
david.goulet at polymtl.ca
Thu Sep 9 13:05:48 EDT 2010
On 10-09-09 12:38 PM, Mathieu Desnoyers wrote:
> * Pierre-Marc Fournier (pierre-marc.fournier at polymtl.ca) wrote:
>> On 09/07/2010 11:43 AM, Mathieu Desnoyers wrote:
>>> * Pierre-Marc Fournier (pierre-marc.fournier at polymtl.ca) wrote:
>>>> On 09/06/2010 11:29 AM, Mathieu Desnoyers wrote:
>>>>> * Mathieu Desnoyers (compudj at krystal.dyndns.org) wrote:
>>>>>> * Pierre-Marc Fournier (pierre-marc.fournier at polymtl.ca) wrote:
>>>>>>> I disagree with you Mathieu. These retvals are the same as i/o
>>>>>>> syscalls (read/write/send/recv/...)and therefore should in my opinion
>>>>>>> remain as is.
>>>>>>
>>>>>> Well, this function is not technically the same as i/o syscalls at all.
>>>>>> It uses I/O syscalls, but it is not an I/O syscall per se, so the return
>>>>>> value transformation to a more standard pattern (neg err val, 0 ok)
>>>>>> should happen right in this function rather than to let all callers
>>>>>> handle this. I/O syscalls use positive return values to indicate the
>>>>>> number of bytes read/written/etc. Here, this function arbitrarily choose
>>>>>> 1 to indicate that "something has been sent" without caring about the
>>>>>> amount of data moved at all.
>>>>>>
>>>>>> So as it doesn't need the whole positive range to spell out the amount
>>>>>> of data moved, it doesn't need to do the same special-cases that the I/O
>>>>>> syscalls are doing. It adds a lot of error values management oddness
>>>>>> without adding anything.
>>>>>>
>>>>>> So even though I agree with you that this function is close to the I/O
>>>>>> system calls because it calls it, it is very far from the I/O syscalls
>>>>>> semantically (we don't care about the number of bytes written), and even
>>>>>> though we might be tempted to use the same error values as system calls
>>>>>> for them, the fact that we just don't care about the number of bytes
>>>>>> written combined with the fact that standardizing error value across the
>>>>>> code makes it much easier to follow and to write just call for this
>>>>>> change.
>>>>>
>>>>> By the way, looking at include/share.h:patient_write(), in the case
>>>>> where write returns 0, I think we should consider this as a success and
>>>>> loop again to retry write rather than consider that an error occurred.
>>>>> The same apply to patient_send(). See the manpages for details:
>>>>>
>>>>> write(2):
>>>>>
>>>>> RETURN VALUE
>>>>> On success, the number of bytes written is returned (zero indicates
>>>>> nothing was written). On error, -1 is returned, and errno is set
>>>>> appropriately.
>>>>>
>>>>> If count is zero and fd refers to a regular file, then write() may
>>>>> return a failure status if one of the errors below is detected. If no
>>>>> errors are detected, 0 will be returned without causing any other
>>>>> effect. If count is zero and fd refers to a file other than a regular
>>>>> file, the results are not specified.
>>>>>
>>>>> send(2):
>>>>> RETURN VALUE
>>>>> On success, these calls return the number of characters sent. On
>>>>> error, -1 is returned, and errno is set appropriately.
>>>>>
>>>>
>>>> No. 0 indicates end of stream
>>>
>>> By end of stream, you mean this sequence ?
>>>
>>> server:
>>> (block SIGPIPE)
>>> sockwrite = socket(AF_UNIX, SOCK_STREAM, 0);
>>> addr.sun_family = AF_UNIX;
>>> strcpy(addr.sun_path, "./mysocket");
>>>
>>> /* wait for client to create the unix socket */
>>> sleep(2);
>>>
>>> connect(sockwrite, (struct sockaddr *)&addr, sizeof(addr));)
>>>
>>> /* let's do one write which will be consumed by the client */
>>> write(sockwrite, "blah", sizeof("blah"));
>>>
>>> /* wait for client to close FD */
>>> sleep(2);
>>>
>>> ret = write(sockwrite, "blah", sizeof("blah"));
>>> ( or even with ret = send(sockwrite, "blah", sizeof("blah"), MSG_NOSIGNAL); )
>>> printf("write ret %ld errno %ld\n", ret, errno);
>>>
>>>
>>> client:
>>> (block SIGPIPE)
>>> sockserv = socket(AF_UNIX, SOCK_STREAM, 0);
>>> addr.sun_family = AF_UNIX;
>>> strcpy(addr.sun_path, "./mysocket");
>>> bind(sockserv, (struct sockaddr *)&addr, sizeof(addr));
>>> listen(sockserv, 1);
>>> sockread = accept(sockserv, NULL, NULL);
>>>
>>> recv(sockread, buf, sizeof(buf), 0);
>>>
>>> close(sockread);
>>>
>>> The server printf returns:
>>>
>>> "write ret -1 errno 32"
>>>
>>> So the return value is -1, with an error number: EPIPE: Broken pipe
>>>
>>>> and looping on it will result in an
>>>> infinite loop. You need to refer to the specific backend driver to
>>>> understand these specific semantics. The read/write manpages are
>>>> notorious for their non-clarity about this.
>>>
>>> This is what I just did by creating my small test program. Unless I am
>>> misunderstanding your definition of "end of stream", your assumptions
>>> about the way the unix sockets work seem incorrect.
>>
>> Well, just that one assumption. But I accept your explanation and my
>> foolishness. ;-) patient_write() and patient_send() need to be fixed as
>> you describe.
>
> OK, is Nils or David willing to take the ball ?
>
It's a time factor on my part. I don't think this is quite urgent so
I'll try to make it as soon as I can. I know that Nils is quite busy on
Trace events so I can do that to offload him.
About that, while doing this, I propose we just pass over UST code to
correct and standardize the error handling and integrate your
proposition below systematically. I came across a lot of code in ustctl,
ustcmd and ustcomm that was not quite following the error "convention"
of UST. Also, as I mention in an early post, the UST "standard" error
(usterr.h) are positive value so we might also refactor that.
However, this might modify a lot of code so mistakes or missing part can
easily be forgotten. It will be very important that a careful review be
done. I'm willing to do that in a near future but, as I said, time
factor on my part will delay this patch set.
Thanks
David
>>
>>>
>>> Note that as the manpage says, if the file descriptor refers to a
>>> regular file, then write() can return 0, with an error number set, which
>>> could indicate that an error occured. If the file descriptor refers to a
>>> file other than a regular file, the results are unspecified. So I think
>>> the sane way to handle this would be to check if errno is 0 when write
>>> returns 0. If errno != 0, then we have an error, but if errno == 0, I'd
>>> consider that 0 bytes were written and retry.
>>>
>>>
>>> By the way, you forgot to reply to my first message above which covers
>>> the topic of semantics, so I'm re-pasting it here.
>>>
>>> (about ustcomm_send_request())
>>>
>>> Well, this function is not technically the same as i/o syscalls at all.
>>> It uses I/O syscalls, but it is not an I/O syscall per se, so the return
>>> value transformation to a more standard pattern (neg err val, 0 ok)
>>> should happen right in this function rather than to let all callers
>>> handle this. I/O syscalls use positive return values to indicate the
>>> number of bytes read/written/etc. Here, this function arbitrarily choose
>>> 1 to indicate that "something has been sent" without caring about the
>>> amount of data moved at all.
>>
>> Let me copy the description of the return values here.
>>
>> /*
>> * Return value:
>> * 0: Success, but no reply because recv() returned 0
>> * 1: Success
>> * -1: Error
>> *
>> * On error, the error message is printed, except on
>> * ECONNRESET, which is normal when the application dies.
>> */
>>
>>>
>>> So as it doesn't need the whole positive range to spell out the amount
>>> of data moved, it doesn't need to do the same special-cases that the I/O
>>> syscalls are doing. It adds a lot of error values management oddness
>>> without adding anything.
>>
>> There are 3 cases the caller must consider when calling ustcomm_send_msg().
>> - An error occurred, there is a problem and it is necessary to complain (-1)
>> - The remote connection was closed so we were not able to send the
>> message (0)
>> - Everything worked fine (1)
>>
>> Callers need to know if the connection was closed in order not to
>> complain but at the same time take consequent measures.
>>
>>>
>>> So even though I agree with you that this function is close to the I/O
>>> system calls because it calls it, it is very far from the I/O syscalls
>>> semantically (we don't care about the number of bytes written), and even
>>> though we might be tempted to use the same error values as system calls
>>> for them, the fact that we just don't care about the number of bytes
>>> written combined with the fact that standardizing error value across the
>>> code makes it much easier to follow and to write just call for this
>>> change.
>>
>> If you want to propose different semantics that still allow the caller
>> to know which of the 3 cases happened, I'm all open.
>
> Yep, this is what I am proposing:
>
> return 0: everything is OK
> return -EPIPE: remote connection was closed, so we were unable to send
> the message.
> return -EWHATEVER (other than -EPIPE): An error occured.
>
> So the caller can deal with -EPIPE as it wants.
>
> How does that sound ?
>
> Thanks,
>
> Mathieu
>
>>
>> Thanks
>>
>> pmf
>>
>
--
David Goulet
LTTng project, DORSAL Lab.
PGP/GPG : 1024D/16BD8563
BE3C 672B 9331 9796 291A 14C6 4AF7 C14B 16BD 8563
More information about the lttng-dev
mailing list