[lttng-dev] [MODULES RFC PATCH] Extract the bitmask of FDs set in select syscall
Mathieu Desnoyers
mathieu.desnoyers at efficios.com
Fri Oct 2 20:04:45 EDT 2015
----- On Oct 2, 2015, at 7:58 PM, Julien Desfossez jdesfossez at efficios.com wrote:
> On 02-Oct-2015 11:49:04 PM, Mathieu Desnoyers wrote:
>> ----- On Oct 2, 2015, at 7:01 PM, Julien Desfossez jdesfossez at efficios.com
>> wrote:
>>
>> > Instead of extracting the user-space pointers of the 3 fd_set, we now
>> > extract the bitmask of the FDs in the sets (in, out, ex) in the form of
>> > an array of unsigned long (1024 FDs is the limit in the kernel).
>> >
>> > In this example, we select on input FDs 3, 5 and 10 (0x428); the call
>> > returns that one FD is ready: FD 10 (0x400).
>> >
>> > syscall_entry_select: { n = 11,
>> > _fdset_in_length = 1, fdset_in = [ [0] = 0x428 ],
>> > _fdset_out_length = 1, fdset_out = [ [0] = 0x0 ],
>> > _fdset_ex_length = 0, fdset_ex = [ ],
>> > tvp = 0
>> > }
>> > syscall_exit_select: { ret = 1,
>> > _fdset_in_length = 1, fdset_in = [ [0] = 0x400 ],
>> > _fdset_out_length = 1, fdset_out = [ [0] = 0x0 ],
>> > _fdset_ex_length = 0, fdset_ex = [ ],
>> > tvp = 0
>> > }
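
For readers following the example above, here is a minimal user-space sketch of how FD numbers map to bits in an fd_set-style array of unsigned long. The helper names fdmask_set() and fdmask_isset() are hypothetical, chosen for illustration; they are not part of the patch or of LTTng.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Number of FD bits stored in each unsigned long word. */
#define FDMASK_BITS_PER_WORD (CHAR_BIT * sizeof(unsigned long))

/* Hypothetical helper: set the bit for fd in an array of words. */
static void fdmask_set(unsigned long *words, int fd)
{
	assert(fd >= 0);
	words[(size_t)fd / FDMASK_BITS_PER_WORD] |=
		1UL << ((size_t)fd % FDMASK_BITS_PER_WORD);
}

/* Hypothetical helper: return 1 if the bit for fd is set, else 0. */
static int fdmask_isset(const unsigned long *words, int fd)
{
	assert(fd >= 0);
	return (int)((words[(size_t)fd / FDMASK_BITS_PER_WORD] >>
		((size_t)fd % FDMASK_BITS_PER_WORD)) & 1UL);
}
```

Setting FDs 3, 5 and 10 in a zeroed word gives 8 + 32 + 1024 = 0x428, which matches the fdset_in value at syscall entry; bit 10 alone is 0x400, the value seen at exit.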
>> >
>> > Signed-off-by: Julien Desfossez <jdesfossez at efficios.com>
>> > ---
>> > .../x86-64-syscalls-3.10.0-rc7_pointers_override.h | 56 ++++++++++++++++++++++
>> > 1 file changed, 56 insertions(+)
>> >
>> > diff --git
>> > a/instrumentation/syscalls/headers/x86-64-syscalls-3.10.0-rc7_pointers_override.h
>> > b/instrumentation/syscalls/headers/x86-64-syscalls-3.10.0-rc7_pointers_override.h
>> > index 702cfb5..23ffbdd 100644
>> > ---
>> > a/instrumentation/syscalls/headers/x86-64-syscalls-3.10.0-rc7_pointers_override.h
>> > +++
>> > b/instrumentation/syscalls/headers/x86-64-syscalls-3.10.0-rc7_pointers_override.h
>> > @@ -115,6 +115,62 @@ SC_LTTNG_TRACEPOINT_EVENT(pipe,
>> > )
>> > )
>> >
>> > +#define OVERRIDE_64_select
>> > +SC_LTTNG_TRACEPOINT_EVENT_CODE(select,
>> > + TP_PROTO(sc_exit(long ret,) int n, fd_set __user * inp, fd_set __user * outp,
>> > + fd_set __user * exp, struct timeval * tvp),
>> > + TP_ARGS(sc_exit(ret,) n, inp, outp, exp, tvp),
>> > + TP_locvar(
>> > + unsigned long fds_in[__FD_SETSIZE / (8 * sizeof(long))],
>> > + fds_out[__FD_SETSIZE / (8 * sizeof(long))],
>> > + fds_ex[__FD_SETSIZE / (8 * sizeof(long))];
>>
>> I expect this to break apart on kernels with 4kB stack configuration.
>>
>> How much stack does this use? We might want to consider
>> temporarily allocating memory for this. This is OK since we know
>> we are in a syscall context.
> Indeed, that's the main reason it is an RFC patch.
> __FD_SETSIZE == 1024, so it allocates 3 * 8 * (1024/(8*8)) = 384 bytes.
> Is it too much?
>
> If you prefer an alloc/free, I'll replace that.
The "free" is not necessarily easy to implement, because we might
need to add a new "TP_code_after()" holding code to be executed
after the data serialization.

384 bytes is not as bad as I initially anticipated. This could be
OK, especially since we are in a syscall context, and therefore
the kernel stack is nearly empty when we are called.

You might want to try it under a few loads on a kernel configured with
4k stacks, and with kernel hacking options to track stack overflow
and usage, just to be on the safe side.

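
As a cross-check of the 384-byte figure discussed above, the arithmetic can be sketched in plain user-space C. The function names below are hypothetical and only illustrate the computation; they are not part of the patch.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Words of unsigned long needed to hold one bit per descriptor. */
static size_t fdset_words(size_t nfds)
{
	size_t bits_per_long = CHAR_BIT * sizeof(unsigned long);

	return (nfds + bits_per_long - 1) / bits_per_long;
}

/* Bytes taken by nsets bitmask arrays, each sized for nfds descriptors. */
static size_t fdset_stack_bytes(size_t nfds, size_t nsets)
{
	return nsets * fdset_words(nfds) * sizeof(unsigned long);
}

/* Sanity check against the figure quoted in the thread. */
static void check_thread_figure(void)
{
	/* 3 sets (in, out, ex) sized for 1024 descriptors. */
	assert(fdset_stack_bytes(1024, 3) == 384);
}
```

With 1024 descriptors and three sets this gives 384 bytes whether long is 4 or 8 bytes wide, since each set always needs 1024 bits. That is a little under a tenth of a 4096-byte stack, before counting the three int counters or any other frame overhead.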
Thanks,
Mathieu
>
> Thanks,
>
> Julien
>
>
>>
>> Thanks,
>>
>> Mathieu
>>
>> > + int nb_in, nb_out, nb_ex;
>> > + ),
>> > + TP_code(
>> > + sc_inout(
>> > + {
>> > + unsigned long nr;
>> > + int ret;
>> > +
>> > + nr = FDS_BYTES(n);
>> > + tp_locvar->nb_in = 0;
>> > + tp_locvar->nb_out = 0;
>> > + tp_locvar->nb_ex = 0;
>> > + if (inp) {
>> > + ret = copy_from_user(tp_locvar->fds_in, inp, nr);
>> > + /* copy_from_user() returns the number of bytes left uncopied, never a negative value. */
>> > + if (ret)
>> > + goto skip_code;
>> > + tp_locvar->nb_in = nr/sizeof(long);
>> > + }
>> > + if (outp) {
>> > + ret = copy_from_user(tp_locvar->fds_out, outp, nr);
>> > + if (ret)
>> > + goto skip_code;
>> > + tp_locvar->nb_out = nr/sizeof(long);
>> > + }
>> > + if (exp) {
>> > + ret = copy_from_user(tp_locvar->fds_ex, exp, nr);
>> > + if (ret)
>> > + goto skip_code;
>> > + tp_locvar->nb_ex = nr/sizeof(long);
>> > + }
>> > + }
>> > + skip_code:
>> > + )
>> > + ),
>> > + TP_FIELDS(
>> > + sc_exit(ctf_integer(long, ret, ret))
>> > + sc_in(ctf_integer(int, n, n))
>> > + sc_inout(ctf_sequence_hex(unsigned long, fdset_in,
>> > + &tp_locvar->fds_in, unsigned long, tp_locvar->nb_in))
>> > + sc_inout(ctf_sequence_hex(unsigned long, fdset_out,
>> > + &tp_locvar->fds_out, unsigned long, tp_locvar->nb_out))
>> > + sc_inout(ctf_sequence_hex(unsigned long, fdset_ex,
>> > + &tp_locvar->fds_ex, unsigned long, tp_locvar->nb_ex))
>> > + sc_inout(ctf_integer(struct timeval *, tvp, tvp))
>> > + )
>> > +)
>> > +
>> > #else /* CREATE_SYSCALL_TABLE */
>> >
>> > #define OVERRIDE_TABLE_64_clone
>> > --
>> > 1.9.1
>>
>> --
>> Mathieu Desnoyers
>> EfficiOS Inc.
>> http://www.efficios.com
--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com