[lttng-dev] Is it possible to disable recording the CPU id for USTs?
Kienan Stewart
kstewart at efficios.com
Tue Jun 18 11:30:28 EDT 2024
Hi Aditya,
I want to add a further detail.
While neither arm32 nor arm64 supports quick lookups of the
current CPU id in the vDSO, I think that if you are using a recent Linux
kernel (>= 4.18) with a recent glibc (>= 2.32 with RSEQ_SIG defined),
sched_getcpu should use RSEQ instead, and it is much faster than the getcpu
syscall.
Cf.
https://github.com/bminor/glibc/commit/6e29cb3f61ff5432c78a1c84b0d9b123a350ab36
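
To check which path is taken on a given board, a rough sketch along these lines compares the cost of glibc's sched_getcpu() with the raw getcpu system call. The helper names (cpu_via_glibc, cpu_via_syscall, ns_per_call) are made up for this example and are not part of LTTng or glibc, and the figures it produces are only indicative.

```c
#define _GNU_SOURCE       /* must come first: exposes sched_getcpu() */
#include <sched.h>        /* sched_getcpu() */
#include <stddef.h>       /* NULL */
#include <sys/syscall.h>  /* SYS_getcpu */
#include <time.h>         /* clock_gettime() */
#include <unistd.h>       /* syscall() */

/* Current CPU via glibc, which may use a fast path where one exists. */
static int cpu_via_glibc(void)
{
	return sched_getcpu();
}

/* Current CPU via the raw getcpu system call, bypassing glibc's wrapper. */
static int cpu_via_syscall(void)
{
	unsigned int cpu = 0;

	if (syscall(SYS_getcpu, &cpu, NULL, NULL) != 0)
		return -1;
	return (int) cpu;
}

/* Average cost of one call to fn(), in nanoseconds, over `iterations` calls. */
static double ns_per_call(int (*fn)(void), long iterations)
{
	struct timespec start, end;
	volatile int sink = 0;	/* keeps the calls from being optimised away */

	if (iterations <= 0)
		return 0.0;

	clock_gettime(CLOCK_MONOTONIC, &start);
	for (long i = 0; i < iterations; i++)
		sink += fn();
	clock_gettime(CLOCK_MONOTONIC, &end);
	(void) sink;

	return ((double) (end.tv_sec - start.tv_sec) * 1e9 +
		(double) (end.tv_nsec - start.tv_nsec)) / (double) iterations;
}
```

Calling ns_per_call(cpu_via_glibc, 1000000) and ns_per_call(cpu_via_syscall, 1000000) from a small main() and printing both results should indicate whether sched_getcpu() avoids the syscall on a given kernel/glibc combination: if the two figures are close, it most likely does not.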
thanks,
kienan
On 6/18/24 10:00 AM, Kienan Stewart via lttng-dev wrote:
> Hi Aditya,
>
> On 6/18/24 8:55 AM, Aditya Kurdunkar via lttng-dev wrote:
>> Hello,
>>
>> Please bear with me if this is a naive question. I am working on an
>> embedded ARM chip (1GB ram, 2CPUs) where I want to collect trace
>> events for a long duration of time. From the research that I have done
>> (mostly reading papers on LTTng tracing, conference talks and
>> documentation) I have seen it mentioned that for ARM the overhead is
>> greater because the system call to get the CPU is quite slow. In my
>> use case I am okay with not having this information. The current
>> benchmarks show a 3 microsecond overhead of a single tracepoint on ARM
>> in comparison to
>
> Is there a specific detail that leads you to believe that getcpu is
> taking the bulk of the time?
>
> Is the performance of your embedded ARM chip at all similar to that of
> the x86_64 system you are comparing it with?
>
> Regardless, I think this comparison may be misleading. It sounds like what
> you want to measure is the time + resources required to run your
> application with and without tracing on the same platform, rather than
> comparing two dissimilar platforms?
>
> Please note that the UST overhead (e.g. spawn an application, launch the
> UST thread, connect to the sessiond, transfer configuration and buffer
> pointers) is comparatively large for a single event, rather than over
> the course of a 'long' running application.
>
> The default behaviour is to block main program execution until the
> registration completes or times out. In many cases you may want to
> disable that timeout for quicker startup at the cost of potentially
> losing event(s) right at the beginning. C.f. LTTNG_UST_REGISTER_TIMEOUT
> in https://lttng.org/man/3/lttng-ust/v2.13/#doc-_environment_variables
>
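The variable has to be in the environment before the traced application starts, since lttng-ust performs its registration from a library constructor, before main() runs. From a shell this is just a matter of prefixing the command with LTTNG_UST_REGISTER_TIMEOUT=0; when the application is spawned from another C program, a minimal sketch could look like the following. The helper names set_register_timeout and exec_traced_app are hypothetical and not part of LTTng.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>    /* perror() */
#include <stdlib.h>   /* setenv() */
#include <unistd.h>   /* execvp() */

/*
 * Set LTTNG_UST_REGISTER_TIMEOUT for this process and its children.
 * The value is in milliseconds; per lttng-ust(3), "0" means do not wait.
 */
static int set_register_timeout(const char *timeout_ms)
{
	return setenv("LTTNG_UST_REGISTER_TIMEOUT", timeout_ms, 1);
}

/*
 * Hypothetical launcher helper: replace the current process with the
 * traced application described by argv, with the timeout already set.
 * Only returns (with -1) if something failed.
 */
static int exec_traced_app(char *const argv[], const char *timeout_ms)
{
	if (set_register_timeout(timeout_ms) != 0) {
		perror("setenv");
		return -1;
	}
	execvp(argv[0], argv);
	perror("execvp");	/* reached only when execvp() fails */
	return -1;
}
```

Calling setenv() from the traced application's own main() would be too late, which is why the sketch sets the variable in the parent before the exec.
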
>> 150ns on a x86 machine. Hence, my question is: Is it possible to
>> disable recording the CPU somehow? Any suggestions for decreasing the
>> overhead other than this are welcome.
>
> It is always enabled, cf.
> https://lttng.org/man/3/lttng-ust/v2.13/#doc-_context_information
>
> However, I suppose you could try to use a custom getcpu plugin, e.g.
> https://github.com/lttng/lttng-ust/tree/master/doc/examples/getcpu-override to return a dummy value.
>
> If you detailed your tracing and benchmark setup it might be possible to
> provide additional guidance.
>
>>
>> Regards,
>> Aditya
>>
>> _______________________________________________
>> lttng-dev mailing list
>> lttng-dev at lists.lttng.org
>> https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev
>
> thanks,
> kienan