[lttng-dev] Is it possible to disable recording the CPU id for USTs?

Kienan Stewart kstewart at efficios.com
Tue Jun 18 11:30:28 EDT 2024


Hi Aditya,

I want to add a further detail.

While neither arm32 nor arm64 has support for quick lookups of the 
current CPU id in the vDSO, I think that if you are using a recent Linux 
kernel (>= 4.18) with a recent glibc (>= 2.32 with RSEQ_SIG defined), 
sched_getcpu should use RSEQ instead, and it is much faster than the 
getcpu syscall.

C.f. 
https://github.com/bminor/glibc/commit/6e29cb3f61ff5432c78a1c84b0d9b123a350ab36
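
One way to check which path your target actually takes is to time 
sched_getcpu() directly on the board. The following is only a minimal 
sketch: the helper names (now_ns, avg_sched_getcpu_ns) are made up for 
this mail, and the loop measures call overhead only, not a full 
tracepoint.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <stdint.h>
#include <time.h>

/* Monotonic clock reading in nanoseconds. */
static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t) ts.tv_sec * 1000000000ULL + (uint64_t) ts.tv_nsec;
}

/*
 * Average cost of one sched_getcpu() call over `iterations` calls, in
 * nanoseconds. Returns -1.0 if `iterations` is 0 or sched_getcpu() fails.
 * (Hypothetical helper, not part of LTTng-UST.)
 */
double avg_sched_getcpu_ns(unsigned long iterations)
{
	volatile int sink;	/* keeps the calls from being optimized away */
	uint64_t start, end;
	unsigned long i;

	if (iterations == 0 || sched_getcpu() < 0)
		return -1.0;

	start = now_ns();
	for (i = 0; i < iterations; i++)
		sink = sched_getcpu();
	end = now_ns();
	(void) sink;

	return (double) (end - start) / (double) iterations;
}
```

Calling avg_sched_getcpu_ns() with a large iteration count from a small 
test program and printing the result gives a per-call figure that you can 
compare with the per-tracepoint overhead you measured. If the figure is a 
small fraction of that overhead, getcpu is probably not the main cost.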

thanks,
kienan

On 6/18/24 10:00 AM, Kienan Stewart via lttng-dev wrote:
> Hi Aditya,
> 
> On 6/18/24 8:55 AM, Aditya Kurdunkar via lttng-dev wrote:
>> Hello,
>>
>> Please bear with me if this is a naive question. I am working on an 
>> embedded ARM chip (1GB ram, 2CPUs) where I want to collect trace 
>> events for a long duration of time. From the research that I have done 
>> (mostly reading papers on LTTng tracing, conference talks and 
>> documentation) I have seen it mentioned that for ARM the overhead is 
>> greater because the system call to get the CPU is quite slow. In my 
>> use case I am okay with not having this information. The current 
>> benchmarks show a 3 microsecond overhead of a single tracepoint on ARM 
>> in comparison to 
> 
> Is there a specific detail that leads you to believe that getcpu is 
> taking the bulk of the time?
> 
> Are your embedded ARM chip and the x86_64 system you are comparing it 
> with at all similar in performance otherwise?
> 
> Regardless, I think this comparison may be misleading. It sounds like 
> what you want to measure is the time + resources required to run your 
> application with and without tracing on the same platform, rather than 
> comparing two dissimilar platforms?
> 
> Please note that the UST startup overhead (e.g. spawn an application, 
> launch the UST thread, connect to the sessiond, transfer configuration 
> and buffer pointers) is comparatively large when measured against a 
> single event, and much less so when spread over the course of a 'long' 
> running application.
> 
> The default behaviour is to block main program execution until the 
> registration completes or times out. In many cases you may want to 
> disable that timeout for quicker startup at the cost of potentially 
> losing event(s) right at the beginning. C.f. LTTNG_UST_REGISTER_TIMEOUT 
> in https://lttng.org/man/3/lttng-ust/v2.13/#doc-_environment_variables
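
As a sketch of how that variable could be applied: liblttng-ust reads its 
environment while the library is being loaded, so as far as I can tell 
the variable has to be set before the traced process starts rather than 
from its main(). The small launcher below does that. Its function names 
are hypothetical, and the value 0 ("do not wait", in milliseconds) is 
taken from the man page linked above.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Ask liblttng-ust not to wait for registration with the session daemon.
 * Per lttng-ust(3) the value is in milliseconds and 0 means "do not wait".
 * Returns 0 on success, -1 on error. (Hypothetical helper.)
 */
int disable_ust_register_wait(void)
{
	return setenv("LTTNG_UST_REGISTER_TIMEOUT", "0", 1);
}

/*
 * Hypothetical launcher: set the variable, then replace this process with
 * the instrumented application named by the NULL-terminated argv.
 * Only returns (with -1) on failure.
 */
int exec_without_register_wait(char *const argv[])
{
	if (argv == NULL || argv[0] == NULL)
		return -1;
	if (disable_ust_register_wait() != 0) {
		perror("setenv");
		return -1;
	}
	execvp(argv[0], argv);
	perror("execvp");
	return -1;
}
```

A main() that passes &argv[1] to exec_without_register_wait() turns this 
into a wrapper command. Setting the variable in the shell or in the 
service file that starts the application has the same effect.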
> 
>> 150ns on an x86 machine. Hence, my question is: Is it possible to 
>> disable recording the CPU somehow? Any suggestions for decreasing the 
>> overhead other than this are welcome.
> 
> It is always enabled c.f. 
> https://lttng.org/man/3/lttng-ust/v2.13/#doc-_context_information
> 
> However, I suppose you could try to use a custom getcpu plugin that 
> returns a dummy value, e.g. 
> https://github.com/lttng/lttng-ust/tree/master/doc/examples/getcpu-override
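
A minimal sketch of such a plugin, modelled on that example directory as 
I remember it, is below. Please check the entry-point name 
(lttng_ust_getcpu_plugin_init), the header and lttng_ust_getcpu_override() 
against the example shipped with your LTTng-UST version. The name 
dummy_getcpu and the __has_include guard are my own additions; the guard 
is only there so the file still compiles on a machine without the 
LTTng-UST headers.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Dummy callback: report CPU 0 no matter where the thread is running. */
int dummy_getcpu(void)
{
	return 0;
}

/*
 * Assumption: this guard is not in the upstream example. A real plugin
 * would include the header unconditionally.
 */
#if defined(__has_include)
#if __has_include(<lttng/ust-getcpu.h>)
#include <lttng/ust-getcpu.h>

/* Entry point that liblttng-ust looks up in the plugin with dlsym(). */
void lttng_ust_getcpu_plugin_init(void);
void lttng_ust_getcpu_plugin_init(void)
{
	int ret = lttng_ust_getcpu_override(dummy_getcpu);

	if (ret) {
		fprintf(stderr, "getcpu override failed: %s\n", strerror(-ret));
		exit(EXIT_FAILURE);
	}
}
#endif
#endif
```

The plugin is built as a shared object, and its path is given to the 
traced application through the LTTNG_UST_GETCPU_PLUGIN environment 
variable. Note that with per-CPU buffers I would expect a constant return 
value to send every event to the CPU 0 ring buffer, so threads running on 
both cores would contend for it. It is worth measuring before and after.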
> 
> If you detailed your tracing and benchmark setup it might be possible to 
> provide additional guidance.
> 
>>
>> Regards,
>> Aditya
>>
>> _______________________________________________
>> lttng-dev mailing list
>> lttng-dev at lists.lttng.org
>> https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev
> 
> thanks,
> kienan
> 

