<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<style type="text/css" style="display:none;"> P {margin-top:0;margin-bottom:0;} </style>
</head>
<body dir="ltr">
<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
Hi, </div>
<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
<br>
</div>
<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
Sorry for not replying on this thread sooner. We have tried to handle these issues in our application code.</div>
<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
One question: what is the right way to delete the LTTng logger from the code so that it stops logging events?</div>
<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
<br>
</div>
<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
<br>
</div>
<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
Regards,</div>
<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
Lakshmi</div>
<div id="appendonsend"></div>
<hr style="display:inline-block;width:98%" tabindex="-1">
<div id="divRplyFwdMsg" dir="ltr"><font face="Calibri, sans-serif" style="font-size:11pt" color="#000000"><b>From:</b> Kienan Stewart <kstewart@efficios.com><br>
<b>Sent:</b> 16 February 2024 22:11<br>
<b>To:</b> Lakshmi Deverkonda <laksd@nvidia.com>; lttng-dev@lists.lttng.org <lttng-dev@lists.lttng.org><br>
<b>Subject:</b> Re: [lttng-dev] Crash in application due to watchdog timeout with python3 lttng</font>
<div> </div>
</div>
<div class="BodyFragment"><font size="2"><span style="font-size:11pt;">
<div class="PlainText">External email: Use caution opening links or attachments<br>
<br>
<br>
Hi Lakshmi,<br>
<br>
On 2/16/24 09:33, Lakshmi Deverkonda wrote:<br>
> This is how we created the loggers. The first logger is for file<br>
> logging, whereas the second one is for LTTng.<br>
><br>
> self.logger = logging.getLogger('cd')<br>
> self.lttng_logger = logging.getLogger('cd-lttng')<br>
><br>
> It seems that if LTTng is logging some data on a particular thread at<br>
> the exact instant we receive SIGTERM for the application,<br>
> we are unable to join that particular thread. Can you please help?<br>
><br>
<br>
This doesn't constitute a usable reproducer for us. You are also<br>
omitting information on the setup and usage of lttng on your system.<br>
<br>
I get the impression you are not in a position to share your code.<br>
EfficiOS offers support contracts with NDAs that could allow us to work<br>
with you to analyze and improve your use of LTTng. For more info, please<br>
feel free to contact sales@efficios.com.<br>
<br>
> We also see that the performance of LTTng is not that good for python3. My<br>
> application has around 24 threads, and when logging is enabled for each<br>
> of the threads,<br>
> there is a delay of up to 24 s in processing the external events.<br>
> Please suggest how to proceed further on these issues.<br>
<br>
Could you describe what you mean by 'processing external events'?<br>
<br>
Which system(s) are involved in processing the events?<br>
<br>
Are the 'external events' the events emitted by invoking<br>
`self.lttng_logger.info('...')`, for example?<br>
<br>
What versions of lttng-tools, lttng-ust, urcu, babeltrace, and python3<br>
are you using? Are you using a relay-daemon at any point?<br>
<br>
How are your lttng sessions configured? E.g., memory allocation, blocking<br>
settings, behaviour on full buffers, etc. The commands you use to create<br>
the session, enable the channels, and activate the events would be great<br>
information to have.<br>
<br>
While performing the logging, is the system under heavy load from other<br>
sources? What resources on the system face the most contention (CPU, IO,<br>
memory, ...)?<br>
<br>
We'd be more than happy to analyze the performance of python-lttngust<br>
and work to make improvements so it can meet your needs under a<br>
development contract. For more information, please reach out to<br>
sales@efficios.com.<br>
<br>
><br>
> Regards,<br>
> Lakshmi<br>
<br>
I have taken the time to invent a fictitious example based on the few<br>
details you have given:<br>
<a href="https://gist.github.com/kienanstewart/879bd3bf19d852653b70a3c42caef361">https://gist.github.com/kienanstewart/879bd3bf19d852653b70a3c42caef361</a><br>
which spawns a number of python threads using the threading module.<br>
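<br>
For readers without access to the gist, here is a minimal sketch of what a<br>
program along those lines could look like. It is not the contents of the<br>
gist: the names `worker`, `run`, `main`, `stop` and `app_threads`, the thread<br>
count, and the 1 ms pause are all assumptions made for illustration. It keeps<br>
its own list of application threads and joins only those, leaving any<br>
threads started by the lttngust agent alone. The `import lttngust` line is<br>
wrapped in try/except so that the sketch still runs where the package is<br>
not installed (no events are recorded in that case).<br>
<br>
<pre>
```python
import logging
import signal
import threading

try:
    import lttngust  # noqa: F401  (imported for its side effect: the agent starts)
except ImportError:
    lttngust = None  # assumption: fall back to plain logging when not installed

# Set by the SIGTERM handler (or by run() itself) to ask the workers to stop.
stop = threading.Event()


def worker(logger, counts, idx):
    """Log one record per iteration until the stop event is set."""
    while not stop.is_set():
        logger.info("hello from worker %d", idx)
        counts[idx] += 1
        stop.wait(0.001)  # short pause that wakes up early on shutdown


def run(n_threads=4, run_for=None):
    """Start the workers, wait for stop (or run_for seconds), then join them."""
    stop.clear()
    logger = logging.getLogger("tp")  # same name as in `lttng enable-event`
    logger.setLevel(logging.INFO)
    counts = [0] * n_threads
    # Track the application's own threads explicitly instead of joining
    # everything returned by threading.enumerate().
    app_threads = [
        threading.Thread(target=worker, args=(logger, counts, i), name="app-%d" % i)
        for i in range(n_threads)
    ]
    for t in app_threads:
        t.start()
    stop.wait(run_for)  # returns once stop is set, or after run_for seconds
    stop.set()
    for t in app_threads:
        t.join(timeout=5)  # bounded join so shutdown cannot hang forever
    return counts, [t.is_alive() for t in app_threads]


def main():
    """Entry point: SIGTERM only sets the event; the main thread does the joins."""
    signal.signal(signal.SIGTERM, lambda signum, frame: stop.set())
    return run()
```
</pre>
Calling `main()` from the script's entry point runs the workers until<br>
SIGTERM arrives; `run(n_threads=3, run_for=0.2)` gives a short bounded run<br>
that is handy for a quick check.<br>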
<br>
I am using lttng-tools master, lttng-ust master at<br>
47bc09f338f3c1199a878f77b7b18be8d2a224f6, urcu master at<br>
81270292c23ff28aba1abd9a65f0624b657de82b, and babeltrace2 master at<br>
b93af5a2d22e36cf547da1739d60e19791daccbd. My system is running Debian<br>
sid with python 3.11.8.<br>
<br>
To set up a recording session I do the following:<br>
<br>
```<br>
lttng create<br>
lttng enable-event --python 'tp'<br>
lttng start<br>
```<br>
<br>
To run the application, I do the following:<br>
<br>
```<br>
time python3 ./main.py<br>
```<br>
<br>
To quit the application, I send SIGTERM using the following command:<br>
<br>
```<br>
kill $(pgrep -f 'python3 ./main.py')<br>
```<br>
<br>
After the application terminates, I stop the session and view the events:<br>
<br>
```<br>
lttng stop<br>
lttng view<br>
lttng view | wc -l<br>
```<br>
<br>
In a 25s run of the application on my 4-thread laptop, I recorded<br>
1010748 events.<br>
<br>
thanks,<br>
kienan<br>
<br>
><br>
> ------------------------------------------------------------------------<br>
> *From:* Lakshmi Deverkonda <laksd@nvidia.com><br>
> *Sent:* 13 February 2024 21:05<br>
> *To:* Kienan Stewart <kstewart@efficios.com>; lttng-dev@lists.lttng.org<br>
> <lttng-dev@lists.lttng.org><br>
> *Subject:* Re: [lttng-dev] Crash in application due to watchdog timeout<br>
> with python3 lttng<br>
> Yes. We are trying to join only the threads related to the application.<br>
> The timeout is happening while trying to join the threads started by the<br>
> application.<br>
><br>
> Regards,<br>
> Lakshmi<br>
> ------------------------------------------------------------------------<br>
> *From:* Kienan Stewart <kstewart@efficios.com><br>
> *Sent:* 13 February 2024 20:50<br>
> *To:* Lakshmi Deverkonda <laksd@nvidia.com>; lttng-dev@lists.lttng.org<br>
> <lttng-dev@lists.lttng.org><br>
> *Subject:* Re: [lttng-dev] Crash in application due to watchdog timeout<br>
> with python3 lttng<br>
><br>
> Hi Lakshmi,<br>
><br>
> When the lttngust Python agent starts, it attempts to connect to one or<br>
> more session daemons[1].<br>
><br>
> Each connection starts a thread that loops forever, retrying the<br>
> registration in case an exception occurs[2].<br>
><br>
> I don't think the agent is designed to have `join()` called on those<br>
> threads, which I assume is happening in some of the code you or your<br>
> team have written.<br>
><br>
> My initial thought is that you should `join()` only the threads that<br>
> are pertinent to your application, ignoring the lttngust agent threads, and<br>
> then exit the application as normal.<br>
><br>
> [1]:<br>
> <a href="https://github.com/lttng/lttng-ust/blob/3287f48be61ef3491aff0a80b7185ac57b3d8a5d/src/python-lttngust/lttngust/agent.py#L334">https://github.com/lttng/lttng-ust/blob/3287f48be61ef3491aff0a80b7185ac57b3d8a5d/src/python-lttngust/lttngust/agent.py#L334</a><br>
> [2]:<br>
> <a href="https://github.com/lttng/lttng-ust/blob/3287f48be61ef3491aff0a80b7185ac57b3d8a5d/src/python-lttngust/lttngust/agent.py#L83">https://github.com/lttng/lttng-ust/blob/3287f48be61ef3491aff0a80b7185ac57b3d8a5d/src/python-lttngust/lttngust/agent.py#L83</a><br>
><br>
> thanks,<br>
> kienan<br>
><br>
> On 2/13/24 09:23, Lakshmi Deverkonda via lttng-dev wrote:<br>
>> Hi,<br>
>><br>
>> We were able to integrate the python3 lttng module into our (python3-based)<br>
>> application. However, whenever the application terminates, there is a<br>
>> watchdog timeout caused by a timeout in joining the threads. What<br>
>> could be the reason for this? Does the lttng module hold any thread event<br>
>> locks?<br>
>> We are completely blocked on this issue. Could you please help?<br>
>><br>
>> Here is the snippet of the core dump<br>
>><br>
>> (gdb) py-bt<br>
>> Traceback (most recent call first):<br>
>> File "/usr/lib/python3.7/threading.py", line 1048, in<br>
>> _wait_for_tstate_lock<br>
>> elif lock.acquire(block, timeout):<br>
>> File "/usr/lib/python3.7/threading.py", line 1032, in join<br>
>> self._wait_for_tstate_lock()<br>
>> File "/usr/lib/python3/dist-packages/h.py", line 231, in JoinThreads<br>
>> self.TT.join()<br>
>> File "/usr/sbin/c", line 1466, in do_exit<br>
>> H.JoinThreads()<br>
>> File "/usr/sbin/c", line 7201, in main<br>
>> do_exit(nlm, status)<br>
>> File "/usr/sbin/c", line 7233, in <module><br>
>> main()<br>
>> (gdb)<br>
>><br>
>> On a parallel note, thanks to Kienan who has been trying to provide<br>
>> pointers on various issues reported so far.<br>
>><br>
>> Need help on this issue as well.<br>
>> Thanks in advance,<br>
>><br>
>> Regards,<br>
>> Lakshmi<br>
>><br>
>><br>
>><br>
>> _______________________________________________<br>
>> lttng-dev mailing list<br>
>> lttng-dev@lists.lttng.org<br>
>> <a href="https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev">https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev</a><br>
</div>
</span></font></div>
</body>
</html>