<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=us-ascii">
<style type="text/css" style="display:none;"> P {margin-top:0;margin-bottom:0;} </style>
</head>
<body dir="ltr">
<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
Yes, we are joining only the threads that belong to the application. The timeout occurs while joining threads that the application itself started.</div>
<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
<br>
</div>
<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
Regards,</div>
<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
Lakshmi</div>
<div id="appendonsend"></div>
<hr style="display:inline-block;width:98%" tabindex="-1">
<div id="divRplyFwdMsg" dir="ltr"><font face="Calibri, sans-serif" style="font-size:11pt" color="#000000"><b>From:</b> Kienan Stewart &lt;kstewart@efficios.com&gt;<br>
<b>Sent:</b> 13 February 2024 20:50<br>
<b>To:</b> Lakshmi Deverkonda &lt;laksd@nvidia.com&gt;; lttng-dev@lists.lttng.org &lt;lttng-dev@lists.lttng.org&gt;<br>
<b>Subject:</b> Re: [lttng-dev] Crash in application due to watchdog timeout with python3 lttng</font>
<div> </div>
</div>
<div class="BodyFragment"><font size="2"><span style="font-size:11pt;">
<div class="PlainText">
Hi Lakshmi,<br>
<br>
when the lttngust python agent starts, it attempts to connect to one or<br>
more session daemons[1].<br>
<br>
Each connection starts a thread that loops forever, retrying the<br>
registration in case an exception occurs[2].<br>
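<br>
As a rough illustration of why that matters for `join()` (a generic sketch,<br>
not the lttngust code: `register_forever` and the simulated<br>
`ConnectionError` are made up for the example), a thread whose target never<br>
returns can only be joined with a timeout, and it is still alive afterwards:<br>
<pre>
```python
import threading
import time


def register_forever(attempts):
    # Hypothetical stand-in for a registration loop: it never returns.
    while True:
        try:
            attempts.append(time.monotonic())
            # Simulate the session daemon being unreachable.
            raise ConnectionError("session daemon not reachable")
        except ConnectionError:
            time.sleep(0.01)  # back off briefly, then retry


attempts = []
agent_like = threading.Thread(
    target=register_forever,
    args=(attempts,),
    name="agent-like",
    daemon=True,  # assumption: daemon thread, so it does not block exit
)
agent_like.start()

# join() without a timeout would block forever on this thread.
# A bounded join returns, and the thread is still running afterwards.
agent_like.join(timeout=0.1)
still_running = agent_like.is_alive()
print("still running after join:", still_running)
```
</pre>
Because the sketch marks the thread as a daemon thread, the interpreter can<br>
exit without waiting for it.<br>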
<br>
I don't think it's designed to have `join()` called on those<br>
threads, which I assume is happening in some of the code you or your<br>
team have written.<br>
<br>
My initial thought is that you should `join()` only the threads that<br>
are pertinent to your application, ignoring the lttngust agent threads, and<br>
then exit the application as normal.<br>
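<br>
One way to do that, sketched under assumptions (`AppThreads`, `worker` and<br>
the thread names are hypothetical, not part of lttngust or of your code), is<br>
to keep an explicit list of the threads the application starts and join only<br>
those, rather than iterating over `threading.enumerate()`, which returns<br>
every live thread in the process:<br>
<pre>
```python
import threading


class AppThreads:
    """Hypothetical registry of threads the application itself started."""

    def __init__(self):
        self.stop = threading.Event()
        self._threads = []

    def start(self, target, name):
        t = threading.Thread(target=target, args=(self.stop,), name=name)
        self._threads.append(t)
        t.start()
        return t

    def join_all(self, timeout=2.0):
        # Ask the workers to finish, then join only the tracked threads.
        self.stop.set()
        stuck = []
        for t in self._threads:
            t.join(timeout)
            if t.is_alive():
                stuck.append(t.name)  # report instead of blocking forever
        return stuck


def worker(stop):
    # Wake up periodically so a stop request is noticed promptly.
    while not stop.wait(0.05):
        pass


app = AppThreads()
app.start(worker, "worker-1")
app.start(worker, "worker-2")
stuck = app.join_all()
print("threads that did not exit:", stuck)
```
</pre>
The bounded `join(timeout)` plus the `is_alive()` check also names any<br>
application thread that fails to stop, which may help narrow down which<br>
thread the watchdog is waiting on.<br>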
<br>
[1]:<br>
<a href="https://github.com/lttng/lttng-ust/blob/3287f48be61ef3491aff0a80b7185ac57b3d8a5d/src/python-lttngust/lttngust/agent.py#L334">https://github.com/lttng/lttng-ust/blob/3287f48be61ef3491aff0a80b7185ac57b3d8a5d/src/python-lttngust/lttngust/agent.py#L334</a><br>
[2]:<br>
<a href="https://github.com/lttng/lttng-ust/blob/3287f48be61ef3491aff0a80b7185ac57b3d8a5d/src/python-lttngust/lttngust/agent.py#L83">https://github.com/lttng/lttng-ust/blob/3287f48be61ef3491aff0a80b7185ac57b3d8a5d/src/python-lttngust/lttngust/agent.py#L83</a><br>
<br>
thanks,<br>
kienan<br>
<br>
On 2/13/24 09:23, Lakshmi Deverkonda via lttng-dev wrote:<br>
> Hi,<br>
><br>
> We are able to integrate the python3 lttng module into our (python3-based)<br>
> application. However, whenever the application terminates, a watchdog<br>
> timeout occurs because joining the threads times out. What could be the<br>
> reason for this? Does the lttng module hold any thread event locks?<br>
> We are completely blocked on this issue. Could you please help?<br>
><br>
> Here is the snippet of the core dump<br>
><br>
> (gdb) py-bt<br>
> Traceback (most recent call first):<br>
>    File "/usr/lib/python3.7/threading.py", line 1048, in<br>
> _wait_for_tstate_lock<br>
>      elif lock.acquire(block, timeout):<br>
>    File "/usr/lib/python3.7/threading.py", line 1032, in join<br>
>      self._wait_for_tstate_lock()<br>
>    File "/usr/lib/python3/dist-packages/h.py", line 231, in JoinThreads<br>
>      self.TT.join()<br>
>    File "/usr/sbin/c", line 1466, in do_exit<br>
>      H.JoinThreads()<br>
>    File "/usr/sbin/c", line 7201, in main<br>
>      do_exit(nlm, status)<br>
>    File "/usr/sbin/c", line 7233, in &lt;module&gt;<br>
>      main()<br>
> (gdb)<br>
><br>
> On a parallel note, thanks to Kienan who has been trying to provide<br>
> pointers on various issues reported so far.<br>
><br>
> Need help on this issue as well.<br>
> Thanks in advance,<br>
><br>
> Regards,<br>
> Lakshmi<br>
><br>
><br>
><br>
> _______________________________________________<br>
> lttng-dev mailing list<br>
> lttng-dev@lists.lttng.org<br>
> <a href="https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev">https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev</a><br>
</div>
</span></font></div>
</body>
</html>