[ltt-dev] problems with ust
Sylvain Geneves
sylvain.geneves at inrialpes.fr
Wed Apr 14 05:26:49 EDT 2010
On 04/14/2010 02:33 AM, Pierre-Marc Fournier wrote:
>>> Sylvain Geneves wrote:
>>>>
>>>> I've pulled UST from git yesterday, and noticed some errors, and some
>>>> are confusing me...
>>>>
>>>> first of all, i have a problem for compiling my program (when i include
>>>> marker.h), it gives the following error :
>>>>
>>>> In file included from /usr/local/include/ust/kcompat/kcompat.h:64,
>>>> from /usr/local/include/ust/kernelcompat.h:21,
>>>> from /usr/local/include/ust/marker.h:31,
>>>> from task.lbc.C:4:
>>>> /usr/local/include/ust/kcompat/jhash.h: In function ‘u32 jhash(const
>>>> void*, u32, u32)’:
>>>> /usr/local/include/ust/kcompat/jhash.h:47: error: invalid conversion
>>>> from ‘const void*’ to ‘const u8*’
>>>
>
> Should be fixed in the latest git. I expect the error here was due to
> the usage of strict aliasing rules. This is now disabled in ust because
> the lttng code assumes -fno-strict-aliasing.
>
>
>>>> Here's what i see in the resulting trace directory:
>>>>
>>>> http://pastebin.com/raw.php?i=TqJWarkA
>>>>
>>>> it seems that all traces aren't recorded (some metadata and ust are
>>>> zero size), i can't understand why, i must be missing something here... ?
>>>
>>> -f should fix this.
>>>
>>>>
>>>> also, when using lttv, it says "Cannot open trace : maybe you should
>>>> enter in the trace directory to select it ?"
>>>> i note that lttv can open a subdirectory (like
>>>> /root/.usttraces/californium-20100329172748911948191/8494_5454077942969735172
>>>>
>>>> in my example), but obviously all i can see is a subset of what really
>>>> happened...
>
> I just looked at this more closely.
>
Thanks for your help.
> If your program does not fork (nor clone without the CLONE_VM flag),
> it's still possible you are tracing several processes. This would be
> caused by the fact that the command you are passing to usttrace is not
> an elf executable but rather a shell script that does many things like
> run commands, which starts new processes. For example, a shell script
> that uses find, sed, ls etc. will start them in their distinct process.
> The tracer will trace everything. To find out what these processes are,
> you could run the same command with strace -f and see what is going on.
>
The command I launch is actually an executable, not a script. What it
does is quite simple:
- the application registers some events to be processed
- it creates one thread per core on the machine
- each thread runs a loop that executes events
After all events have been processed, each created thread calls
pthread_exit, and the application exits normally.
The output of strace -f shows that there is no call to fork in there,
but clone is called multiple times.
That leads me to a question about how to correctly enable the CLONE_VM
flag for ust: is enabling it through the shell environment (export
CLONE_VM=1) sufficient?
Is there another way to do it?
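For illustration, here is a minimal sketch of the structure described above, written in Python rather than the application's own language. run_workers and its parameters are hypothetical names, not part of the real program or of ust. It registers events in a queue, starts one thread per core, and lets each thread loop until the queue is empty. On Linux, threads like these are normally created by the threading library through clone() with CLONE_VM already set, which would be consistent with strace -f showing clone and no fork. CLONE_VM is an argument to clone(2) rather than an environment variable.

```python
import os
import queue
import threading


def run_workers(events, n_workers=None):
    """Run every callable in `events` on a pool of threads.

    Returns (results, seen), where `seen` holds one
    (pid, native thread id) pair per worker thread.
    Hypothetical helper for illustration only.
    """
    # One thread per core by default, as in the description above.
    n_workers = n_workers or os.cpu_count() or 1

    # "Register" the events before any thread starts.
    pending = queue.Queue()
    for event in events:
        pending.put(event)

    results = []
    seen = []
    lock = threading.Lock()

    def worker():
        ident = (os.getpid(), threading.get_native_id())
        # Event loop: take events until none are left.
        while True:
            try:
                event = pending.get_nowait()
            except queue.Empty:
                break
            outcome = event()
            with lock:
                results.append(outcome)
        with lock:
            seen.append(ident)
        # Returning from the thread function ends the thread,
        # much like calling pthread_exit in C.

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, seen
```

Every pair in `seen` carries the same pid, because the threads share one process and one address space; only the thread id differs.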
> The fact that some of the files are empty means that the daemon was
> unable to connect to the processes at some point. Indeed, you are
> getting printed errors about this. But it is initially able to because
> we see there is at least one file that is not empty. This is unlikely
> due to the fact the process stops existing too quickly. This could
> happen because of an exec() or an exit(), but there is a keep alive
> mechanism (triggered by ustctl -f in the case of the exec) that induces
> a delay in these calls when the ustd did not have the time to connect to
> all buffers yet.
>
> Either there is a bug in this mechanism, or something crashes. Could be
> a bug in your program or in ust. When the traced program crashes, only
> the buffers that were already connected can be recovered. Enabling core
> dumps (ulimit -c) and checking if there are core files in the directory
> after execution could help finding if you have segfaults or similar
> crashes.
That's a good point.
Nothing crashes explicitly (to be sure, I ran my program after "ulimit
-c unlimited" and there was no resulting core dump).
>
> Did you have a look at ustd.log and app.log in the trace directory? They
> could give additional information.
I had not looked at them closely enough; there are some interesting lines
near the end of the files:
In ustd.log, grepping for "Error" gives multiple lines like this:
ustd[17643/17973]: Error: unable to parse response to get_pidunique (in
connect_buffer() at ustd.c:268)
ustd[17643/17973]: Error: failed to connect to buffer (in
consumer_thread() at ustd.c:581)
libustcomm[17643/17995]: Error: connect (path=/tmp/ust-app-socks/17992):
Connection refused (in ustcomm_connect_path() at ustcomm.c:581)
Grepping for "die" gives a lot of:
ustd[17643/17710]: application died while putting subbuffer (in
consumer_loop() at ustd.c:513)
ustd[17643/17687]: app died while being traced (in get_subbuffer() at
ustd.c:75)
ustd[17643/17745]: For buffer metadata_2, the trace was not found. This
llibustcomm[17643/17747]: sent message "put_subbuffer ust_0 0" (in
send_messagustd[17643/17745]: application died while putting subbuffer
(in consumer_loop() at ustd.c:513)
ustd[17643/17995]: Warning: unable to connect to process, it probably
died before we were able to connect (in connect_buffer() at ustd.c:250)
In app.log, I can see some lines like this:
libust[17924/17928]: Cannot find trace. It was likely destroyed by the
user. (in do_cmd_put_subbuffer() at tracectl.c:717)
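The entries quoted above appear to share one layout: component[N/M]: message (in function() at file:line). A minimal Python sketch that tallies entries by source location could look like the following. It assumes each entry sits on a single line in the real log files (they are wrapped here by the mail client). The field names pid and tid are guesses about what the two bracketed numbers mean, and parse_line and tally_by_location are hypothetical helpers, not part of ust.

```python
import re
from collections import Counter

# Assumed layout: component[pid/tid]: message (in func() at file:line)
LINE_RE = re.compile(
    r"^(?P<component>\w+)\[(?P<pid>\d+)/(?P<tid>\d+)\]: "
    r"(?P<message>.*) "
    r"\(in (?P<func>\w+)\(\) at (?P<file>[\w.]+):(?P<line>\d+)\)$"
)


def parse_line(text):
    """Split one log entry into its fields, or return None if it does not match."""
    match = LINE_RE.match(text.strip())
    if match is None:
        return None
    record = match.groupdict()
    for key in ("pid", "tid", "line"):
        record[key] = int(record[key])
    return record


def tally_by_location(lines):
    """Count matching entries per (function, file, line) source location."""
    counts = Counter()
    for text in lines:
        record = parse_line(text)
        if record is not None:
            counts[(record["func"], record["file"], record["line"])] += 1
    return counts
```

Entries that do not fit the pattern, such as lines where output from two threads is interleaved, are skipped rather than guessed at.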
Sorry not to have seen this before...
From what I understand, ust seems to be seeing my application die
unexpectedly. That seems weird to me because the application appears to
exit gracefully (no segfault or anything else). What could I do to check
that more thoroughly?
Are there some ust calls to add to my application to notify ust that
tracing will stop, or that a thread will exit?
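One possible way to check the exit more thoroughly, sketched here in Python under the assumption that the program can be launched directly (outside usttrace): run it as a child process and look at how it terminated. describe_exit is a hypothetical helper. On POSIX systems a negative returncode from subprocess.run means the child was killed by a signal.

```python
import signal
import subprocess


def describe_exit(argv):
    """Run `argv` as a child process and report how it terminated."""
    returncode = subprocess.run(argv).returncode
    if returncode < 0:
        # Negative value: the child was terminated by signal number -returncode.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            # Signal number without a symbolic name on this platform.
            name = str(-returncode)
        return f"killed by signal {name}"
    return f"exited with status {returncode}"
```

This only reports how the process as a whole ended. It says nothing about the order in which individual threads exited, so it can rule out a signal-induced death but cannot by itself explain the messages in ustd.log.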
>
> It could help to just try to run the raw executable to usttrace. It
> would diminish the complexity of the operation. But it is supposed to
> work even with lots of processes like here.
>
> Thanks
>
> pmf
>