[ltt-dev] LTT UserSpace Tracer, broken?
Pierre-Marc Fournier
pierre-marc.fournier at polymtl.ca
Fri May 28 12:51:46 EDT 2010
On 05/24/2010 01:11 PM, jpaul at gdrs.com wrote:
> Thanks Pierre-Marc. That will teach me to post something to a public board without double checking the interface first. The "3" below was a cut/paste issue from some glibc code (sched_getcpu.c) and I've replaced that coding line with:
>
> int r = syscall(SYS_getcpu,&cpu);
>
> I've verified the proper operation of the above call in a separate test program. I've rebuilt everything after making that change. Unfortunately, that does not get rid of the segmentation fault with usttrace:
>
> # usttrace ./ustTest
> /usr/local/bin/usttrace: line 156: 20724 Segmentation fault $CMD 2>&1
> Waiting for ustd to shutdown...
> Trace was output in: /root/.usttraces/machineName-20100524100514656225139
>
> Nor does this resolve the issue with the application seg-faulting with ustd:
>
> # export UST_AUTOPROBE=1
> # gcc -o ustTest ustTest.c -lust
> # mkdir /tmp/trace <- ust-app-socks already present
> # ustd&
> # ./ustTest&
>
> # ustctl --create-trace 20798
> # ustctl --start-trace 20798
>
> libustcomm[20795/20812]: Error: connect (path=/tmp/ust-app-socks/20798): Connection refused (in ustcomm_connect_path() at ustcomm.c:581)
> ustd[20795/20812]: Warning: unable to connect to process, it probably died before we were able to connect (in connect_buffer() at ustd.c:250)
> ustd[20795/20812]: Error: failed to connect to buffer (in consumer_thread() at ustd.c:581)
> libustcomm[20795/20813]: Error: connect (path=/tmp/ust-app-socks/20798): Connection refused (in ustcomm_connect_path() at ustcomm.c:581)
> ustd[20795/20813]: Warning: unable to connect to process, it probably died before we were able to connect (in connect_buffer() at ustd.c:250)
> ustd[20795/20813]: Error: failed to connect to buffer (in consumer_thread() at ustd.c:581)
> libustcomm[20795/20814]: Error: connect (path=/tmp/ust-app-socks/20798): Connection refused (in ustcomm_connect_path() at ustcomm.c:581)
> ustd[20795/20814]: Warning: unable to connect to process, it probably died before we were able to connect (in connect_buffer() at ustd.c:250)
> ustd[20795/20814]: Error: failed to connect to buffer (in consumer_thread() at ustd.c:581)
> libustcomm[20795/20815]: Error: connect (path=/tmp/ust-app-socks/20798): Connection refused (in ustcomm_connect_path() at ustcomm.c:581)
> ustd[20795/20815]: Warning: unable to connect to process, it probably died before we were able to connect (in connect_buffer() at ustd.c:250)
> ustd[20795/20815]: Error: failed to connect to buffer (in consumer_thread() at ustd.c:581)
> ustd[20795/20810]: Error: failed to connect to buffer (in consumer_thread() at ustd.c:581)
> ustd[20795/20811]: Error: failed to connect to buffer (in consumer_thread() at ustd.c:581)
> [6]+ Segmentation fault ./ustTest
>
> # ls /tmp/ust-app-socks/
> 20798 ustd
>
> I'm guessing that ustd is complaining as my test application dumped and is no longer active. Looking at a core dump of my test app, it appears that the seg fault occurred at the following line of _rcu_read_unlock():
>
> _STORE_SHARED(rcu_reader->ctr, rcu_reader->ctr - RCU_GP_COUNT);
>
> Which was called from ltt_vtrace(). But that only seems to fail when the syscall(getcpu) returns with a -1. I actually changed ltt_vtrace() code as follows:
>
> {
> // cpu = ust_get_cpu();
> int r = syscall(SYS_getcpu,&cpu);
> if (r == -1)
> cpu = r;
> if (cpu == -1)
> printf(".. invalid cpu %s (%d)\n", strerror(errno), errno);
> }
>
> And had the following print out:
>
> .. invalid cpu Bad address (14)
>
> So ... it appears that something isn't working correctly to make that
> syscall here. Not really sure why this is failing .. maybe a thread
> related issue? It doesn't fail every time. Maybe best to upgrade the
> latest glibc and try again with the inline methods? It is important
> to note that the following code comes directly from glibc-2.11 and
> sched_getcpu() can return a -1 upon a failed INLINE_SYSCALL. Would
> suggest that ltt_vtrace() be changed to properly handle a -1 cpu
> value:
>
I believe the above code is failing some of the time with "Bad address"
(EFAULT) because some pointers are missing in the call. You pass only 1
argument, and the getcpu system call takes 3: pointers for the cpu, the
node, and the cache.
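
For reference, here is a minimal sketch of the three-argument form of the raw call, assuming a Linux system whose headers define SYS_getcpu. The helper name raw_getcpu is hypothetical and not part of UST. Any of the three pointers may be NULL when the caller does not need that value:

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <unistd.h>
#include <sys/syscall.h>

/* Hypothetical helper, not part of UST: returns the current CPU, or -1. */
static int raw_getcpu(void)
{
	unsigned int cpu = 0, node = 0;

	/* getcpu takes three pointers: cpu, node, and the (unused) cache.
	 * Passing fewer arguments leaves the missing ones undefined. */
	if (syscall(SYS_getcpu, &cpu, &node, NULL) == -1)
		return -1;
	return (int)cpu;
}
```

With all three arguments supplied, the kernel no longer receives stray values for the node and cache pointers.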

I am not too enthusiastic about the idea of adding error checking for the
getcpu call. The call should never fail, and this is in the critical path
of the tracer.

I would consider a patch with some preprocessor logic that chooses the
right call based on the one available on the system. However, this patch
must take into account the latest kernels, which provide getcpu through
the vDSO.
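
A rough sketch of what such preprocessor logic could look like, not a tested patch. It assumes a build-time macro HAVE_SCHED_GETCPU (for example, set by a configure check; glibc does not define it), and the function name example_get_cpu is hypothetical:

```c
#define _GNU_SOURCE
#include <sched.h>
#include <stddef.h>
#include <unistd.h>
#include <sys/syscall.h>

/* HAVE_SCHED_GETCPU is an assumed build-time macro, e.g. from a
 * configure check. It is not defined by glibc itself. */
static inline int example_get_cpu(void)
{
#if defined(HAVE_SCHED_GETCPU)
	/* libc wrapper; can use the vDSO where kernel and libc support it. */
	return sched_getcpu();
#elif defined(SYS_getcpu)
	unsigned int cpu = 0;

	/* Fallback: a real system call. Node and cache are not needed. */
	if (syscall(SYS_getcpu, &cpu, NULL, NULL) == -1)
		return -1;
	return (int)cpu;
#else
	/* Assumption for this sketch: no way to ask, so report CPU 0. */
	return 0;
#endif
}
```

Preferring the libc wrapper when it exists lets the vDSO path be used where available, and the raw system call stays as a fallback for older libcs.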

By the way, you will get a considerable performance penalty with this old
libc. UST tries very hard not to make system calls in the tracing
critical path because they are slow. Recent kernels and glibc versions
provide getcpu/sched_getcpu through the vDSO, which helps a lot. If you
are doing a real system call in the tracing path, this will result in a
penalty.
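
To get a feeling for the difference on a given machine, something like the following micro-benchmark sketch could be used. It assumes a glibc that provides sched_getcpu() and headers that define SYS_getcpu. The helper name ns_per_call is hypothetical:

```c
#define _GNU_SOURCE
#include <sched.h>
#include <stddef.h>
#include <time.h>
#include <unistd.h>
#include <sys/syscall.h>

/* Hypothetical helper: average nanoseconds per call over iters calls.
 * use_syscall != 0 times the raw system call, 0 times sched_getcpu(). */
static double ns_per_call(int use_syscall, long iters)
{
	struct timespec t0, t1;
	unsigned int cpu = 0;
	volatile unsigned int sink = 0; /* keeps the calls from being dropped */
	long i;

	clock_gettime(CLOCK_MONOTONIC, &t0);
	for (i = 0; i < iters; i++) {
		if (use_syscall)
			sink += (unsigned int)syscall(SYS_getcpu, &cpu, NULL, NULL);
		else
			sink += (unsigned int)sched_getcpu();
	}
	clock_gettime(CLOCK_MONOTONIC, &t1);
	(void)sink;

	return ((t1.tv_sec - t0.tv_sec) * 1e9 +
		(t1.tv_nsec - t0.tv_nsec)) / (double)iters;
}
```

Comparing ns_per_call(0, 1000000) with ns_per_call(1, 1000000) gives a rough per-call cost for each path. The numbers depend on the kernel, the libc, and the architecture.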
pmf