[lttng-dev] Problem with UST related to dlload

Gerlando Falauto gerlando.falauto at keymile.com
Tue May 27 07:55:04 EDT 2014


Hi Martin,

I have been struggling for a while with this issue (see the whole thread):

http://lists.lttng.org/pipermail/lttng-dev/2014-May/023035.html

and reached the same conclusions as you (I found your message by
searching for __start___tracepoints_ptrs!).
So at least you're not alone!

So, did you ever manage to get answers to any of your questions?

 >> 1) Have you run into a problem like this? Is there a known fix/workaround?
 >> 2) __start___tracepoints_ptrs is declared as extern in tracepoint.h, but it
 >> is not defined. This appears to be some sort of undocumented linker magic.
 >> http://gcc.gnu.org/ml/gcc-help/2010-04/msg00120.html is the only reference I
 >> could find. Do you know where this behavior is documented or specified (if
 >> at all)?
 >> 3) Do you know why the symbol visibility for __start___tracepoints_ptrs
 >> changed between 4.6.4 and 4.7.2?

Thank you so much!
Gerlando

BTW, I'm also running GCC 4.7.2 (lttng-ust is cross-compiled, the test 
application is natively compiled).

On an x86_64 host running either GCC 4.4.6 or 4.4.7, the issue is not 
observed.


On 04/30/2014 11:57 PM, Martin Ünsal wrote:
> Incidentally I also asked for help on the GNU linker-specific part
> (question 2) here:
>
> http://gcc.gnu.org/ml/gcc-help/2014-04/msg00164.html
>
> Martin
>
>
> On Wed, Apr 30, 2014 at 2:21 PM, Martin Ünsal <martinunsal at gmail.com> wrote:
>> Hi LTTng folks
>>
>> I have a strange problem using LTTng-UST on an ARM based platform. I have
>> done some diagnosis but I am running low on ideas and was hoping for help
>> from the experts. I am using lttng-tools 2.2.0, lttng-ust 2.2.0, liburcu
>> 0.8.1. I know these are old but upgrading is easier said than done
>> unfortunately. I didn't see anything related to this problem in relnotes,
>> mailing list traffic, or master branch, but I could have missed something.
>>
>> The problem showed up when I switched from GCC 4.6.4 to 4.7.2. Conceptually,
>> the situation is that I have a single executable, call it MyProgram, with
>> two plugins loaded at runtime with dlopen(), let's call them libPlugin1.so
>> and libPlugin2.so. There are three different LTTng-UST tracepoint providers,
>> one each for the executable and the two plugins. With GCC 4.7.2, tracepoints
>> in libPlugin1 stopped working. The tracepoints in MyProgram and in
>> libPlugin2 continue to work correctly.
>>
>> I have established without a doubt that the toolchain upgrade is the cause
>> of the regression.
>>
>> In the debugger, I confirmed that the tracepoint for libPlugin1.so is being
>> executed, but __tracepoint_##provider##___##name.state is always 0 even when
>> I enable the tracepoint in lttng-tools. As a result the tracepoint callback
>> is not being invoked when it should be. In MyProgram and libPlugin2.so, the
>> .state variable correctly reflects whether the tracepoint is enabled, and if
>> the tracepoint is enabled, the tracepoint callback is invoked.
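
To make the symptom above concrete, here is a minimal sketch of a call site that is gated on a per-tracepoint state flag. It is not the lttng-ust implementation: the struct, the GATE_TP* macros and the gate_* functions are hypothetical names invented for illustration, and only the token-pasting pattern and the "state must be non-zero" check are taken from the description above.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for a per-tracepoint descriptor. */
struct gate_tracepoint {
    const char *name;
    int state;              /* 0 = disabled, non-zero = enabled */
};

/* Defines gate_tp_<provider>___<name> by token pasting. */
#define GATE_TP_DEFINE(provider, name) \
    struct gate_tracepoint gate_tp_##provider##___##name = \
        { #provider ":" #name, 0 }

/* Calls cb(arg) only while the descriptor's state is non-zero. */
#define GATE_TP(provider, name, cb, arg) \
    do { \
        if (gate_tp_##provider##___##name.state) \
            (cb)(arg); \
    } while (0)

GATE_TP_DEFINE(plugin1, event);

static int gate_hits;       /* number of times the callback ran */

static void gate_callback(int value)
{
    gate_hits += 1;
    printf("callback ran with %d\n", value);
}

/* Executes the call site once and returns the running callback count. */
int gate_fire(int value)
{
    GATE_TP(plugin1, event, gate_callback, value);
    return gate_hits;
}

/* Flips the state flag, which enabling the event is expected to do. */
void gate_set_state(int enabled)
{
    assert(enabled == 0 || enabled == 1);
    gate_tp_plugin1___event.state = enabled;
}
```

If nothing ever sets the flag on this particular descriptor, gate_fire() executes the call site but the callback never runs, which matches what the debugger session shows for libPlugin1.so.
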
>>
>> Next I set a breakpoint in tracepoint_register_lib() and looked at the
>> tracepoints_start parameter.
>>
>> 1) With GCC 4.6.4 everything is as expected:
>>     a) tracepoint_register_lib() for MyProgram called with
>> MyProgramProvider's __start___tracepoints_ptrs.
>>     b) tracepoint_register_lib() after libPlugin1 dlopen() called with
>> libPlugin1Provider's __start___tracepoints_ptrs
>>     c) tracepoint_register_lib() after libPlugin2 dlopen() called with
>> libPlugin2Provider's __start___tracepoints_ptrs
>>
>> 2) With GCC 4.7.2 there is a problem:
>>     a) tracepoint_register_lib() for MyProgram called with
>> MyProgramProvider's __start___tracepoints_ptrs.
>>     b) tracepoint_register_lib() after libPlugin1 dlopen() called with
>> MyProgramProvider's __start___tracepoints_ptrs (!!!! THIS IS WRONG !!!!)
>>     c) tracepoint_register_lib() after libPlugin2 dlopen() called with
>> libPlugin2Provider's __start___tracepoints_ptrs
>>
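
The sequence above can be modelled in a few lines to show why a wrong start pointer leaves the plugin's state at 0. This is a simplified sketch under stated assumptions: reg_register_lib() is a hypothetical analogue of tracepoint_register_lib(), not its real signature or behaviour, and the two "modules" are plain arrays in one translation unit rather than separately linked objects.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct reg_tracepoint { const char *name; int state; };

#define REG_MAX_LIBS 8

/* One registered pointer table per module (hypothetical registry). */
struct reg_lib {
    struct reg_tracepoint *const *start;
    size_t count;
};

static struct reg_lib reg_libs[REG_MAX_LIBS];
static size_t reg_nlibs;

/* Remembers one module's pointer table; returns 0 on success, -1 if full. */
int reg_register_lib(struct reg_tracepoint *const *start, size_t count)
{
    assert(start != NULL);
    if (reg_nlibs == REG_MAX_LIBS)
        return -1;
    reg_libs[reg_nlibs].start = start;
    reg_libs[reg_nlibs].count = count;
    reg_nlibs++;
    return 0;
}

/* Enables every registered descriptor with this name; returns the count. */
size_t reg_enable(const char *name)
{
    size_t hits = 0;
    for (size_t i = 0; i < reg_nlibs; i++) {
        for (size_t j = 0; j < reg_libs[i].count; j++) {
            struct reg_tracepoint *tp = reg_libs[i].start[j];
            if (strcmp(tp->name, name) == 0) {
                tp->state = 1;
                hits++;
            }
        }
    }
    return hits;
}

static struct reg_tracepoint reg_tp_main = { "main:event", 0 };
static struct reg_tracepoint reg_tp_plugin1 = { "plugin1:event", 0 };
static struct reg_tracepoint *const reg_main_table[] = { &reg_tp_main };
static struct reg_tracepoint *const reg_plugin1_table[] = { &reg_tp_plugin1 };

/* Runs before main(), standing in for the executable's constructor. */
static void reg_main_ctor(void) __attribute__((constructor));
static void reg_main_ctor(void)
{
    reg_register_lib(reg_main_table, 1);
}

/* Simulates plugin1's constructor; use_wrong_table mimics case 2b. */
int reg_plugin1_load(int use_wrong_table)
{
    return reg_register_lib(
        use_wrong_table ? reg_main_table : reg_plugin1_table, 1);
}

int reg_plugin1_state(void)
{
    return reg_tp_plugin1.state;
}
```

With reg_plugin1_load(1) the registry never sees plugin1's descriptor, so reg_enable("plugin1:event") finds nothing and the state stays 0. After reg_plugin1_load(0) the same call sets it to 1.
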
>> I looked at the symbol table for libPlugin1.so to see if it would shed some
>> light on the problem.
>>
>> 1) With GCC 4.6.4:
>> # objdump -t /usr/lib/.debug/libPlugin1.so | grep __start___tracepoints_ptrs
>> 00025bb0 l       *ABS* 00000000 __start___tracepoints_ptrs
>> # objdump -t /usr/lib/.debug/libPlugin2.so | grep __start___tracepoints_ptrs
>> 00041eb4 l       *ABS* 00000000 __start___tracepoints_ptrs
>>
>> 2) With GCC 4.7.2:
>> # objdump -t /usr/lib/.debug/libPlugin1.so | grep __start___tracepoints_ptrs
>> 00025a90 g       __tracepoints_ptrs 00000000 __start___tracepoints_ptrs
>> # objdump -t /usr/lib/.debug/libPlugin2.so | grep __start___tracepoints_ptrs
>> 00041eb4 g       __tracepoints_ptrs 00000000 __start___tracepoints_ptrs
>>
>> My hypothesis at this point is that since __start___tracepoints_ptrs changed
>> from a local to a global symbol, the dynamic loader no longer knows how to
>> select the correct weak symbol. I cannot explain why libPlugin2 still loads
>> its provider correctly; perhaps it is just getting lucky.
>>
>> A few questions come to mind...
>> 1) Have you run into a problem like this? Is there a known fix/workaround?
>> 2) __start___tracepoints_ptrs is declared as extern in tracepoint.h, but it
>> is not defined. This appears to be some sort of undocumented linker magic.
>> http://gcc.gnu.org/ml/gcc-help/2010-04/msg00120.html is the only reference I
>> could find. Do you know where this behavior is documented or specified (if
>> at all)?
>> 3) Do you know why the symbol visibility for __start___tracepoints_ptrs
>> changed between 4.6.4 and 4.7.2?
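
On question 2, the "linker magic" can be reproduced in isolation. As far as I know, GNU ld provides __start_SECNAME and __stop_SECNAME for an output section whose name is a valid C identifier, which is why the symbols can be declared extern without a C definition. The sketch below assumes an ELF target with a GNU-compatible toolchain; the section name sect_tp_ptrs and all sect_* identifiers are made up for illustration and are not from lttng-ust.

```c
#include <assert.h>
#include <stddef.h>

struct sect_tracepoint { const char *name; int state; };

static struct sect_tracepoint sect_tp_a = { "demo:a", 0 };
static struct sect_tracepoint sect_tp_b = { "demo:b", 0 };

/* Each pointer is emitted into the section named "sect_tp_ptrs". */
static struct sect_tracepoint *sect_tp_a_ptr
    __attribute__((used, section("sect_tp_ptrs"))) = &sect_tp_a;
static struct sect_tracepoint *sect_tp_b_ptr
    __attribute__((used, section("sect_tp_ptrs"))) = &sect_tp_b;

/* Declared here but never defined in C: the linker supplies both. */
extern struct sect_tracepoint *__start_sect_tp_ptrs[];
extern struct sect_tracepoint *__stop_sect_tp_ptrs[];

/* Number of pointers the linker collected into the section. */
size_t sect_count(void)
{
    return (size_t)(__stop_sect_tp_ptrs - __start_sect_tp_ptrs);
}

/* Walks the section and enables every descriptor it points to. */
size_t sect_enable_all(void)
{
    size_t n = 0;
    for (struct sect_tracepoint **p = __start_sect_tp_ptrs;
         p < __stop_sect_tp_ptrs; p++) {
        assert(*p != NULL);
        (*p)->state = 1;
        n++;
    }
    return n;
}

/* Sum of both state flags, for checking the walk reached each one. */
int sect_state_sum(void)
{
    return sect_tp_a.state + sect_tp_b.state;
}
```

Every executable or shared object that contains such a section gets its own pair of start/stop symbols, so which copy a reference binds to depends on symbol binding and visibility. GCC's visibility("hidden") attribute on the two extern declarations is one general ELF technique for keeping such symbols out of a shared object's dynamic symbol table. Whether that addresses the behavior reported in this thread is not established here.
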
>>
>> Thanks for any help. This is a real puzzler for me.
>>
>> Martin
>>