Fun with weak symbols

Consider the following foo.c source file:

extern int bar() __attribute__((weak));
int foo() {
  return bar();
}

And the following bar.c source file:

int bar() {
  return 42;
}

Compile both sources:

$ gcc -o foo.o -c foo.c -fPIC
$ gcc -o bar.o -c bar.c -fPIC

In the resulting object for foo.c, we have an undefined symbol reference to bar. That symbol is marked weak.

In the resulting object for bar.c, the bar symbol is defined and not weak.

What we expect from linking both objects is that the weak reference is fulfilled by the existence of the bar symbol in the second object, and that in the resulting binary, the foo function calls bar.

$ ld -shared -o test1.so foo.o bar.o

And indeed, this is what happens.

$ objdump -T test1.so | grep "\(foo\|bar\)"
0000000000000260 g    DF .text  0000000000000007 foo
0000000000000270 g    DF .text  0000000000000006 bar

What do you think happens if the bar.o object file is embedded in a static library?

$ ar cr libbar.a bar.o
$ ld -shared -o test2.so foo.o libbar.a
$ objdump -T test2.so | grep "\(foo\|bar\)"
0000000000000260 g    DF .text  0000000000000007 foo
0000000000000000  w   D  *UND*  0000000000000000 bar

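The bar symbol is now undefined in test2.so, and all that remains is a weak reference to it. Unless some other object loaded at runtime happens to define bar, that reference resolves to address 0, and calling foo will crash.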

This is apparently a feature of the linker. If anyone knows why, I would be interested to hear about it. Good to know, though.
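
Until the link line is fixed, the caller can at least protect itself. The following is a minimal sketch, not part of the original foo.c: the names foo_or_default and foo_checked are hypothetical, and it assumes GCC (or a compatible compiler) on an ELF platform, where an unresolved weak function reference has address 0.

```c
#include <assert.h>
#include <stddef.h>

/* Weak reference, as in foo.c: the link succeeds even if nothing defines bar. */
extern int bar(void) __attribute__((weak));

/* Graceful variant (hypothetical name): call bar only if the reference was
 * resolved, otherwise return the caller-supplied fallback. */
int foo_or_default(int fallback) {
  if (bar != NULL)
    return bar();
  return fallback;
}

/* Strict variant (hypothetical name): stop with a diagnostic instead of
 * jumping to address 0. The check disappears if NDEBUG is defined. */
int foo_checked(void) {
  assert(bar != NULL && "bar was not linked in");
  return bar();
}
```

With bar.o on the link line, both functions end up calling bar. With only libbar.a, as in the test2.so case, foo_or_default returns its fallback and foo_checked aborts with a message rather than crashing somewhere less obvious.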

2012-02-23 10:46:50+0900



10 Responses to “Fun with weak symbols”

  1. Z.T. Says:

    Did you ask on the binutils mailing list? People like Ian Lance Taylor should know.

  2. Neil Rashbrook Says:

    This is just a wild guess.

    The linker probably only resolves non-weak symbols by extracting archive members. Having extracted those archive members, it can probably use those members to resolve weak symbols as well.

    For example, *printf probably has a weak reference to the floating-point string conversion function. If you don’t use any floating-point in your code, you won’t trigger the extraction of the floating-point library, and *printf won’t be able to print floating-point numbers. But if you do use floating-point in your code, that will trigger the extraction of the appropriate object in the library, which will resolve the weak reference inside *printf.

  3. Octoploid Says:

    Either “--whole-archive” or “-u bar” should help.

  4. Diego Elio Pettenò Says:

    As Octoploid said, those two options would help. The link editor (ld) will not see bar as totally undefined, because there is only a weak reference to it (if you have weak references, you’re _supposed_ to have a valid default value). So bar won’t be used to decide that libbar.a is needed.

    And since there is no other symbol in libbar.a that foo.o requires, it’ll discard the library as unused/unrequested.

  5. mirabilos Says:

    What purpose do weak externs have, anyway?

    I only know weak on functions to allow them to be easily overridden, cf. the pthreads-related functions (where _thread_sys_open is the syscall caller and open is weak calling it, and libpthread provides a non-weak open).

  6. Diego Elio Pettenò Says:

    You can also use weak externs when you want backward compatibility with a given library without rebuilding the software.

    For instance, you might want to use a function if it’s available in the version of the library you’re running against, because it’s faster than doing the same thing “manually”, but you don’t want to require all your users to link against a newer version of that library. So you use a weak reference and check that the function’s pointer is non-zero before calling it.

    There are more tricks that relate to preloaded/interposed libraries (pthread, as you noted, falls in the latter category). Most of the weakref tricks only concern binary compatibility, though, so they are used in prebuilt software, both proprietary and not. OpenOffice and Firefox also use them as far as I can tell; LibreOffice is usually not used prebuilt.

  7. Jie Says:

    It’s just written in the ELF standard:

    The link editor does not extract archive members to resolve undefined weak.

    Btw,
    [quote]
    What we expect from linking both objects is that the weak reference is fulfilled by the existence of the bar symbol in the second object, and that in the resulting binary, the foo function calls bar.

    $ ld -shared -o test1.so foo.o bar.o

    And indeed, this is what happens.
    [/quote]
    This is not true. Since you are building a shared library, you will not know whether the reference to “bar” in “foo” uses the definition from bar.c until the dynamic loader resolves the reference at run time, even if “bar” in foo.c is a non-weak undefined symbol.

  8. glandium Says:

    Jie: well, that applies whether the symbol is weak or not.

  9. mirabilos Says:

    @Diego: Pointers to an object are always not NULL in ISO C, and GCC is known to optimise your check for that away.

    (Yes, ISO C is removed from reality, especially C1x…)

  10. Diego Elio Pettenò Says:

    Uhm I’m pretty sure I have found code working with similar checks and they were not optimised away. But it’s probably a matter of telling the compiler what you want to do there.
