
Scenario:

$ cat lib.c
#include <stdio.h>
#define STR_(x) #x
#define STR(x) STR_(x)
#define CAT_(x,y) x##y
#define CAT(x,y) CAT_(x,y)

__attribute__((constructor))
void CAT(foo,)(void) { printf("foo" STR(N) " %p\n", CAT(foo,)); }
void CAT(bar,N)(void){ puts("bar" STR(N)); }

$ cat main.c
void barx(void);
void bary(void);
void barz(void);

int main(void)
{
    barx();
    bary();
    barz();
}

$ cat build_run.sh
gcc lib.c -DN=x -c -fPIC -o libx.o && gcc libx.o -o libx.so -shared &&
gcc lib.c -DN=y -c -fPIC -o liby.o && gcc liby.o -o liby.so -shared &&
gcc lib.c -DN=z -c -fPIC -o libz.o && gcc libz.o -o libz.so -shared &&
export LD_LIBRARY_PATH=.:$LD_LIBRARY_PATH &&
gcc -L. -o main main.c -lx -ly -lz &&
./main

$ bash build_run.sh
foox 0x7f0bf002e139
foox 0x7f0bf002e139
foox 0x7f0bf002e139
barx
bary
barz

Here we see that:

  1. All three .so libraries have a constructor-attributed function with the same name, foo.
  2. Function foo from library X is called 3 times (which may be unexpected behavior), instead of foo being called once from each of the libraries X, Y, Z (which may be expected behavior).

As I understand, addresses of constructor attributed functions foo are placed (directly or indirectly) in .init_array section. Hence, function names are expected to be irrelevant.

The core question: why is function foo from library X called 3 times, instead of foo being called once from each of the libraries X, Y, Z?


Extra observations:

  1. If we change in lib.c from CAT(foo,) to CAT(foo,N) and rerun the build_run.sh, then we will see:
$ bash build_run.sh
fooz 0x7fc121dcc139
fooy 0x7fc121dd1139
foox 0x7fc121dd6139
barx
bary
barz

which is expected behavior.

  2. Running the original example (i.e. with CAT(foo,)) on Cygwin results in function foo being called once from each of the libraries X, Y, Z (which may be expected behavior).

System and software info:

$ uname -a
Linux xxx 5.15.0-79-generic #86-Ubuntu SMP Mon Jul 10 16:07:21 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

$ gcc --version
gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0

2 Answers


  1. I think it is an example of symbol interposition.

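    When loading shared libraries, the dynamic linker on Linux resolves symbols (functions, variables) in a global scope by default. If two or more shared libraries define the same symbol (e.g., foo), the definition from the library that comes first in the lookup order is used for all references, including the references each library makes to its own copy. Since foo is a non-static function with default visibility, the .init_array entry in each library is presumably a relocation against the symbol foo rather than a fixed address, so all three entries end up pointing at the foo from libx.so.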

    Potential fixes:

    • I believe you can pass the -Bsymbolic option to the linker when linking each shared library (e.g. gcc -shared -Wl,-Bsymbolic), but I have never tested it myself.
    • Making the constructor functions static should also be effective.

    Cygwin uses DLLs instead, and Windows has a different mechanism. In Windows, each DLL has its own export table and imports are bound to a specific DLL, meaning symbol names are resolved independently for each DLL rather than globally across all loaded libraries.
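
    As a rough illustration of the static fix, here is a minimal sketch of what lib.c's constructor could look like. The counter ctor_calls is hypothetical and exists only so the effect can be observed; it is not part of the original code.

```c
#include <stdio.h>

/* Hypothetical counter, added only to make the constructor call observable. */
static int ctor_calls = 0;

/* `static` gives foo internal linkage, so no dynamic symbol named foo is
   exported. The constructor entry then refers to this library's own foo
   and cannot be redirected to a same-named function in another library. */
__attribute__((constructor))
static void foo(void)
{
    ctor_calls++;
    printf("foo ran, calls so far: %d\n", ctor_calls);
}
```

    With this change, each library built from the file should run its own copy of foo once at load time, assuming nothing outside the file needs to call foo by name.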

  2. "Hence, function names are expected to be irrelevant."

    No. They are still relevant. When a dynamic library is loaded, the dynamic linker resolves each symbol reference to an actual address. If two libraries define a function with the same name, the function from the library that comes first in the lookup order is used, and the function from the later library is ignored.

    This leads to the behavior you observed. The first library loads, its reference to foo is resolved, and foo runs. When the second library loads, the dynamic linker sees that foo is already defined in this process, binds the second library's constructor entry to that same foo, and runs it again.

    Since you intend foo to be a constructor, simply adding static to the definition of foo will solve the issue. In this case, the function will not be visible outside the dynamic library and will therefore be exempt from the name clash between the libraries.

    Of course, you will lose the ability to call foo() from outside lib.c. If that is still needed, then you must use different names for the constructor functions (which you do by using CAT(foo,N)).
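
    If the constructor has to stay callable from outside, the distinct-name variant can be sketched as follows. The default value for N and the counter init_count are assumptions added for illustration, so that the file builds without -DN=... and the calls can be counted.

```c
#include <stdio.h>

#ifndef N
#define N x   /* assumption: default suffix when -DN=... is not given */
#endif

#define STR_(x) #x
#define STR(x) STR_(x)
#define CAT_(x,y) x##y
#define CAT(x,y) CAT_(x,y)

/* Hypothetical counter, used only to observe how often the function runs. */
static int init_count = 0;

/* The exported name becomes foox, fooy or fooz depending on N, so the
   libraries no longer define the same global symbol. */
__attribute__((constructor))
void CAT(foo,N)(void)
{
    init_count++;
    printf("foo" STR(N) " %p\n", (void *)CAT(foo,N));
}
```

    Each library then exports its own uniquely named constructor, which matches the CAT(foo,N) run shown in the question.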
