Consider the following program:
#include <stdio.h>
int main(void)
{
    printf("hello world\n");
    return 0;
}
If I build it with GCC, optimizing for size and linking statically, and then strip it (which may shrink it further):
$ gcc -Os -static hello.c -o hello
$ strip hello
I get an executable of size ~695 KB.
Why is it so big? I realize it’s not just my object code, and that there are stubs and what-not, but still, that’s kind of huge.
Notes:
- OS: Devuan GNU/Linux Chimaera (~= Debian Bullseye)
- Compiler: GCC 10.2
- libc: glibc 2.31-13
- Processor architecture: x86_64
- It doesn’t improve if I build with `-O3 -flto`.
3 Answers
A partial answer: the executable's inflated size is not due to `printf`, nor to `<stdio.h>`.
Why? Because even if you compile an empty program (a `main` that does nothing but return 0), you still get the same 695 KB executable.
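
A related experiment, offered only as a sketch: the greeting can be emitted without stdio at all by calling `write(2)` directly. The function name `hello_without_stdio` is made up for illustration. Going by the empty-program observation above, I would expect a static build of this to come out at roughly the same size, but I have not measured it.

```c
#include <unistd.h>

/* Writes the greeting to fd with write(2), bypassing stdio entirely.
   Returns the number of bytes written, or -1 on error.
   (hello_without_stdio is a hypothetical name, not from the question.) */
ssize_t hello_without_stdio(int fd)
{
    static const char msg[] = "hello world\n";

    /* sizeof msg counts the terminating NUL, which we do not want to print. */
    return write(fd, msg, sizeof msg - 1);
}
```

Call it as `hello_without_stdio(STDOUT_FILENO)` from `main`, then build with the same `gcc -Os -static` and `strip` steps as in the question to compare sizes.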
Thanks @SparKot for the comment indicating this direction.
Fundamentally the issue here is that GNU libc isn’t designed to be statically linked, which means, among other things, that the developers have not spent any time on reducing the size of statically-linked binaries.
I compiled your program with `-static` and also the special compiler argument `-Wl,-Map,a.map`, which asks the linker to write out a file `a.map` (you can put any name you like after the second comma in that incantation) that explains why each object file was included in the link. These are the first few lines of that file, edited slightly for readability:
What this means is that before the linker even looked at the code of your program, while it was still processing the transitive dependencies of the function that calls `main`, it needed to pull in the code that prints assertion failure messages, and that code pulls in the code for dynamically loading and printing localized (translated into the user’s native language) error messages. It looks like the bulk of your 600K of binary executable is that code and its dependencies, including among other things all of `malloc`, all of `fprintf`, all of `iconv`, the parser for `gettext` "message object" files, …
Why did you use `-static` when compiling your code?
You compiled your code as a static executable, so the linker took every library it needs (e.g. stdio) from `libc.a`, extracting the modules that are used and placing them inside your program, instead of linking against `libc.so`. Unfortunately, `printf()` (like many of the standard functions) is a complex routine that deals with buffering of input/output, and it makes your executable far bigger when its code has to be included in your program.
If, on the other hand, you let your linker do dynamic linking, you’ll get a dynamic executable of around 16 KB and little extra loading work, as libc is practically always resident in system memory already (almost every program uses it, so it is loaded as soon as the first dynamically linked program in the system starts, e.g. `systemd` or `init`). The kernel has no need to read the standard library from disk again if it is a shared object, as it is normally already loaded when your program starts. This doesn’t happen when you link a program statically: the library code you use is included in your executable, and as none of it is shared with other processes, all of it has to be loaded into memory from the executable file, with a start-up loading penalty.
You can still get under that size, but for that you have to write it in assembler and not link the standard C library and runtime module.
The next program, in assembler, reduces to 8488 bytes. Besides, the code actually used is only the following bytes (after disassembling); the rest is there for ELF compliance (default relocation tables, etc.).
The listing is:
hello.s
That is just a 34-byte program, but the executable is still about 8 KB after stripping. This is because it is formatted as an ELF executable, which requires some extra space for the empty symbol tables and the like (some sections are padded to a full page, probably one page for the `.text` section and another for the `.data` section).
On the other hand, if you want to see what is included in the linking phase, add `-v` to the compiler call, so that it prints the command line used to call the linker, and you will see everything that is linked into the final executable.
To compile/link the above program, just do: