WHIZARD: Build and link time comparison
BFD, Gold or LLD
A few days ago, I was browsing my favorite (hacker) news site and, by chance, stumbled upon a linker project that looked interesting at first glance: https://github.com/rui314/mold. It made me curious about the other linkers in production use:
- ld.bfd and ld.gold are part of the GNU binutils,
- ld.lld, provided by LLVM, is currently the fastest of them.
ld.bfd is the default linker of the GNU toolchain; ld.gold was meant to be its successor.[1]
However, it seems that GNU gold has lost its favored position to LLVM's ld.lld.
The history of these linkers and how they relate to each other is interesting, but beyond the scope of this post.
I wanted to assess how the build time of WHIZARD differs between the three linkers: the default (bfd), gold and lld.
That said, I would not expect a large difference between the build times, as WHIZARD is mostly written in modern Fortran.
I use hyperfine as the benchmarking tool because it provides a neat set of features, e.g. statistical analysis and commands that run before and after the benchmarked command.
To account for possible interference between the benchmarks and my own activity (I run the benchmarks while working on the same laptop), I scan over the number of physical cores of my laptop.
In principle, if the linkers differ in link time for the WHIZARD build, that difference should be (mostly) independent of the number of cores used for the build.
GCC selects a different linker via -fuse-ld={bfd|gold|lld}, see https://gcc.gnu.org/onlinedocs/gcc-4.9.1/gcc/Optimize-Options.html#Optimize-Options.
Thus, we need to pass the flag through WHIZARD's build system, Autotools.
And that is a problem.
[Longish paragraph why I failed to invoke GCC with above flag.]
The upshot of the (abbreviated) paragraph above: Autotools and libtool perform quite a lot of magic to determine the right compile and link flags, i.e. library and linker flags.
In particular, both need to understand how to interpret -fuse-ld and forward it to the link invocation of the compiler.
I did a quick search of the Autotools documentation and also peeked at WHIZARD's m4/libtool.m4, and found that configure tries to guess whether the system provides the GNU BFD linker (or gold) and sets the variable LD accordingly.
I tried different combinations of GCC flags, the LD environment variable and configure flags.
A short list of my trials:
- Adding -fuse-ld=lld to CFLAGS
- Adding -fuse-ld=lld to CFLAGS and --with-gnu-ld
- Adding -fuse-ld=lld and LD=ld.lld
- Adding -fuse-ld=lld, --with-gnu-ld and LD=ld.lld
- Adding -fuse-ld=lld to CFLAGS and FCFLAGS
- …
In the end, libtool did not recognize the -fuse-ld=lld option in any of my trials (it works with bfd and gold, apparently?).
I am not sure, though, whether libtool itself is at fault or libtool.m4, which Autotools uses to generate the correct invocation of libtool.
The -fuse-ld=lld flag is correctly inserted (via $FCFLAGS), but, again, libtool does not forward it to the link invocation of gfortran.
    /bin/sh ../../libtool --tag=FC --mode=link gfortran -O0 -g -fuse-ld=lld -o libcirce1.la -rpath /home/sbrass/whizard/master-debug/install/lib circe1.lo -ldl
    libtool: link: gfortran -shared -fPIC .libs/circe1.o -ldl -O0 -g -Wl,-soname -Wl,libcirce1.so.0 -o .libs/libcirce1.so.0.0.0
The LLD developers are geniuses - actually, they have the same problem[2] - and suggest either using the -fuse-ld flag (as I did) or making ld a symbolic link to ld.lld.
And - this is not a joke - it works. It ain't stupid, if it works.
We can verify that everything works as expected with readelf --string-dump .comment <file>.[3]
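The check can be scripted. The following is a minimal sketch, assuming that LLD leaves a string of the form "Linker: LLD <version>" in the .comment section of its output; the helper names comment_mentions_lld and linked_with_lld are made up for this illustration.

```shell
#!/usr/bin/env bash
# Sketch: detect whether an ELF file was linked by LLD.
# Assumption: LLD writes "Linker: LLD <version>" into the .comment section.
# Both function names are hypothetical helpers, not part of any tool.

# Reads the output of `readelf --string-dump .comment <file>` on stdin and
# succeeds (exit status 0) if it contains LLD's marker string.
comment_mentions_lld() {
    grep -q 'Linker: LLD'
}

# Convenience wrapper for a real file; requires readelf from binutils.
linked_with_lld() {
    readelf --string-dump .comment "$1" 2>/dev/null | comment_mentions_lld
}
```

For the library from the libtool output above, the call would be `linked_with_lld .libs/libcirce1.so.0.0.0 && echo "linked with lld"`. A negative result only says that the marker is missing; it does not tell bfd and gold apart.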
With this newfound knowledge, I glued together a short bash script that symlinks whichever linker I want to use as ld in a temporary directory and prepends that directory to PATH.
    help() {
        ## Benchmark with hyperfine a complete WHIZARD make build excluding configure on a clean directory structure.
        sed -n 's/^\s*##//p' "${BASH_SOURCE[0]}" >&2
        exit 1
    }

    USE_LD="bfd"
    ## Arguments:
    # https://wiki.bash-hackers.org/howto/getopts_tutorial
    while getopts ":hl:" opt; do
        case $opt in
            ## -l :: choice of linker [default: bfd] (see below)
            h) help ;;
            ## -h :: print help message and exit
            l) USE_LD="${OPTARG}" ;;
            \?) echo "Invalid option: -$OPTARG." >&2
                exit 1 ;;
            :) echo "Option -$OPTARG requires an argument." >&2
               exit 1 ;;
        esac
    done

    ## Valid options for -l:
    case ${USE_LD} in
        ## - bfd :: default GNU linker
        "bfd") LD=ld ;;
        ## - gold :: alternative GNU gold linker
        "gold") LD=ld.gold ;;
        ## - lld :: LLVM's linker
        "lld") LD=ld.lld ;;
        *) if test -z "${USE_LD}"; then
               echo "No linker option chosen." >&2
           else
               echo "Invalid option ${USE_LD}." >&2
           fi
           exit 1 ;;
    esac

    TEMP_PATH="$(mktemp -d)"
    LD_PATH="$(which ${LD})"
    (set -x
     ln -s "${LD_PATH}" "${TEMP_PATH}/ld"
    )
    PATH="${TEMP_PATH}:${PATH}"

    hyperfine \
        --export-markdown "../bench-${USE_LD}.md" \
        --prepare '../development-master/configure FCFLAGS="-O0 -g"' \
        --cleanup 'make distclean' \
        --runs 3 \
        --parameter-scan threads 1 4 'make -j {threads}'

    rm -rf "${TEMP_PATH}"
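The symlink trick at the heart of the script can also be isolated into a small helper. This is only a sketch under my own assumptions: with_linker is a hypothetical name, and unlike the script above it restricts the PATH change to a subshell and removes the temporary directory even when the command fails.

```shell
#!/usr/bin/env bash
# Sketch: run a command with `ld` resolving to a chosen linker binary.
# with_linker is a hypothetical helper name, not part of any tool.
# Usage: with_linker /path/to/linker command [args...]
with_linker() {
    local linker_path="$1"
    shift
    local shim_dir
    shim_dir="$(mktemp -d)" || return 1
    # The shim directory contains a single entry: ld -> chosen linker.
    if ! ln -s "$linker_path" "$shim_dir/ld"; then
        rm -rf "$shim_dir"
        return 1
    fi
    # Subshell: the modified PATH is visible to the command only.
    ( PATH="$shim_dir:$PATH"; "$@" )
    local status=$?
    rm -rf "$shim_dir"
    return "$status"
}
```

A build with lld would then read `with_linker "$(command -v ld.lld)" make -j 4`. Scoping the PATH change to a subshell means the surrounding shell keeps its original linker afterwards.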
Within the uncertainties reported by hyperfine, neither of the other linkers offers an advantage in build time over GNU's default linker.
Command | Mean / s | Min / s | Max / s | Relative |
---|---|---|---|---|
make -j 1 | 298.543 ± 4.714 | 293.333 | 302.514 | 1.08 ± 0.30 |
make -j 2 | 292.847 ± 89.613 | 240.676 | 396.322 | 1.06 ± 0.44 |
make -j 3 | 276.186 ± 77.157 | 228.825 | 365.218 | 1.00 |
make -j 4 | 278.080 ± 73.908 | 234.997 | 363.420 | 1.01 ± 0.39 |

Command | Mean / s | Min / s | Max / s | Relative |
---|---|---|---|---|
make -j 1 | 541.570 ± 122.198 | 464.746 | 682.480 | 1.93 ± 0.63 |
make -j 2 | 294.014 ± 83.005 | 242.146 | 389.748 | 1.05 ± 0.38 |
make -j 3 | 286.866 ± 82.224 | 239.011 | 381.810 | 1.02 ± 0.38 |
make -j 4 | 280.934 ± 65.941 | 233.332 | 356.200 | 1.00 |

Command | Mean / s | Min / s | Max / s | Relative |
---|---|---|---|---|
make -j 1 | 498.509 ± 142.891 | 401.433 | 662.589 | 1.82 ± 0.70 |
make -j 2 | 412.006 ± 124.456 | 332.525 | 555.435 | 1.50 ± 0.60 |
make -j 3 | 278.392 ± 80.875 | 230.958 | 371.774 | 1.02 ± 0.39 |
make -j 4 | 273.788 ± 70.320 | 232.909 | 354.986 | 1.00 |
Bear with me: the results above lack a meaningful rounding of the uncertainties, since hyperfine does not directly support it.
However, hyperfine can export the results as a JSON file.
We could therefore parse the JSON file with Python and process the numbers with the uncertainties module to automate the rounding.
However, these numbers are not really meaningful: I did not use a clean benchmark system, so both the numbers and my conclusion should be taken with a huge grain of salt.
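To illustrate the rounding mentioned above without leaving the shell, here is a rough sketch that uses awk instead of the Python uncertainties module. It assumes one fixed convention, namely two significant digits for the uncertainty and the same decimal place for the mean, which is not necessarily the rule the uncertainties module applies. The name round_measurement is made up, and the sketch assumes positive values.

```shell
#!/usr/bin/env bash
# Sketch: round "mean ± error" to two significant digits of the error.
# round_measurement is a hypothetical helper; it assumes positive values.
# Usage: round_measurement MEAN ERROR
round_measurement() {
    LC_ALL=C awk -v mean="$1" -v err="$2" 'BEGIN {
        if (err <= 0) { print mean; exit }
        # Decimal exponent of the leading digit of err: floor(log10(err)).
        x = log(err) / log(10)
        e = int(x)
        if (e > x) e--
        # Number of decimals that keeps two significant digits of err.
        d = 1 - e
        if (d >= 0) {
            fmt = "%." d "f ± %." d "f\n"
            printf fmt, mean, err
        } else {
            # Error of 100 or more: round both values to a multiple of 10^(-d).
            scale = 10 ^ (-d)
            printf "%d ± %d\n", int(mean / scale + 0.5) * scale, int(err / scale + 0.5) * scale
        }
    }'
}

round_measurement 298.543 4.714   # prints "298.5 ± 4.7"
```

With a JSON export from hyperfine (option --export-json; the file name bench.json is hypothetical) and jq installed, the mean and stddev fields could be fed in with `jq -r '.results[] | "\(.mean) \(.stddev)"' bench.json | while read -r m s; do round_measurement "$m" "$s"; done`.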
Footnotes:
[1] Gold (linker), https://en.wikipedia.org/w/index.php?title=Gold_(linker)&oldid=1005327625 (last visited Feb. 24, 2021).