Wednesday 10 April 2019
Mysteriously Low Hanging Fruit: A Big Improvement To LLD For Rust Debug Builds
LLD is generally much faster than the GNU ld.bfd and ld.gold linkers, so you would think it has been pretty well optimised. You might then be surprised to discover that a 36-line patch dramatically speeds up linking of Rust debug builds, while also shrinking the generated binaries dramatically, both in simple examples and large real-world projects.
The basic issue is that the modern approach to eliminating unused functions from linked libraries, --gc-sections, is not generally able to remove the DWARF debug info associated with the eliminated functions. With --gc-sections the compiler puts each function in its own independently linkable ELF section, and then the linker is responsible for selecting only the "reachable" functions to be linked into the final executable and discarding the rest. However, compilers are still putting the DWARF debug info into a single section per compilation unit, and linkers mostly treat debug info sections as indivisible black boxes, so those sections get copied into the final executable even if the functions they're providing debug info for have been discarded. My patch tackles the simplest case: when a compilation unit has had all its functions and data discarded, discard the debug info sections for that unit. Debug info could be shrunk a lot more if the linker was able to rewrite the DWARF sections to discard info for a subset of the functions in a compilation unit, but that would be a lot more work to implement (and would potentially involve performance tradeoffs). Even so, the results of my patch are good: for Pernosco, our "dist" binaries with debug info shrink from 2.9GB to 2.0GB.
Not only was the patch small, it was also pretty easy to implement. I went from never having looked at LLD to working code in an afternoon. So an interesting question is, why wasn't this done years ago? I can think of a few contributing reasons:
People just expect binaries with debug info to be bloated, and because they're only used for debugging, except for a few people working on Linux distros, it's not worth spending much effort trying to shrink them.
C/C++ libraries that expect to be statically linked, especially common ones like glibc, don't rely on --gc-sections to discard unused functions. Instead, they split the library into many small compilation units, ideally one per independently usable function. This is extra work for library developers, but it solves the debug info problem. Rust developers don't (and really, can't) do this because rustc splits crates into compilation units in a way that isn't under the control of the developer. Less work for developers is good, so I don't think Rust should change this; tools need to keep up.
Big companies that contribute to LLD, with big projects that statically link third-party libraries, often "vendor" those libraries, copying the library source code into their big project and building it as part of that project. As part of that process, they would usually tweak the library to only build the parts their project uses, avoiding the problem.
There has been tension in the LLD community between doing the simple thing I did and doing something more difficult and complex involving DWARF rewriting, which would have greater returns. Perhaps my patch submission to some extent forced the issue.