Sunday, 19 December 2021

Mt Pirongia 2021

On Friday I took advantage of the Auckland border having opened (on Wednesday) to travel down to Mt Pirongia and tramp to the summit, staying at Pahautea Hut overnight and then walking out again on Saturday (yesterday). This is the second time I've done Mt Pirongia (the last one was April 2016). It was intense but pretty great!

IIRC last time we took the shortest route — Corcoran Rd end, up Tirohanga Track to Ruapane peak and then along the track to Pirongia summit, returning via the same route. This time we did a loop from the Grey Road end: taking the link track to Ruapane Track, then joining the Tirohanga track to Pirongia summit, then back down Mahaukura Track to the car park via Mahaukura and Wharauroa peaks. It's definitely longer this way but you see and do more.

Pirongia is extremely rugged and the tracks reflect this. There aren't many steps or boardwalk sections and the tracks stick to the ridgelines, and there are many peaks along those ridges (the remains of many volcanic cores), so you're constantly scrambling up and down steep slopes with the aid of rocks and roots. Where the rock faces are nontrivial, chains have been installed to help with the climbing. Pirongia gets a lot of rainfall and its soils don't drain well so there's a lot of mud along the way. Although the tracks are quite short horizontally, they're hard going. Good fitness, good boots, and determination are all pretty important here. But Pirongia isn't huge so you should get there in the end.

On Friday night the hut was not very full — twenty bunks but only seven people there, me and my three friends and another group, three young women. We got talking to their leader, who told us about her extensive tramping experience and an upcoming 10-day tramp around the infamous Northwest Circuit on Stewart Island that she was organising. Later she mentioned she's 17 years old. I was a bit flabbergasted to be honest. Good for her, and well done to her parents!

There's a real shortage of tramping huts in the Auckland region. Within a two-hour drive there's really only Pahautea at Pirongia and Crosbie's/Pinnacles in the Kauaeranga Valley in Coromandel, as far as I know. The latter are super busy and Pirongia is just a bit too hard for inexperienced trampers. But if you are fit and at least a little bit experienced, it's a good option.

Saturday, 18 December 2021

Do We Really Need A Link Step?

mold looks pretty cool, and a faster drop-in ld replacement is obviously extremely useful. But having no link step at all would be even faster. Why do native-code compilers write out temporary object files which then have to be linked together, anyway? Could we stop doing that and have compilers emit compiled translation units directly into a final executable file that the OS can execute directly --- a "zero-link" approach? I think we could ... in many cases.

The basic idea is to treat the final executable file (an ELF file, say) as a mutable data structure. When the compiler would emit an object file it instead allocates space in that executable file, using a memory allocator shared by all compiler instances, and writes the object code directly into that space. To make this tractable we'll assume we aren't going to generate optimal code in size or speed; we're going to build an executable that runs "pretty fast", for testing purposes (manual or automated).
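Here's a rough sketch of the allocation idea in Python, with a `bytearray` standing in for the executable file. Everything here (`ImageAllocator`, `emit_unit`, the first-fit policy) is made up for illustration; a real implementation would work on an mmap'd ELF file and would have to coordinate between compiler processes, which this ignores.

```python
class ImageAllocator:
    """First-fit allocator over a byte buffer standing in for the executable."""

    def __init__(self, size):
        self.image = bytearray(size)
        self.free = [(0, size)]  # sorted list of (offset, length) holes

    def alloc(self, length):
        # Take the first hole that is big enough.
        for i, (off, hole) in enumerate(self.free):
            if hole >= length:
                if hole == length:
                    del self.free[i]
                else:
                    self.free[i] = (off + length, hole - length)
                return off
        raise MemoryError("image full")

    def release(self, off, length):
        # Return a range to the free list and merge adjacent holes.
        self.free.append((off, length))
        self.free.sort()
        merged = []
        for o, n in self.free:
            if merged and merged[-1][0] + merged[-1][1] == o:
                merged[-1] = (merged[-1][0], merged[-1][1] + n)
            else:
                merged.append((o, n))
        self.free = merged

    def write(self, off, data):
        self.image[off:off + len(data)] = data


def emit_unit(allocator, code):
    """What the compiler would do instead of writing an object file."""
    off = allocator.alloc(len(code))
    allocator.write(off, code)
    return off
```

The `release` method is what makes the image mutable rather than append-only: space from a stale translation unit can be handed back and reused.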

An obvious problem is how to handle symbol resolution, i.e. what happens when the compiler emits code for a translation unit that uses symbol A from some other unit --- especially if that other unit hasn't been compiled yet? Here's an option for function symbols: when A is used for the first time, write a stub for A to the final binary and call that. When a definition for A is seen, patch the stub to jump to the definition. Internally this will mean maintaining a parallel hashtable of all undefined symbols that all compiler instances can share efficiently.
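To make the stub idea concrete, here's a toy model in Python. It's a sketch under heavy assumptions: a stub is modelled as an 8-byte slot holding a target address rather than a real jump instruction, the `ZeroLinkImage` class and its methods are hypothetical names, and a `threading.RLock` stands in for whatever cross-process sharing mechanism the compilers would really use.

```python
import struct
import threading

STUB_SIZE = 8                     # hypothetical: one 8-byte address slot per stub
UNRESOLVED = 0xFFFFFFFFFFFFFFFF   # sentinel for "no definition seen yet"


class ZeroLinkImage:
    def __init__(self, size):
        self.image = bytearray(size)
        self.next_free = 0
        self.stubs = {}                # symbol name -> stub offset
        self.lock = threading.RLock()  # stand-in for cross-process sharing

    def _alloc(self, length):
        # Simple bump allocation; enough for this sketch.
        off = self.next_free
        if off + length > len(self.image):
            raise MemoryError("image full")
        self.next_free += length
        return off

    def stub_for(self, name):
        """Return the stub for `name`, creating an unresolved one on first use."""
        with self.lock:
            if name not in self.stubs:
                off = self._alloc(STUB_SIZE)
                struct.pack_into("<Q", self.image, off, UNRESOLVED)
                self.stubs[name] = off
            return self.stubs[name]

    def define(self, name, code):
        """Write the definition of `name`, then patch its stub to point at it."""
        with self.lock:
            addr = self._alloc(len(code))
            self.image[addr:addr + len(code)] = code
            struct.pack_into("<Q", self.image, self.stub_for(name), addr)
            return addr

    def target_of(self, name):
        """Read back where the stub for `name` currently points."""
        return struct.unpack_from("<Q", self.image, self.stubs[name])[0]
```

Callers always go through `stub_for`, so it doesn't matter whether the use or the definition is compiled first; the stub offset never changes, only the address stored in it.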

For data symbols, instead of using a stub, we can emit a pointer that can be patched to point to the data definition. For complex constants, we might need to defer initialization until load time or emit code to initialize them lazily.
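The lazy-initialization option could look something like the guarded initialization below, which is similar in spirit to how C++ initializes function-local statics. This is only a Python illustration of the control flow; `LazyConstant` is a hypothetical name, and a compiler would emit the equivalent check inline at each use.

```python
import threading


class LazyConstant:
    """A constant whose initializer runs once, on first access."""

    def __init__(self, init):
        self._init = init
        self._value = None
        self._ready = False
        self._lock = threading.Lock()

    def get(self):
        # Fast path: already initialized, no locking needed.
        if not self._ready:
            with self._lock:
                # Re-check under the lock so the initializer runs only once.
                if not self._ready:
                    self._value = self._init()
                    self._ready = True
        return self._value
```

The cost is a check on every access, which fits the "pretty fast, not optimal" goal of the design.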

To challenge the design a bit more, let's think about why object files are useful.

Sometimes compilers emit object files for a project which are then linked into multiple different output binaries. True, but it's more efficient to link them once into a single shared library which is then loaded by each of those output binaries, so projects should just do that.

Compilers use object files for incremental compilation: when a translation unit hasn't changed, its object file can be reused. We can capture the same benefits with the zero-link approach: reuse the final executable and keep around its symbol hashtable; when an object file changes, release the object file's space in the final executable, and allocate new space for the new object file.
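A minimal sketch of that incremental scheme, in Python: keep a record per translation unit, skip units whose contents are unchanged, and otherwise release the old space and allocate new space. The names (`IncrementalImage`, `update`) are hypothetical, hashing the emitted bytes is a stand-in for however a real build system detects changes, and repatching stubs that pointed at the old location is left out.

```python
import hashlib


class IncrementalImage:
    def __init__(self):
        self.image = bytearray()
        self.units = {}   # unit name -> (digest, offset, length)
        self.holes = []   # released (offset, length) ranges available for reuse

    def _alloc(self, length):
        # Reuse a released range if one fits, otherwise grow the image.
        for i, (off, size) in enumerate(self.holes):
            if size >= length:
                if size == length:
                    del self.holes[i]
                else:
                    self.holes[i] = (off + length, size - length)
                return off
        off = len(self.image)
        self.image.extend(bytes(length))
        return off

    def update(self, name, code):
        """Place `code` for unit `name`; return False if nothing changed."""
        digest = hashlib.sha256(code).hexdigest()
        old = self.units.get(name)
        if old and old[0] == digest:
            return False                      # unchanged: keep existing bytes
        if old:
            self.holes.append((old[1], old[2]))  # release the stale space
        off = self._alloc(len(code))
        self.image[off:off + len(code)] = code
        self.units[name] = (digest, off, len(code))
        return True
```

Note that released bytes are not zeroed, so stale code lingers in the holes until something overwrites it; that's harmless as long as nothing still jumps there.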

You can combine multiple object files into a static library and the linker will select the object files that are needed to satisfy undefined symbols. In many projects this feature is only used for "system libraries" --- a project's build system should avoid building project object files that will not be used in the final link. System libraries are usually dynamically linked for sharing reasons. When we really need to subset static libraries, we could link objects from those libraries into our final executable on-demand when we first see them being used.
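The on-demand part is basically a worklist over undefined symbols. Here's a small Python sketch, assuming the archive has already been indexed as a mapping from member name to the symbols it defines and references; the function name and the archive layout are invented for the example.

```python
def pull_from_archive(archive, roots):
    """Return the archive members needed to resolve `roots`, in pull order.

    archive: member name -> (defined symbols, referenced symbols)
    roots:   symbols the project's own code was seen using
    """
    # Index which member provides each symbol.
    provider = {
        sym: member
        for member, (defs, _refs) in archive.items()
        for sym in defs
    }
    linked = []
    seen = set()
    pending = list(roots)
    while pending:
        sym = pending.pop()
        member = provider.get(sym)
        if member is None or member in seen:
            continue  # not in this archive, or already pulled in
        seen.add(member)
        linked.append(member)
        # The member's own references may need further members.
        pending.extend(archive[member][1])
    return linked
```

For example, with a hypothetical archive where `printf.o` references `vfprintf` and `vfprintf.o` references `write`, using `printf` pulls in those three members and leaves an unrelated `qsort.o` out.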

Another issue is debuginfo (particularly important to me!). Supporting debuginfo would require extending DWARF5 debuginfo sections to allow their data to be scattered over the final executable.

There are lots of unresolved questions, enough that I wouldn't bet money on this actually being practical. But I think it's worth questioning the assumption that we have to have a link step for native binaries.

Update: Zig can do something like this.