Sunday, 28 October 2018

Auckland Half Marathon 2018

This morning I ran the Auckland Half Marathon for the sixth year in a row, and for the fifth year barefoot. I got my best time ever: official time 1:45:07, net time 1:44:35. I had to push hard, and I've been sore all day! Glad to get under 1:45 net.

Last year I applied climbing tape to the balls of my feet to stop the skin wearing through. Unfortunately I lost that tape and forgot to get more, so I tried some small patches of duct tape instead, but they came off very early in the race. Nevertheless my feet held up pretty well, perhaps because I've been running more consistently during the year, and perhaps because my feet are gradually toughening up year after year. The road was wet today because it rained overnight, but I can't tell whether that makes things better or worse for my feet.

Thursday, 25 October 2018

Problems Scaling A Large Multi-Crate Rust Project

We have 85K lines of Rust code implementing the backend of our Pernosco debugger. To impose some modularity constraints and to reduce build times, from the beginning we organized our code as a large set of crates in a single Cargo workspace in a single Gitlab repository. Currently we have 48 crates. This has mostly worked pretty well, but as the number of crates keeps increasing we have hit some serious scalability problems.

The most fundamental issue is that many crates build one or more executables — e.g. command-line tools to work with data managed by the crate — and most crates also build an executable containing tests (per standard Rust conventions). Each of these executables is statically linked, and since each crate on average depends on many other crates (both our own and third-party), the total size of the executables grows at roughly the square of the number of crates. The problem is especially acute for debug builds with full debuginfo, which are about five times larger than our release builds; the latter use debug=1, i.e. optimized code with just enough debuginfo for stack traces to include inlined functions. To be concrete, our 85K-line project builds 4.2G of executables in a debug build, and 750M in a release build. There are 20 command-line tools and 81 test executables, of which 50 actually run no tests (those no-test executables are small, only about 5.7M each).
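To spell out why that's quadratic (a back-of-the-envelope model of the situation described above, not a measurement): the number of executables E and the average amount of statically linked code per executable s̄ both grow roughly linearly with the number of crates n, so

```latex
\text{total size} \;\approx\; E \cdot \bar{s},
\qquad E \propto n,\ \bar{s} \propto n
\;\Longrightarrow\; \text{total size} = O(n^2)
```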

The large size of these executables slows down builds and creates problems for our Gitlab CI, because the executables have to be copied over the network between the build and test phases. But I don't know what to do about the problem.

We could limit the number of test executables by moving all our integration tests into a single executable in some super-crate ... though that would slow down incremental builds because that super-crate would need to be rebuilt a lot, and it would be large.
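For instance (a sketch only — the crate names are hypothetical and I haven't tried this), the super-crate could be a leaf crate that dev-depends on every workspace crate and declares a single integration-test binary:

```toml
# Hypothetical "all-tests" super-crate; all names below are made up.
[package]
name = "all-tests"
version = "0.1.0"

# Dev-depend on every workspace crate whose integration tests move here.
[dev-dependencies]
crate-a = { path = "../crate-a" }
crate-b = { path = "../crate-b" }
# ... one entry per crate ...

# One integration-test binary instead of one per crate.
[[test]]
name = "integration"
path = "tests/integration.rs"
```

The incremental-build penalty is clear from the manifest: touching any dependency forces this one big test binary to relink.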

We could limit the number of command-line tools in a similar way — combine all the tool executables into a single super-tool crate that uses the "Swiss Army knife" approach, deciding what its behavior should be by examining argv[0]. Again, this would penalize incremental builds.
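A minimal sketch of that dispatch, with hypothetical tool names; the individual tools would be installed as symlinks (or hard links) to the one binary, busybox-style:

```rust
use std::env;
use std::path::Path;
use std::process;

// Hypothetical entry points: each former stand-alone tool becomes a
// function (or a call into the relevant crate) in this one binary.
fn dump_trace(_args: &[String]) { /* ... */ }
fn query_db(_args: &[String]) { /* ... */ }

fn main() {
    let args: Vec<String> = env::args().collect();
    // Dispatch on the name we were invoked under.
    let tool = Path::new(&args[0])
        .file_name()
        .and_then(|name| name.to_str())
        .unwrap_or("");
    match tool {
        "dump-trace" => dump_trace(&args[1..]),
        "query-db" => query_db(&args[1..]),
        other => {
            eprintln!("unknown tool name: {}", other);
            process::exit(1);
        }
    }
}
```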

Cargo supports a kind of dynamic linking with its dylib option, but I'm not sure how to get that to work. Maybe we could create a super-crate that reexports every single crate in our workspace, attach all tests and binary tools to that crate, and ask Cargo to link the crate as a dynamic library, so that all the tests and tools link against that one library. This would also hurt incremental builds, but maybe not as much as the above approaches. Then again, I don't know if it would actually work.
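My guess at what that might look like (untested, and all crate names here are hypothetical):

```toml
# Hypothetical umbrella crate linked as a Rust dylib. Every test and
# tool would depend on this crate alone, and its src/lib.rs would just
# reexport the workspace crates ("pub extern crate crate_a;" etc.).
[package]
name = "everything"
version = "0.1.0"

[lib]
crate-type = ["dylib"]

[dependencies]
crate-a = { path = "../crate-a" }
crate-b = { path = "../crate-b" }
# ... one entry per workspace crate ...
```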

Another option would be to break up the project into separate independently built subprojects, but that creates a lot of friction.

Another possibility is that we should simply use fewer, bigger crates. This is probably more viable than it was a couple of years ago, when we didn't have incremental rustc compilation.

I wonder if anyone has hit this problem before, and has tried the above solutions or come up with others.

Wednesday, 24 October 2018

Harmful Clickbait Headline About IT Automation

Over the years a number of parents have asked me whether they should steer their children away from IT/programming careers in case those jobs are automated away in the future. I tell them that the nature of programming work will continue to change but programming will be one of the last jobs to be fully automated; if and when computers don't need to be programmed by people, they will be capable of almost anything. (It's not clear to me why some people are led to believe programming is particularly prone to automation; maybe because they see it as faceless?)

Therefore I was annoyed by the headline in the NZ Herald this morning: "Is AI about to unseat our programmers?" Obviously it's deliberate clickbait, and it wasn't chosen by Juha Saarinen, whose article is actually quite good. But I'm sure some of the parents I've talked to, and many others like them whom I'll never know, will have their fears reinforced by this headline, and perhaps some people will be turned away from programming careers, to their detriment and the detriment of New Zealand's tech industry.

Monday, 22 October 2018

The Fine Line Between Being A Good Parent And A Bad Parent

Two incidents will illustrate.

Early 2013: I take my quite-young kids to hike to the summit of Mt Taranaki. The ascent is more grueling than I expected; there's a long scree slope where it's two steps forward, one step back. My kids start complaining, then crying. I have to decide whether to turn back or cajole them onward. There are no safety issues (the weather is perfect and it's still early in the day), but the stakes feel high: if I keep pushing them forward and we eventually fail, I will have made them miserable for no good reason, and no-one likes a parent who bullies their kids. I roll the dice and press on. We make it! After the hike we all feel it was a great achievement, and the kids agree we did the right thing to carry on.

Two weeks ago: I take my kids and a couple of adult international students from our church on an overnight hiking trip to the Coromandel Peninsula. On Friday we hike for four hours to Crosbies Hut and stay there overnight. It's wonderful — we arrive at the hut around sunset in glorious weather, eat a good meal, and the night sky is awesome. The next day I return to our starting point, pick up the car, and drive around to Whangaiterenga campsite in the Kauaeranga Valley so my kids and our guests can descend into the valley by a different route, one that crosses Whangaiterenga stream a few times. I had called the visitor centre on Friday to confirm that the track is open and the stream crossings are easy. My kids are now quite experienced (though our guests aren't) and should be able to handle this on their own easily. I get to the pickup point ahead of schedule, but two hours after I expected them to arrive, they still haven't :-(.

To cut a long story short: at that point I get a text message from them, and after some communication they eventually walk out five hours late. They were unable to pick up the trail after the first stream crossing (maybe it was washed out) and had to walk downstream for hours, taking a detour up a hill at one point to get phone reception briefly. The kids made good decisions and gained a lot of confidence from handling an unexpected situation on their own.

What bothers me is that both of these situations could easily have turned out differently. In neither case would there have been any real harm — the weather in Coromandel was excellent and an unexpected night in the bush would have been perfectly safe given the gear they were carrying (if indeed they weren't found before dark). Nevertheless I can see that my decisions could have looked bad in hindsight. If we make a habit of taking these kinds of small risks — and I think we should! — then not all of them are going to pay off. I think, therefore, we should be forgiving of parents who take reasonable risks even if they go awry.

Tuesday, 2 October 2018

The Costs Of Programming Language Fragmentation

People keep inventing new programming languages. I'm surprised by how many brand-new languages are adopted by more than just their creators, despite the network effects that would seem to discourage such adoption. Good! Innovation and progress in programming languages depend on such adoption. However, let's not forget that fragmentation of programming languages reduces the sum of those beneficial network effects.

One example is library ecosystems. Every new language needs a set of libraries for commonly used functionality. Some of those libraries can be bindings to existing libraries in other languages, but it's common for new languages to trigger reimplementation of, e.g., container data structures, HTTP clients, and random number generators. If the new language did not exist, that effort could have been spent on improving existing libraries or some other useful endeavour.

Another example is community support. Every new language needs an online community (IRC, StackOverflow, etc.) for developers to help one another with questions. Fragmenting users across communities makes it harder for people to find answers.

Obviously, the effort needed to implement and maintain the languages and runtimes themselves represents a cost, since focusing that effort on a smaller number of languages would normally produce better results.

I understand the appeal of creating new programming languages from scratch; like other green-field development, the lure of freedom from other people's decisions is hard to resist. I understand that people's time is their own to spend. However, I hope people consider carefully the social costs of creating a new programming language, especially if it becomes popular, and understand that in some cases creating a popular new language could actually be irresponsible.