Tuesday, 29 August 2017

Fedora/Ubuntu Kernels Work With rr Again

Ubuntu finally released a kernel update (4.4.0-93) that fixes the regression that broke rr. It took a month after the regression was fixed upstream. The slow update cycle has been frustrating, and it's a bit worrying: I hope security fixes are treated with more urgency!

Fedora updated F25 last week (updating to a 4.12 kernel) and F26 was fixed earlier, so I think we're mostly out of the woods on this issue. I will prepare an rr 4.6.0 release.

Sunday, 27 August 2017

Igloos Are Hard

This weekend I had a bit of an adventure. My friend wanted to try making an igloo using igloo-making tools he had obtained, so we drove down to Mount Ruapehu early Saturday morning (four hours) and caught a shuttle up to the bottom of the Whakapapa ski area, then hiked through the snow for a couple of hours to reach his preferred site near the NZ Alpine Club's Ruapehu Hut, near Knoll Ridge, at an altitude of around 2040m. It's a wonderful spot, completely exposed on a rock ridge with great views of the surrounding area.

For various reasons, including congestion further down the mountain and a visit to the Knoll Ridge Cafe about 20 minutes walk from our site, we didn't start working on the igloo until about 4:30pm. A couple of hours of hard work later, we had completed at most one-third of the igloo and it was dark, getting colder, and I, for one, was exhausted. So we decided to abandon completing our igloo and just sleep out in the open in our bivvy bags, inside the ring we'd built. (A bivvy bag is a kind of tiny tent that's waterproof and insulating, just big enough to sleep in.) We boiled some water, had some hot drinks and rehydrated food, and went to bed around 8pm.

During the night light rain fell for a while, and later of course it got really cold. I was wearing a lot of clothing inside my sleeping bag and I still wasn't quite comfortably warm, and it was hard to find a comfortable sleeping position, and the bivvy bag was (by design, I think) quite stuffy, making breathing a bit more laboured than I'm used to. I did not sleep very well. The wind stayed light, which helped, but whatever transpired we would never have been in any actual danger, since the Alpine Club's hut was only about 30 metres away and the occupants would presumably have let us in if we'd begged them to save our lives.

Due to the rain and freezing temperatures overnight, when we got up around 7am our bivvy bags and the gear not inside them were all covered in ice. It took me a while to just open the zipper on my bag because of the ice — I pondered how embarrassing it would be to be trapped inside my bag! Then I realized that although I had taken into my bivvy bag everything I thought needed to be kept dry, I hadn't thought through the consequences of my other gear freezing. For example my waterproof-but-frozen camera stopped working (but fortunately after I thawed and recharged the battery, it works again). Frozen and ice-encrusted bags, shoelaces, straps, and zippers are no fun to manipulate with numb fingers!

Nevertheless it was a beautiful morning on the mountain and it wasn't too difficult to pack everything up and move out. Getting down the mountain and driving home was relatively uneventful.

This was my first time using crampons and an ice-axe. I didn't have much trouble but the conditions were not very difficult. Just as in non-icy conditions, I'm tentative going down steep slopes but hopefully I can get better at that. If I go hiking in the snow again I will definitely plan to sleep in a hut or a tent, or in an emergency a bivvy bag, not in a snow cave or igloo that requires a lot of work to construct after reaching one's destination.

Tuesday, 22 August 2017

Epsom Electorate Town Hall Meeting

New Zealand's general election is in a month. Tonight I went along to a town-hall meeting with the candidates standing in the Epsom electorate where I live. I thought all the candidates were good, except perhaps New Zealand First's candidate who seemed a bit green (but he was only in his twenties). The candidates were eloquent, witty, mostly respectful, mostly made reasonable proposals to fix problems, all showed a grasp of facts and figures, and all seemed fit to serve as a Member of Parliament.

Epsom has complex electoral dynamics. Because it is reputedly one of the most right-wing electorates in New Zealand, the two small parties to the right of National (the big centre-right party) focus most of their energy on winning the Epsom electorate vote; under NZ's MMP system, winning an electorate entitles a party to receive seats proportional to its party vote even if that party vote is less than 5%, in which case it would normally receive no seats. National traditionally games the system a little bit by encouraging its voters to give their party vote to National but their electorate vote to the ACT candidate, on the expectation that this helps National because the otherwise "wasted" ACT party votes will put ACT MPs in Parliament who will align with National. Therefore tonight we had candidates from National (Paul Goldsmith), Labour (David Parker), the Greens (Barry Coates), and New Zealand First (Julian Paul), and also the ACT and Conservative party leaders (David Seymour and Leighton Baker) standing as candidates. Nationwide polling has the left-wing and right-wing coalitions reasonably close at this stage, with the Greens only just short of the 5% threshold and New Zealand First over it, so all but the Conservative party have a realistic chance of being part of a governing coalition.

It's sad there are no women candidates in Epsom this time around. David Parker claimed half of Labour's candidates are women.

On the issues, the candidates mostly said what you'd expect. People seemed to agree about problems — wealth inequality, housing, transport, education, environment (especially water quality), youth suicide — except that Paul Goldsmith obviously had to paint a more optimistic picture than the others. They often (not always) disagreed about the best way to solve them. I was surprised to learn that ACT supports a carbon tax.

In several cases candidates obviously did quick research on their phones to gather data before their turn to respond to a question, or to follow up on a previous answer. That was cool.

I wanted to ask why New Zealand should sign the "TPP as is, minus USA" deal as National is proposing, given the intellectual property concessions that are mainly there for the benefit of US companies, which could only make sense for us if we got something in return from the USA ... but I didn't get a chance to ask it :-(.

Given Epsom's reputation as a haven for right-wing parties, it was interesting that when David Seymour described Labour as more left-wing than it has been for years, a lot of people cheered.

Overall I'd say the Green and Conservative candidates impressed me the most, partly because I had lower expectations for them. Leighton Baker has some interesting ideas I hadn't heard before, like offering trade-oriented high school streams, which I think sounds great except it won't fly because people overvalue university degrees. Barry Coates came across as informed and capable, so I wonder why the Greens are wasting him in Epsom. David Parker came across as a bit over-snarky but I think he made what was for me the most compelling argument of the night: that New Zealand's tax structure favours property investment over business investment and Labour will do a better job than National of fixing this.

I think New Zealanders should be pretty proud of the quality of our political system and politicians.

Monday, 14 August 2017

Public Service Announcement: "localhost" Is Not Necessarily Local

Today I learned that there exist systems (presumably misconfigured, but I'm not sure in what way) where the hostname "localhost" does not resolve locally but is sent to some remote DNS server, and then in some cases the DNS server returns a remote address (e.g. a server providing landing pages stuffed with ads).

This was breaking rr, since rr tells gdb to use (e.g.) the command "target extended-remote :1234", and apparently gdb resolves "localhost" to get the address to connect to. I've fixed rr to pass "127.0.0.1" as an explicit local address, but who knows what other software is broken in such a configuration — possibly in insecure ways?
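One way to check whether a machine is affected is to resolve "localhost" with the system resolver and confirm that every address returned is a loopback address. Here is a minimal Python sketch using only the standard library; the helper names are mine for illustration and are not part of rr or gdb.

```python
import ipaddress
import socket
from typing import List


def is_loopback(address: str) -> bool:
    """Return True if the textual IP address lies in a loopback range."""
    # Drop an IPv6 zone suffix such as "%lo" before parsing, if present.
    return ipaddress.ip_address(address.split("%", 1)[0]).is_loopback


def resolve(hostname: str) -> List[str]:
    """Resolve hostname with the system resolver; return unique addresses."""
    infos = socket.getaddrinfo(hostname, None)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address string for both IPv4 and IPv6.
    return sorted({info[4][0] for info in infos})


def localhost_is_local() -> bool:
    """Check that "localhost" resolves, and only to loopback addresses."""
    try:
        addresses = resolve("localhost")
    except socket.gaierror:
        return False  # "localhost" did not resolve at all
    return bool(addresses) and all(is_loopback(a) for a in addresses)
```

Calling localhost_is_local() on a correctly configured machine should return True; on the kind of system described above it would presumably return False, since the resolver hands back a remote address.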

Sunday, 13 August 2017

When Virtue Fails

This quote, popularly (but incorrectly) attributed to Marcus Aurelius, proposes indifference to religion:

Live a good life. If there are gods and they are just, then they will not care how devout you have been, but will welcome you based on the virtues you have lived by. If there are gods, but unjust, then you should not want to worship them. If there are no gods, then you will be gone, but ... will have lived a noble life that will live on in the memories of your loved ones.

But this raises the question — what if you fail to live a truly good life, in the eyes of the god(s)?

The gospel — literally, the good news — about Jesus is that indeed we all fall short, but God sent Jesus into the world to take the punishment that we deserve, and through him we can receive forgiveness.

There is, of course, a catch. We have to accept that forgiveness, and that requires taking Jesus seriously. Thus pseudo-Aurelian indifference breaks down.

Monday, 7 August 2017

Stabilizing The rr Trace Format With Cap’n Proto

In the past we've modified the rr trace format quite frequently, and there has been no backward or forward compatibility. In particular most of the time when people update rr — and definitely when updating between releases — all their existing traces become unreplayable. This is a problem for rr-based services, so over the last few weeks I've been fixing it.

Prior to stabilization I made all the trace format updates that were obviously already desirable. I extended the event counter to 64 bits since a pathological testcase could overflow 2^31 events in less than a day. I simplified the event types to eliminate some unnecessary or redundant events. I switched the compression algorithm from zlib to brotli.
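To put the counter overflow in perspective, here is a rough back-of-the-envelope calculation in Python. It works out the sustained event rate needed to exhaust 2^31 events in one day, and how long a 64-bit counter would last at that same rate. Taking 2^64 as the 64-bit limit is my assumption (it treats the counter as unsigned); a signed counter would halve the figure.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def rate_to_overflow(limit: int, seconds: float) -> float:
    """Events per second needed to reach `limit` events within `seconds`."""
    return limit / seconds


def days_until_overflow(limit: int, events_per_second: float) -> float:
    """Days of continuous recording before the counter reaches `limit`."""
    return limit / events_per_second / SECONDS_PER_DAY


# Sustained rate that exhausts a 2^31 counter in exactly one day.
rate_31 = rate_to_overflow(2**31, SECONDS_PER_DAY)

# At that same rate, a counter with a 2^64 limit (assumed unsigned) lasts
# 2^33 days, which is tens of millions of years.
days_64 = days_until_overflow(2**64, rate_31)

print(f"rate to overflow 2^31 in a day: {rate_31:.0f} events/s")
print(f"days to overflow 2^64 at that rate: {days_64:.3g}")
```

The required rate is only about 25 thousand events per second, which is why a pathological testcase can get there.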

Of course it's not realistic to expect that the trace format is now perfect and won't ever need to be updated again. We need an extensible format so that future versions of rr can add to it and still be able to read older traces. Enter Cap’n Proto! Cap’n Proto lets us write a schema describing types for our trace records and then update that schema over time in constrained ways. Cap’n Proto generates code to read and write records and guarantees that data using older versions of the schema is readable by implementations using newer versions. (It also has guarantees in the other direction, but we're not planning to rely on them.)

This has all landed now, so the next rr release should be the last one to break compatibility with old traces. I say should, because something could still go wrong!

One issue that wasn't obvious to me when I started writing the schema is that rr can't use Cap’n Proto's Text type — because that requires text be valid UTF-8, and most of rr's strings are data like Linux pathnames which are not guaranteed to be valid UTF-8. For those I had to use the Data type instead (an array of bytes).
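The underlying problem is easy to demonstrate outside rr. The Python sketch below checks whether a byte string is valid UTF-8; the example paths are made up for illustration. A filename written in a legacy encoding such as Latin-1 is a perfectly legal Linux pathname but fails the check, so it could not be stored in a field that requires valid UTF-8.

```python
def is_valid_utf8(raw: bytes) -> bool:
    """Return True if `raw` decodes cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Hypothetical pathnames, as the raw bytes the kernel would hand back.
ascii_path = b"/home/user/trace"
latin1_path = "/tmp/caf\u00e9".encode("latin-1")  # b"/tmp/caf\xe9"

print(is_valid_utf8(ascii_path))   # plain ASCII is always valid UTF-8
print(is_valid_utf8(latin1_path))  # a lone 0xE9 byte is not valid UTF-8
```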

Another interesting issue involves choosing between signed and unsigned integers. For example a file descriptor can't be negative, but Unix file descriptors are given type int in kernel APIs ... so should the schema declare them signed or not? I made them signed, on the grounds that we can then check while reading traces that the values are non-negative, and when using the file descriptor we don't have to worry about the value overflowing as we coerce it to an int.
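Here is a small Python sketch of that reasoning. It is not rr's code: the function names are hypothetical, and I'm assuming a 32-bit field and a 32-bit C int. With a signed field, a reader can reject negative values up front; with an unsigned field, a large value would silently wrap to a negative number when coerced to int.

```python
import ctypes

INT_MAX = 2**31 - 1  # assumption: C int is 32 bits


def check_fd(value: int) -> int:
    """Validate a file descriptor read from a signed 32-bit trace field."""
    if not 0 <= value <= INT_MAX:
        raise ValueError(f"invalid file descriptor in trace: {value}")
    return value


def coerce_to_c_int(value: int) -> int:
    """Mimic a C-style conversion of a 32-bit value to a signed int."""
    return ctypes.c_int32(value).value


# A signed field makes bad values visible and easy to reject.
print(check_fd(3))

# An unsigned field could hold 0xFFFFFFFF, which wraps to -1 as a C int.
print(coerce_to_c_int(0xFFFFFFFF))
```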

I wrote a microbenchmark to evaluate the performance impact of this change. It performs 500K trivial (non-buffered) system calls, producing 1M events (an 'entry' and 'exit' event per system call). My initial Cap’n Proto implementation (using "packed messages") slowed rr recording down from 12 to 14 seconds. After some profiling and small optimizations, it slows rr recording down from 9.5 to 10.5 seconds — most of the optimizations benefited both configurations. I don't think this overhead will have any practical impact: any workload with such a high frequency of non-buffered system calls is already performing very poorly under rr (the non-rr time for this test is only about 20 milliseconds), and if it occurred in practice we'd buffer the relevant system calls.
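For reference, the numbers above work out as follows. This is just arithmetic on the figures quoted in this post, written as a short Python sketch; the variable names are mine.

```python
EVENTS = 1_000_000          # 500K system calls, two events each
NATIVE_SECONDS = 0.020      # approximate non-rr time for the test


def relative_overhead(before: float, after: float) -> float:
    """Fractional slowdown going from `before` to `after` seconds."""
    return (after - before) / before


initial = relative_overhead(12.0, 14.0)     # first implementation
optimized = relative_overhead(9.5, 10.5)    # after profiling and tuning

# Extra recording time per event attributable to the new format.
extra_us_per_event = (10.5 - 9.5) / EVENTS * 1e6

# How much slower the test already is under rr than without it.
slowdown_under_rr = 10.5 / NATIVE_SECONDS

print(f"initial overhead:   {initial:.1%}")
print(f"optimized overhead: {optimized:.1%}")
print(f"extra cost per event: {extra_us_per_event:.2f} microseconds")
print(f"rr vs native: about {slowdown_under_rr:.0f}x")
```

So the optimized format costs roughly a microsecond per event, on a test that already runs hundreds of times slower under rr than natively.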

One surprising datum is that using Cap’n Proto made the event data significantly smaller — from 7.0MB to 5.0MB (both after compression with brotli-5). I do not have an explanation for this.

Another happy side effect of this change is that it's now a bit easier to read rr traces from other languages supported by Cap’n Proto.