Saturday, 23 May 2015

rr Performance Update

It's been a while (March 2014 to be precise) since I gathered meaningful rr performance numbers. I'm preparing a talk for the TCE 2015 conference and as part of that I ran some new benchmarks with mozilla-central Firefox. It turned out that numbers had regressed --- unsurprisingly, since we don't have continuous performance tests for rr, and a lot has changed since March 2014. In particular, Firefox has evolved a lot, our tests have changed, we're using x86-64 now instead of x86-32, and rr has changed a lot. Over the last few days I studied the regressions and fixed a number of issues: in particular, during the transition to x86-64 some of the optimizations related to syscall-buffering were lost because we weren't patching some important syscall callsites and we weren't handling the recvfrom syscall, which is common in 64-bit Firefox. I also realized that in some cases we were flushing much more data from the syscallbuf to the trace file than we'd actually recorded in the buffer, massively bloating the traces, and fixed that.

There are still some regressions evident since last March. Octane overhead has increased significantly. Forcing Octane to run on a single core without rr shows a similar overhead; in particular that alone causes one test (Mandreel) to regress by a factor of 10! My guess is that SpiderMonkey is using multiple cores much more aggressively than it did last year, and because it's carefully tuned for Octane, going back to a single core really hurts performance. Replay overhead on the HTML mochitests has also increased significantly; I think this is partly because we changed rr to disable syscall buffering on writes to standard output. This improves the debugging experience but it results in a lot more overhead during replay.

Overall though, I remain very happy with rr performance, especially recording performance, which is critical when you're trying to capture a test failure under rr. Replay performance is becoming more important since it impacts the debugging experience, especially reverse execution. Still, doing a lot of work to improve raw replay performance is low priority, since I think there are projects that could provide a better improvement in the debugging experience for less work (e.g. the ability to take a checkpoint during a recording and start debugging from there, and support for gdb's evaluate-in-target conditional breakpoints).

Friday, 22 May 2015

BlinkOn 4

Last week I went to BlinkOn 4 in Sydney, having been invited by a Google developer. It was a lot of fun and I'm glad I was able to go. A few impressions:

It was good to hear talk about acting responsibly for the Web platform. My views about Google are a matter of public record, but the Blink developers I talked to have good intentions.

The talks were generally good, but there wasn't as much audience interaction as I'd expected. In my experience interaction makes most talks a lot better, and the BlinkOn environment is well-suited to interaction, so I'd encourage BlinkOn speakers and audiences to be a bit more interactive next time. I admit I didn't ask as many questions during talks as I usually do, because I felt the time belonged to actual Blink developers.

Blink project leaders felt that there wasn't enough long-term code ownership, so they formed subteams to own specific areas. It's a tricky balance between strong ownership, agile migration to areas of need, and giving people the flexibility to work on what excites them. I think Mozilla has a good balance right now.

The Blink event scheduling work is probably the only engine work I saw at BlinkOn that I thought was really important and that we're not currently working on in Gecko. We need to get going on that.

Another nice thing that Blink has that Gecko needs is the ability to do A/B performance testing on users in the field, i.e. switch on a new code path for N% of users and see how that affects performance telemetry.
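To make the "N% of users" idea concrete, here is a minimal sketch of one common way to do it: hash a stable user identifier into a bucket and enable the new code path only for buckets below the threshold. This is not how Blink or Gecko actually does it; the function name `in_experiment`, the experiment name and the user IDs are all hypothetical.

```python
import hashlib


def in_experiment(user_id: str, experiment: str, percent: float) -> bool:
    """Return True if this user should get the new code path.

    Hypothetical helper: hashes (experiment, user_id) into one of 10000
    buckets and enables the experiment for the lowest `percent`% of them.
    The same user always lands in the same bucket for a given experiment.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 10000  # 0..9999
    return bucket < percent * 100


if __name__ == "__main__":
    # Hypothetical experiment name and user IDs, for illustration only.
    users = [f"user-{i}" for i in range(10000)]
    enabled = sum(in_experiment(u, "new-scheduler", 10) for u in users)
    print(f"{enabled} of {len(users)} users get the new code path")
```

Including the experiment name in the hash means different experiments pick different subsets of users, so one group of users doesn't end up in every experiment. Performance telemetry from the two groups can then be compared.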

On the other hand, we're doing some cool stuff that Blink doesn't have people working on --- e.g. image downscaling during decode, and compositor-driven video frame selection.

I spent a lot of time talking to Google staff working on the Blink "slimming paint" project. Their design is similar to some of what Gecko does, so I had information for them, but I also learned a fair bit by talking to their people. I think their design can be improved on, but we'll have to see about that.

Perhaps the best part of the conference was swapping war stories, realizing that we all struggle with basically the same set of problems, and remembering that the grass is definitely not all green on anyone's side of the fence. For example, Blink struggles with flaky tests just as we do, and deals with them the same way (by disabling them!).

It would be cool to have a browser implementors' workshop after some TPAC; a venue to swap war stories and share knowledge about how to implement all the specs we agreed on at TPAC :-).

Friday, 15 May 2015

Using rr To Debug Dropped Video Frames In Gecko

Lately I've been working on a project to make video rendering smoother in Firefox, by sending our entire queue of decoded timestamped frames to the compositor and making the compositor responsible for choosing the correct video frame to display every time we composite the window. I've been running a testcase with a 60fps video, which should draw each video frame exactly once, with logging to show when that isn't the case (i.e. we dropped or duplicated a frame). rr is excellent for debugging such problems! It's low overhead so it doesn't change the results much. After recording a run, I examine the log to identify dropped or dup'ed frames, and then it's easy to use rr to replay the execution and figure out exactly why each frame was dropped or dup'ed. Using a regular interactive debugger to debug such issues is nigh-impossible since stopping the application in the debugger totally messes up the timing --- and you don't know which frame is going to have a problem, so you don't know when to stop anyway.
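The frame-selection and log-checking idea can be sketched roughly as follows. This is not Gecko's code. It assumes the simplest possible policy (at each composite, show the latest frame whose timestamp is not in the future), and the names `choose_frame` and `audit` are hypothetical. Timestamps are integer milliseconds to keep the example exact.

```python
from bisect import bisect_right
from collections import Counter


def choose_frame(frame_times, composite_time):
    """Index of the latest frame with timestamp <= composite_time, or None.

    `frame_times` must be sorted in ascending order.
    """
    i = bisect_right(frame_times, composite_time)
    return i - 1 if i > 0 else None


def audit(frame_times, composite_times):
    """Return (dropped, duplicated) frame indices for a run of composites."""
    shown = Counter()
    for t in composite_times:
        idx = choose_frame(frame_times, t)
        if idx is not None:
            shown[idx] += 1
    dropped = [i for i in range(len(frame_times)) if shown[i] == 0]
    duplicated = [i for i in range(len(frame_times)) if shown[i] > 1]
    return dropped, duplicated


if __name__ == "__main__":
    frames = [0, 16, 33, 50, 66]       # roughly 60fps frame timestamps (ms)
    composites = [1, 17, 18, 51, 67]   # the third composite fired too early
    print(audit(frames, composites))   # frame 2 dropped, frame 1 shown twice
```

A report like this only says which frames went wrong; replaying the recording under rr is what explains why.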

I've been using rr on optimized Firefox builds because debug builds are too slow for this work, and it turns out rr with reverse execution really helps debugging optimized code. One of the downsides of debugging optimized code is the dreaded "Value optimized out" errors you often get trying to print values in gdb. When that happens under rr, you can nearly always find the correct value by doing "reverse-step" or "reverse-next" until you reach a program point where the variable wasn't optimized out.

I've found it's taking me some time to learn to use reverse execution effectively. Finding the fastest way to debug a problem is a challenge, because reverse execution makes new and much more effective strategies available that I'm not used to having. For example, several times I've found myself exploring the control flow of a function invocation by running (forwards or backwards) to the start of a function and then stepping forwards from there, because that's how I'm used to working, when it would be more effective to set a breakpoint inside the function and then reverse-next a few times to see how we got there.

But even though I'm still learning, debugging is much more fun now!

Sunday, 10 May 2015

Piha Night Walk

Yesterday, as a bit of an experiment, we did a walk out at Piha intending to see the sun set during the walk and completing the walk in the dark. It was a resounding success!

We started near Lion Rock around 4pm and walked north along the beach, then turned inland up the Marawhara Walk, White Track and Rose Track, walking quite fast uphill and downhill to Whites Beach. We arrived around 5:40pm just after sunset (5:30ish), so we didn't quite get to watch the sun actually set from the beach as planned --- next time I'd start a little earlier. The view in the dusk was still lovely.

After that we walked out to Te Waha Point in the gathering gloom, then back to Piha Beach via the Laird Thomson track --- lights essential. After reaching the beach we turned our lights off and walked back south along the beach in darkness. This was really fun --- the beach is an easy walk in the dark, the night sky was surprisingly good (considering proximity to Auckland), and there's a narrow sea-cave near the north end of the beach that you can go all the way into to experience true pitch-darkness.

This was even more fun than I'd hoped. Most walks are best done in summer when there's plenty of daylight, but this walk is a great winter option --- after it gets dark you've still got time to drive back to town and have dinner together or whatever. Do bring lights and warm clothing --- it could be quite cold if windy and wet, though we were lucky with the weather. As a bonus it helped my children get used to tramping in the dark.