Sunday, 29 April 2007

Travel

I'm hitting the road again. On Monday night (April 30th) I'm flying to New York. Wednesday I'm visiting Google NY, and giving a couple of talks (one about Firefox, one about Chronicle). Thursday and Friday I'm at IBM Hawthorne at the OOPSLA PC meeting. Saturday I'm meeting friends in Westchester and travelling to Boston (perhaps via the infamous Chinatown Bus). Sunday I'm with friends in Boston. Monday I'm flying to the Bay Area; I should be at the Mozilla office late Monday through to Wednesday (discontinuously!). Thursday and Friday I'm a guest at a retreat for Berkeley's software research community in Santa Cruz. Saturday I'm with friends and then flying back to Auckland, arriving early Monday morning Auckland time, the 14th of May.

I will be online at least sporadically most days and trying to get as much work done as I can.



Thursday, 26 April 2007

Bethell's Beach

Today is Anzac Day. It was a perfect late autumn/winter's day in Auckland --- sunny, not much wind, cool, but warm under the sun. We wanted to go for a walk somewhere nice with friends and relatives, and since I haven't been out west for a while I picked Bethell's Beach. I can only remember going to that area once, and that was over a decade ago. We walked north around the hills overlooking the ocean, following the track that leads all the way to Muriwai and Goldie Bush. The track ascends from sea level steeply to the top of a ridge, where we had a picnic lunch with amazing views south all the way to the sandbanks of Whatipu (faintly visible in the photo below), and north probably as far as the Kaipara heads --- an amazing vista of the "Auckland west coast". Wrangling kids was stressful at the time, but already those memories are fading while the highlights remain.


Bethell's Beach south to Whatipu


Tuesday, 24 April 2007

History Based Stack Reconstruction

One thing I forgot to mention in the last post was history-based stack reconstruction. This is an idea that I came up with a while ago but I had to leave it out of my original Amber discussion. It was implemented in the Xulrunner prototype debugger that I showed screenshots of.

Traditional debuggers reconstruct the call stack for a thread by examining stack memory. You can pick out return addresses, saved registers, etc, and most of the time you can figure out what state each function was in before it called its callee. The problem is that a) this is very hard to do right and b) you can't always do it right. Anyone who's used gdb a lot is probably familiar with situations where gdb won't give you a stack trace, or will only give you part of a stack trace.

It's hard to do right because optimizing compilers like to mess around with the layout of stack frames, and they like to avoid storing things in stack memory at all. They also like to avoid using a register for the "frame pointer" (especially on x86 where registers are precious gold dust) so they don't always construct the linked list of stack frames that you read about in textbooks. Debuginfo formats like DWARF2 define hellishly complex schemes for encoding information about what the compiler has done so the debugger can figure everything out, but it seems that gcc or gdb or both frequently screw it up. That's not surprising since there's a lot of hairy specification to implement.

Sometimes even the best-implemented tools won't help. Consider tail call optimization, for example: when a function A ends by calling another function B, a commonly desired optimization is to overwrite A's stack frame with the stack frame for B, so that when B returns, control returns directly to A's caller. This optimization simply destroys the information you need to construct the true call stack. Consider bugs where the stack is accidentally (or deliberately!) overwritten with garbage: same problem.

Now suppose you have a Chronicle-based debugger with easy access to the entire history of the program execution, and you want to reconstruct the call stack for some timestamp T. Instead of messing around with the stack memory at time T, look in the history to see which functions have been called but have not yet returned. Those are the functions that are on the call stack.

To implement this, we need to define what it means for a function to return. This isn't obvious since we want to include things like exception unwinding. The definition I chose was that a function called at time T has returned when the stack pointer first becomes greater than the value it had at time T. Defining when a function is called is also not obvious since we want to include things like tail calls, where the control transfer to the function might be just a jump instruction. I chose to define a function call as any control transfer to a instruction labelled as a function by symbol table information, or any execution of an x86 CALL instruction --- unless the destination of the CALL is the next instruction. The latter is a common idiom for getting the current program counter onto the stack so you can pop it off into a register for PC-relative addressing.

The rest of the gory details are in the updated paper --- a PDF version is in the Chronicle tarball I linked to in the last post. It's really not all that hard. The hardest bit is probably that you need a way to efficiently answer the question "what is the maximum value of the stack pointer over the time interval from T1 to T2", which requires a small amount of additional instrumentation all the way back to the Valgrind tool. But in the end it works really well. It's a lot less effort than implementing the DWARF2 stack crawling spec, and it's incredibly robust. Since we don't look at the stack memory at all, we construct a correct call stack in the presence of tail calls, or even if someone has zeroed out the stack completely. Also we never need to determine from the current PC what the enclosing function is (something traditional debuggers do) so we're robust to wild jumps, and also to arbitrary code placement optimizations.

I think this is a good example of how designing a debugger around complete recording of program execution gives you the chance to simplify a lot of things and make them more robust at the same time.



Chronicle Released

I've finished my OOPSLA reviews (modulo a few small updates to come). So I stole some time today to finally finish off the Chronicle release. I eventually decided to go with mostly GPL and some BSD/X11 licensing. I reorganized the source tree a bit, added some comments especially around interfaces, and wrote a Makefile to build everything. I also added an automated test framework. It's pretty cool; "make check" compiles test programs, runs them in the recorder, and then runs test Perl scripts that query the database and examine the query results to make sure they look good. Right now there's only one test, a program that just calls a small function five times, and the test just verifies that the query engine finds five function calls. But it's a good start, and it means that even though there is no UI --- just bare infrastructure --- people can verify that the code is working and see how to get the tools running. The test script also shows how you can do potentially cool things by scripting the query engine.

Anyway, the project is hosted at Google Code. There is an initial code drop there. People should be able to download the tarball, unpack it, and run "make all check" on a Linux system and see the test program being recorded and the query engine running.

Thanks to Novell people --- especially Jared --- for helping make this happen!

I've got some ideas about how I'd like the UI to work in Eclipse. Just recently I figured out what I think is the right way to integrate a source code view with views over history. I don't know whether I'll ever have time to implement them though! Need to catch up on my real work :-).



Tuesday, 17 April 2007

MoCo All-Hands

The MoCo all-hands meeting last week was tons of fun, as expected. Like the Firefox Summit last year, I really appreciated being exposed to the scope of activities beyond my immediate horizon. This time I really enjoyed having other NZ team members around!

One thing that emerged from discussions at the meeting is that we will not be targeting the compositor work for Gecko 1.9/Firefox 3. It's too risky given that I will have to spend considerable time working on text bugs up to the release. I think this is the right decision, but the downside is that some important bugs will not be fixed for Firefox 3. In particular, the ability to place chrome over content will not be enabled.

Air New Zealand has upgraded their entertainment system on the 777s that run between Auckland and San Francisco. They now have over 50 movies available. Since I don't get to see movies much (it's expensive when you include babysitting), I took the chance to catch up a bit:


  • The Prestige: Excellent. Slightly disturbing, highly recommended.
  • The Departed: Excellent cops-and-robbers genre piece. Highly recommended.
  • The Last King Of Scotland: Excellent. Quite disturbing, mostly in a good way. Recommended.
  • Pan's Labyrinth: Excellent. Even more disturbing. Lukewarmly recommended. The trailers I saw highlighted the fantasy bits and left out the torture and murder; I don't know if I'd have bothered watching it if I'd had more accurate expectations.
  • Crank: Stupid.
  • Tenacious D: Really stupid. Also, less funny than an average Buffy episode.
  • All Blacks vs Australia, June 2000: All Blacks score three tries in the first five minutes. Australia scores four tries during the rest of the first half to equalize. I won't tell you how it ends. Two great teams at the top of their games --- the best rugby game I've ever seen.


Google Maps Fails Me

Best Buy International



Monday, 9 April 2007

Travel Plans

During the coming week I'll be away in Mountain View visiting the mother ship for the Mozilla Corporation all-hands meeting. That should be fun. I'm making this a short trip because a couple of weeks afterwards I have to make another trip to the USA for the OOPSLA program committee meeting at the IBM Hawthorne lab (my old stamping ground) in New York. It looks like while I'm in town I'll be visiting Google New York and giving a couple of talks (one about Firefox, one about the debugger). On the way home I'll stop over in the Bay Area to visit Mozilla and also attend the Berkeley software researcher's retreat as a guest (thanks chaps!). I would have preferred to do everything in one trip instead of having two trips close together, but the scheduling of the all-hands and the PC meeting would have forced me to be away for at least four weeks, which would just have been too much.

I'm grinding away on OOPSLA reviews. I've done seven papers so far, twelve to go. It can be tough but the papers are mostly quite good so I'm enjoying myself. The travel is going to interfere a bit but I should still get everything done on schedule, especially with the help of old friends and colleagues I've recruited to help review some of the trickier papers (thanks if you're reading this!). Over Easter I'm away with family, which is great --- when I get tired of reading papers, I can go to the beach, go for a swim, or just stare at the horizon and try to figure out what on earth the authors are thinking!!

All this activity does mean I'm falling a bit behind on bugfixing. The next couple of weeks don't look good for that but once the OOPSLA reviews are done I will be back into it. Now that my primary work environment is my laptop, it's easy to get work done even when I'm on the road. The main impediment is accessing Bugzilla via spotty or expensive Internet connections. Wouldn't it be great if Bugzilla could cache the data for selected bugs for offline use? Hmm...