Sunday, 30 December 2007

Our Rectangles Overfloweth

When we're laying out content in Gecko, we create an nsIFrame object for each CSS box for each element, as I've described previously. CSS actually defines several kinds of boxes: margin boxes, border boxes, padding boxes and content boxes, depending on exactly what part of each element is included. In nsIFrames we store border-boxes always, and can reconstruct margin, padding and content boxes on demand.

CSS boxes for an element do not necessarily contain the boxes for that element's children. However, there are some situations where we want to aggregate geometry information for an entire subtree of elements:

  • For "overflow:scroll/auto" elements we need to figure out how far we should permit scrolling in each direction. A reasonable thing to do here would be to compute, and possibly store, the smallest rectangle containing the border-boxes for all elements which have the scrollable element as a containing block ancestor. Note that for overflow:auto elements, we need to know this during layout because we need to decide whether scrollbars should be shown.
  • To optimize repainting, we want to be able to quickly test whether a subtree of content is responsible for painting anything visible in a given rectangle that needs repainting. An obvious solution is to store on some or all elements a rectangle which contains all pixels drawn by the subtree rooted at the element.

In Gecko we mash together both of these concepts into the single concept of an "overflow rect". Every nsIFrame has an "overflow rect" which is the smallest rect containing both every pixel drawn by the frame and its descendants, and also the border-boxes of the frame and its descendants. We do the obvious optimization of not storing the rect when it's equal to the frame's border-box --- usually true since overflow is still the uncommon case. Combining the two concepts simplifies code and reduces memory usage. It usually doesn't produce any noticeable effects, but there are some unfortunate and relatively rare situations where it does. For example, adding a CSS "outline" to the child of an overflow:auto element can result in a scrollbar being displayed just so you can scroll the edge of the outline into view.

One unfortunate thing is that right now every container frame's Reflow method is responsible for computing the overflow area for the frame by unioning together the overflow areas of all child frames. It's only a few lines per Reflow method but like all code duplication it's unnecessary bloat and it's fragile too.
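The "few lines per Reflow method" amount to a rectangle union over the children. Here is a minimal sketch of that aggregation, assuming simplified `Rect` and `Frame` types invented for illustration (they are not Gecko's nsRect or nsIFrame) and covering only border-boxes, not drawn pixels such as outlines or glyph ink:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-ins for illustration; not Gecko's nsRect or nsIFrame.
struct Rect {
    int x, y, width, height;
};

bool operator==(const Rect& a, const Rect& b) {
    return a.x == b.x && a.y == b.y && a.width == b.width && a.height == b.height;
}

// Smallest rectangle containing both inputs.
Rect Union(const Rect& a, const Rect& b) {
    assert(a.width >= 0 && a.height >= 0 && b.width >= 0 && b.height >= 0);
    int left = std::min(a.x, b.x);
    int top = std::min(a.y, b.y);
    int right = std::max(a.x + a.width, b.x + b.width);
    int bottom = std::max(a.y + a.height, b.y + b.height);
    return Rect{left, top, right - left, bottom - top};
}

struct Frame {
    Rect borderBox;               // position is relative to the parent's top-left
    std::vector<Frame> children;  // requires C++17 (vector of incomplete type)
};

// Returns the frame's overflow rect in its parent's coordinate space:
// the union of its own border-box and the overflow rects of all children.
Rect ComputeOverflow(const Frame& frame) {
    // Work in the frame's own coordinate space first.
    Rect overflow{0, 0, frame.borderBox.width, frame.borderBox.height};
    for (const Frame& child : frame.children) {
        overflow = Union(overflow, ComputeOverflow(child));
    }
    // Translate the result into the parent's coordinate space.
    overflow.x += frame.borderBox.x;
    overflow.y += frame.borderBox.y;
    return overflow;
}
```

A single tree walk like this, done in one place, is the kind of thing that would replace the per-container copies of the union code.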

Another unfortunate thing is that the "bounding box of drawn pixels" area can be expensive to compute, in particular because it means you have to retrieve the ink extents for every glyph.

In a perfect world, I think we'd separate these two concepts, and store 0, 1 or 2 additional rectangles for each nsIFrame. We'd compute "scrollable overflow areas" by walking the frame tree directly from scrollframes instead of spraying code through Reflow() implementations. We'd compute the "drawing overflow area" lazily, possibly even while painting; this would get the expensive glyph extents operations off the layout path, which would speed up layout flushes, e.g. when script requests geometry data via getBoundingClientRect() or offsetTop.
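To make the "compute lazily" idea concrete, here is a rough sketch of a cached rectangle that is only computed when someone asks for it. The `LazyOverflow` class and its method names are assumptions made up for this example; nothing like it is implied to exist in Gecko.

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <utility>

// Hypothetical rectangle type for illustration only.
struct Rect {
    int x, y, width, height;
};

// Holds a "drawing overflow area" that is computed on first use and then
// cached until it is explicitly invalidated (e.g. after a layout change).
class LazyOverflow {
public:
    explicit LazyOverflow(std::function<Rect()> compute)
        : mCompute(std::move(compute)) {
        assert(static_cast<bool>(mCompute));
    }

    // Runs the (potentially expensive) computation only if no cached value exists.
    const Rect& Get() {
        if (!mCached) {
            mCached = mCompute();
        }
        return *mCached;
    }

    // Drops the cached value; the next Get() recomputes it.
    void Invalidate() { mCached.reset(); }

    bool IsComputed() const { return mCached.has_value(); }

private:
    std::function<Rect()> mCompute;
    std::optional<Rect> mCached;
};
```

With something shaped like this, a layout flush triggered by script would never call `Get()`, so the glyph-extent work would only be paid for when painting actually needs the rectangle.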

I've thought pretty hard about ways to avoid computing the "drawing overflow area" at all. It's certainly possible for content that remains hidden, such as a hidden IFRAME or background tab. Unfortunately for the common case of page loading you almost always load the page, render it, and then something changes on the page --- a caret blink, an animated GIF, incremental content load --- and that usually forces you to compute the "drawing overflow area" of most elements in the page, in case they overlap the area you need to repaint. For example we should get the extents of each glyph on the page, to see if there's an insanely large glyph that might overlap the repaint area. (As I've explained in other posts, currently, even on trunk, for performance reasons we fudge and assume that at small font sizes glyphs stay within their regular font boxes. Webkit does something similar.)

Sunday, 23 December 2007

Drive-By Debugging

My trunk dogfood build was feeling sluggish yesterday so I thought I should have a poke to see why. I'd noticed sluggish episodes on and off over the last week or two so I thought there might be some kind of problem. "top" showed firefox-bin using 75% CPU even when I wasn't doing anything in the browser, and there were no animations or other obvious activity going on in my windows, so something was definitely wrong.

Fortunately I always dogfood debug builds so it was just a matter of attaching gdb... The first thing I did was poor-man's profiling: I just hit ctrl-C to interrupt the process several times and looked at the stacks. It was quickly obvious that we were constantly in painting code. I fired up QuartzDebug and confirmed that my GMail window was being fully repainted many times a second.

The next step was to figure out what was triggering the painting. I set a breakpoint on nsIFrame::Invalidate, which is what most Gecko code goes through to force part of a window to be eventually repainted. That breakpoint wasn't being hit very often, so it wasn't related to the problem. I then looked for lower-level Cocoa APIs that could be being used to force the window to repaint, and discovered nsChildView::Invalidate calling those APIs. So I set a breakpoint in nsChildView::Invalidate. Bingo! That was being called very frequently, invalidating a 2000x2000 pixel area of the window.

The stack for that call showed the whole story; you can see it here. Basically the window was being invalidated by a Flash plugin instance requesting a repaint. The Flash plugin was responding to an "idle" event which we send to the plugin at a fairly high rate; each one of these events was causing the plugin to request a repaint of the window. I'm not really sure of the purpose of these events, but it's part of the NPAPI plugin API on Mac that we send them frequently. I'm even more unsure why Flash was requesting repaints of the whole window all the time; it was easy to see that the plugin's layout frame was only 100x100 pixels, and furthermore, the plugin was hidden. I think GMail only uses Flash to play sounds. I suspect there's a deeper bug here somewhere in the intersection of Flash, Gecko and GMail. Eep. It could be related to Quartz drawing model support that we added in the last year that has some other known nasty bugs (whether on our side or the Flash side, again I'm not sure).

Anyway, if the plugin is hidden, we should be able to ignore its invalidate requests, working around this bug and maybe addressing other performance issues too. It turns out that the invalidate requests pass through nsPluginInstanceOwner, which has an mWidgetVisible field which is false in this case because we know the plugin isn't visible. It's a one-line patch to drop invalidation requests on the floor when this field is false, and that's what we're going to do. I couldn't be sure of reproducing the bug in a fresh Gecko build but I used a gdb conditional breakpoint with an associated command to simulate the fix in my running build, and it worked.
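The shape of that one-line fix can be sketched as follows. Only the `mWidgetVisible` name comes from the description above; the `PluginOwnerSketch` class, its methods and the `Rect` type are hypothetical stand-ins, not the real nsPluginInstanceOwner code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical rectangle type for illustration only.
struct Rect {
    int x, y, width, height;
};

// Hypothetical stand-in for the object that plugin invalidate requests pass through.
class PluginOwnerSketch {
public:
    void SetWidgetVisible(bool visible) { mWidgetVisible = visible; }

    // Returns true if the request was forwarded for repainting,
    // false if it was dropped because the plugin is not visible.
    bool InvalidateRect(const Rect& rect) {
        assert(rect.width >= 0 && rect.height >= 0);
        if (!mWidgetVisible) {
            return false;  // the "one line": drop the request on the floor
        }
        mForwarded.push_back(rect);
        return true;
    }

    std::size_t ForwardedCount() const { return mForwarded.size(); }

private:
    bool mWidgetVisible = false;
    std::vector<Rect> mForwarded;  // stands in for invalidating the real window
};
```

The early return means a hidden plugin can ask for repaints as often as it likes without any of those requests reaching the window.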

Episodes of unexplained sluggishness that are hard to reproduce are very frightening from my point of view, because they can really hurt the user experience but it's very hard to track them down; they don't leave stack traces and often there are no steps to reproduce so it's hard to write up a useful bug report. The moral of my story, I suppose, is to dogfood a debug build and if it seems to suddenly be afflicted by a problem like this, be willing to attach a debugger, collect some stacks and do whatever else is necessary to track down the problem ... because you may be our best chance to find and fix the bug before millions of users hit it.

Saturday, 22 December 2007

Official Notice

David Cunliffe was New Zealand's Minister of Telecommunications, IT and Immigration until he recently switched to Health. (Personally I would have stuck with IT!) I just found his speech at the Open Source Awards on October 18. There's actually a great mention of our local Firefox operation!

Just imagine for a moment our world without open source.

More than two-thirds of the websites out there would simply disappear. But then you probably wouldn't be able to find the remainder because most of major search engines use open source. And if you did, you wouldn't be able to use what has become many people's favourite browser, Firefox: a piece of software developed in large measure by New Zealanders and right here in New Zealand.

Cool! Well, "large measure" is probably a bit of an exaggeration but I'll let him off!


One year ago I left Novell and joined MoCo full-time to start the Auckland office. It's been quite a year and overall I think things have gone well. We've got some good people on board, set up a nice little office, had our first intern, and most importantly, got a lot of work done. For many of us it's been a bit of a grind as we try to fix a lot of bugs before shipping Firefox 3. Hopefully next year we'll all have a chance to do more focused and interesting projects, new features and stuff like that.

But overall I'm having a great time --- working on something I'm really passionate about, face to face with a group of really smart people who I can talk to and work with and eat lunch with, at home in New Zealand! It's exactly what I've dreamed about for a long time. I've had to learn a little bit of management and a little bit of office logistics, which has been tedious at times but these are good skills to be learning.

I'm looking forward to next year. It's going to be challenging on many fronts, but Firefox 3 is looking good, Mozilla has momentum and we have the opportunity to do many exciting things in Gecko and outside it too. During the 1.9 cycle we developed some great test infrastructure that will help us move faster on the trunk and also help us maintain better branch stability for Firefox 3 maintenance than we had for Firefox 2/Gecko 1.8.1. Moving to Mercurial will help us make more changes in parallel. We also have a lot more (twice as many?) full-time developers than we had at the start of the 1.9 cycle, developers who've been climbing the learning curve and who are ready to storm ahead. This is good because we have a lot of work to do, a lot of areas that need fixing and a lot of items on our wishlist.

Here in the Auckland office I hope we can continue as successfully as we have been operating so far. We have space for a few more people! University graduate hiring was disappointing this year; we didn't find anyone who was really suitable. Very few students are exposed to low-level (C/C++) programming at a significant scale, and bright students are either surprisingly unambitious or their ambitions lie in creating the next hot Web startup. That's OK but I find browser development fundamentally more compelling --- we get to make the rules of the Web --- and I'd like to find more people who think that way. For now we've been successful finding oddball self-taught hackers; hopefully there are more out there!

Sunday, 16 December 2007

Barcamp Auckland

I spent the whole day today at Barcamp Auckland. I had a great time talking to Web developers and users about what they do and about Firefox, Gecko and Web standards. I had a particularly interesting conversation with a guy who builds MythTV boxes, about the situation with media codecs and free software.

As usual, I really enjoyed talking to Nigel Parker from Microsoft. The cone of silence around IE8 puts him in a tough spot at a conference like this, but that's not his fault. I was surprised that his talk hardly mentioned Silverlight; he seemed to be most interested in multitouch screens, which is fine I guess, although it's not what keeps me awake at night :-).

I gave two talks, one about "The Future Of Web Standards" and one about "What's New For Developers And Users In Firefox 3". Both were well-attended. I burned up most of Friday putting together my FF3 demos so I only started working on the actual talks about 1am this morning. I was exhausted so I wrote down just a few headings in the hope I could flesh them out on the fly. I'll have to read the blogs to see how successful I was but from my perspective it went about as well as a normal talk. I'd really like to develop the ability to talk with an absolute minimum of slide support.

"The Future Of Web Standards" basically went like this:

  • Standards are hard. Creating and implementing standards is harder than just shipping whatever you feel like...
  • Standards are necessary. But the alternative to standards is single-vendor domination, which would be extremely dangerous in the case of the Internet.
  • Clean up existing standards. We have to improve the completeness and accuracy of existing standards...
  • Add new features. But we also have to add new features or the open Web will become obsolete.
  • WHATWG/HTML5. Addresses both of these goals.
  • Video. Video should be a first-class HTML element, but codecs are a problem.
  • ES4. Javascript must evolve to become a powerful and complete language, but there is opposition to this.
  • Politics. Organizations are always trying to game the system to further their own interests, often at the expense of the open Web.
  • Your needs. I solicit feedback about where Web developers are feeling pain and what they desire.
  • HTML matters. HTML retains a position of strength because it's the only universal container format. HTML can contain anything but Silverlight and Flash are leaf elements.

"What's New In Firefox 3" went like this:

  • Who am I?
  • What is Mozilla? Not a Web browser company, but a nonprofit supporting the open Web and its users.
  • What is the open Web? A Web that allows anyone to participate in implementing and using, that is not dominated by a single player.
  • What is Mozilla doing in New Zealand? Gecko development!
  • Hiring. Strong C++ developers: the open Web needs you.
  • Demos: Pushing the Web forward. Lots of demos.
  • Use this stuff if you care about open Web. The best thing that Web developers can do for the open Web is to use Web standards, particularly new-feature standards, even (or especially) those not yet supported by IE. This won't always be possible when working for paying clients, but take advantage of any opportunities; graceful degradation helps. Adoption is the only way to pressure Microsoft to implement this stuff.

I demoed some Gecko features:

and some Firefox UI features:

The HTML/SVG containers demo actually shows a couple of problems. First, dragging the round window handles doesn't work when the mouse is outside the handles and over the child document, because the child document gets the events and they don't bubble out. I'm not sure what to do about that. Also, putting more than one page in the demo doesn't work quite right because SVG has no z-order control, so the photo-table code moves things to the front by removing them from the DOM and re-adding them at the top, but removing and re-adding an IFRAME reloads it, which sucks in this case. I'm not sure what to do about that either.

During the Firefox talk I attempted to demo Vlad's 3D-canvas extension but screwed up because I was using a debug build and I had built the extension for an optimized build. (Stupid me, I should have used an opt build for the demos, they would have looked smoother.) So right at the very end of the day I snagged a demo slot to show the teapot and GPU-based fractals.

The 3D canvas is sensational; I hadn't tried it before last night. (Vlad's prebuilt extension no longer works on trunk, but if you're making your own trunk builds you can configure with --enable-extensions=default,canvas3d and then install dist/xpi-stage/canvas3d.xpi into Firefox and it should work.) It's really great how his JS support code pulls GLSL programs straight from the HTML page; you get the "edit; save; reload" Web development cycle for GPU programming. I think this could be an excellent way to learn GPU programming; imagine an online tutorial with a bunch of starter examples.

Also, if we can make it secure for untrusted content, it addresses JS performance issues head on (for a certain class of problems): forget fancy JIT compilers and runtimes, just stick shader code in your Web page and we'll run it right on the GPU for maximal performance! Woohoo! The fractal demo is a superb example of this: a regular C program couldn't nearly keep up with that Web page.

Anyway, after a series of late nights and today's work I need a rest! I plan to take the first half of Monday off.


Wednesday, 12 December 2007

Video Wars

Cory Doctorow on BoingBoing has an eloquent survey of the battle lines over codecs for Web video ... a battle which is starting to heat up. Chris Double is over in San Jose right now preparing for the W3C Video on the Web workshop where this will no doubt be a critical issue.

We have here a culture clash. On the Web we have more or less established an expectation that standards will be implementable royalty-free. Attempts to introduce royalty-bearing standards are shot down or worked around. Audio and video standards, on the other hand, have a tradition of patent encumbrance and licensing pools --- not to mention DRM attempts. Now these two worlds are colliding.

My personal opinion is that DRM is an expensive and futile exercise. DRM schemes promote monopolies, hamstring innovation, and exclude free software. Moreover the experiment has been tried and it has failed, as the music industry seems to be acknowledging. Mere patent encumbrance isn't as bad as DRM, but it's still a problem for free software and truly open standards.

The good news is that browsers can support more than one codec. The W3C and others who favour an open Web should promote unencumbered codecs as a baseline, which today probably means Ogg Theora and Vorbis. Then everyone will have at least the option of free (both senses) production and consumption of media. Whichever vendors are willing to pay patent taxes can also offer encumbered codecs, and I suppose big media companies will be able to continue their DRM attempts that way.

Monday, 10 December 2007

Video Star

PC World has an article about standards-based video-on-the-Web with a few quotes from our own Chris Double. The quotes sound pretty good; congratulations, Chris, on surviving your encounter with the media.

It's too bad PC World NZ didn't pick up on the local angle. I guess they're just serving up a syndicated story, and they probably would not even have known Chris is based here in Auckland anyway. (Big hint to any media types who might catch this blog!)

Saturday, 8 December 2007

Interpreting Failure

Big news in New Zealand today is Graham Henry's reappointment as coach of the All Blacks. This is big news given that the All Blacks recently fell out of the four-yearly Rugby World Cup at the quarter-final stage. It's the first time ever a coach who failed to win the World Cup has been reappointed. Remarkably, the majority of public opinion backs his reappointment.

I think it's a great rejoinder to all those who think NZers are over-obsessed with rugby and World Cups in particular. I also think it's remarkably rational to concede that results are sometimes outside the coach's (or anyone's) control and that statistics are more meaningful than one-off results.

Naturally, there are a lot of misguided people complaining both that Henry was too focused on the World Cup and that World Cup results are everything so he should have been sacked.

Small Mammals

Ian Hickson writes interesting stuff about the evolution of companies. One quibble: what he describes is "evolution" only in a very loose sense. Evolution, even non-biological evolution, generally requires reproduction with heritable characteristics that affect fitness, and that doesn't seem to be present here. All we have here is adaptation IMHO.

That aside, he's basically right: people have learned to adopt business models that can't easily be crushed by Microsoft, but they all have the weakness that a superior Microsoft product would hit them hard.

He omits, though, that there are other things Microsoft could do (and is trying to do) that would also hit their competitors hard --- for example, patent assaults. And open-source isn't as immune from a cut-off-their-air-supply attack as I think he makes out.

Friday, 7 December 2007

Script Safety Annotations

I've been encountering a trunk bug where we make a call to an XPCOM method that happens to be implemented in Javascript. Unfortunately we make this call while we're tearing down a DOM element, during garbage collection. Running JS code during garbage collection is a very bad idea, so we crash.

This is a specific case of a general problem: some code is not safe to call at certain times. I would like to be able to add checkable annotations to functions and methods that indicate the following:

  • The function may spin the event loop or execute any script (these are equivalent).
  • The function may execute trusted script but will not spin the event loop or execute untrusted script.
  • The function will not execute script or spin the event loop (default, so dangerous methods have to be annotated).

We would need a way to specify these attributes on XPCOM IDL interface methods. (It would also be useful to have syntax for setting a default annotation for all methods of an interface.) Most of the scriptable Gecko API methods implemented in C++ could use these annotations. This might help Taras' refactoring work. It could help code analysis tools quite a lot by constraining what can run. I think it would help us catch a lot of bugs where we make unsafe calls.

We would want an escape hatch to "cast away" annotation checks, where some dynamic test guarantees that it's OK to call nominally less-safe code.
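As a rough illustration of how such annotations could be checked, here is a sketch that encodes the three levels as template parameters so that an unsafe call fails to compile. This is plain C++ rather than IDL syntax or a static-analysis tool, and every name in it (`ScriptSafety`, `Context`, `Call`, `CallUnchecked`) is invented for the example.

```cpp
#include <cassert>
#include <utility>

// The three annotation levels from the list above, ordered from safest to least safe.
enum class ScriptSafety {
    NoScript = 0,           // will not execute script or spin the event loop (default)
    TrustedScriptOnly = 1,  // may execute trusted script only
    AnyScript = 2           // may spin the event loop or execute any script
};

// A caller annotated with `caller` may only call code whose level is no less safe.
constexpr bool MayCall(ScriptSafety caller, ScriptSafety callee) {
    return static_cast<int>(callee) <= static_cast<int>(caller);
}

// Hypothetical token passed to annotated code; it records the current level.
template <ScriptSafety Level>
struct Context {
    // Checked call: rejected at compile time if the callee is less safe than we are.
    template <ScriptSafety CalleeLevel, typename Fn>
    auto Call(Fn&& fn) const {
        static_assert(MayCall(Level, CalleeLevel),
                      "callee may run script that the caller is not allowed to run");
        return std::forward<Fn>(fn)(Context<CalleeLevel>{});
    }

    // Escape hatch: "casts away" the static check when a dynamic test
    // guarantees that the nominally less-safe call is OK right now.
    template <ScriptSafety CalleeLevel, typename Fn>
    auto CallUnchecked(bool dynamicallySafe, Fn&& fn) const {
        assert(dynamicallySafe);
        return std::forward<Fn>(fn)(Context<CalleeLevel>{});
    }
};
```

In this sketch, code holding a `Context<ScriptSafety::NoScript>` (say, during garbage collection) cannot write `Call<ScriptSafety::AnyScript>(...)` without a compile error, which is the class of bug described above; `CallUnchecked` plays the role of the escape hatch.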

Saturday, 1 December 2007