Tuesday, 30 October 2007

Linux Matters, Addendum

There's another thing I wanted to talk about but which really deserves its own blog post: the relationship between XUL, the Web, and free desktops.

First, I think XUL and the Web are both great for free desktops. They attack the "applications barrier to entry" that has made Windows too strong for too long.

Second, XUL does not compete with free desktops. For one thing it's very much open source, and furthermore, it's a toolkit and sits above GNOME etc.

Third, the distinction between XUL and the Web will erode over time, in the sense that the features you need XUL for today will eventually be offered as regular Web standards (CSS box model, enhanced widgets, offline execution, WHATWG APIs, etc).

Therefore the interesting question is not "how do we best integrate Firefox with GNOME?", but "how do we best integrate GMail (etc) with GNOME?" We've got some ideas but there's a lot more that could be done.

Monday, 29 October 2007

Linux Matters

I've been reading some dissastisfaction with the state of Firefox and Gecko on Linux. I feel I should comment because for a couple of years I was Novell's "Firefox guy" and did significant work on Gecko/GNOME integration.

I take exception to comments like:

However it geniunely seems that Mozilla does not care about Linux the slightest.

The truth is that during every development cycle we sink a ton of effort into making sure that Gecko and all its features work well on Linux. In 1.9 we've done lots of work to integrate with Pango, fix cairo-on-X issues, make sure <video> works, wrangle build-system and compiler issues, etc. If we didn't care about Linux, we'd just drop support for it. That would help us get Mac and Windows releases out faster, and after all they're where most users are. Fortunately we don't make decisions purely for economic reasons. I'd like to see people recognise that.

Part of the problem is that people are using Gecko 1.8.1 which is over two years old now. We'll ship Gecko 1.9 soon which has a ton of good stuff for Linux users:

  • Michael Ventnor (awesome volunteer contributor) has fixed a bunch of GTK theme bugs and turned on GTK themes for HTML form widgets --- very sexy. (MoCo is hiring him as an intern to carry on working on these and other issues during the summer.)
  • Teune van Steege (another contributor) has implemented GTK theme support for tree headers, which fixes the most important remaining non-native-looking UI element.
  • We've ripped out our own cross-platform drawing code and are now using cairo for everything. Along the way we've made a lot of contributions to cairo. We're trying really hard to be good community players here.
  • We integrate nicely with Pango now, using it to shape all complex scripts and also Latin text in many situations. By tweaking one pref you can disable the non-Pango fast path and get Pango for all text.
  • Thanks to Theppitak Karoonboonyanan and other contributors we use Pango line-breaking for scripts that have extraordinary line-breaking needs such as Thai, Lao, and Tibetan.
  • Karl Tomlinson (MoCo) designed and implemented "windowless plugin" support for X so that Linux can have platform parity with Mac and Windows users when Adobe supports it in their Linux Flash player (or if swfdec beats them to it!).
  • We've landed in 1.9 some of the GNOME integration work that I did for Novell, in particular startup notification support and dbus-based NetworkManager integration. (Some of the other work I did, like improved GConf integration has still to land ... just needs someone to push it in.)
  • Update Aaron Leventhal (IBM) and many other contributors have done a ton of work to make Firefox and the entire Mozilla platform accessible on Linux. This includes helping out with Linux-specific accessibility infrastructure and assistive technologies such as Orca.

There's definitely more we could do. I've been soliciting volunteers to implement RGBA translucent windows for X, for example, and it would be nice to support the GNOME print dialog. But I think we've made good progress.

One observation is that the number of Mozilla developers who are Linux users has dropped quite a bit as many of us (including me) have switched to Macs. I think this is partly due to Apple's hardware and software being very good and partly due to the fact that because Apple is evil, you can virtualize Linux on Mac but not the other way around. No doubt this has impacted Linux support a little bit. I still think that Linux is ultimately very important to Mozilla because proprietary client platforms will always pose a threat to the open Internet; the client vendors will constantly be trying to find ways to extend their control into the network. (And of course an open Internet is crucial to the survival of open clients, which is one of the main reasons I do what I do.)

Let me just address a few other complaints while I'm here:

  • The list of things Firefox could do to improve integration with the underlying environment goes on and on

    I'd like to see that list. Now that we have more heads, we might get some of those things done within MoCo, and I'd be more than happy to help other contributors get them done.
  • We have done a lot of work on memory consumption, some with help from Federico Mena-Quintero. We know this is a big deal. We're getting better in this and other areas and we're going to keep getting better.
  • Webkit is a good engine and getting it running in GNOME is a good thing, but be careful about comparing a two-year-old engine you use all the time (and hence see all the bugs of) with something that isn't quite there yet. Also, I would hate to see Apple with too much power over the free desktop ... keep options open.

  • WebKit ... happily uses GTK the way it was meant to be used instead of doing crappy hacks like cloning several instances of one widget and overloading its painting context.

    I'm not sure if this person knows what they're talking about, but if the Webkit/GTK port is using real GTK widgets in Web pages, then either they're doing some wizardry that I don't understand or they're going to have serious problems rendering content like this:

    (This demo might be mangled by feed readers etc.) Even Webkit on Mac and IE on Windows don't use real native widgets. I agree our theming approach on GTK is a crappy hack; I've been begging for years for a real GTK theme-drawing API we could use, like Mac and Windows provide.

Update Here's a screenshot of the Bugzilla query page in Firefox 3 trunk using Clearlooks:

Shiny Bugzilla

Friday, 26 October 2007

Growing Pains

Our NZ team is growing (hi Chris, Matt!) so we're searching for new office space. This is proving traumatic.

I got in touch with the operators of a building downtown. The location was perfect, terms agreeable, good space, everything looked fine, but then another prospective tenant came along and offered to sign a three-year lease; we can only sign for one year, so we lost the race. But then there was another smaller but still agreeable space in the same building where the tenants were delinquent, incommunicado and on the fast track to having their lease terminated. So we waited a bit more, and then on the very day their lease was to be terminated, the tenants reappear and want to carry on, so we lose again.

All this has happened at glacial speed so a couple of months have gone by without results. At least I know what we're looking for now (70-100 square metres, open plan, one-year lease, downtown) and a price range.

I've called a bunch of real estate agents and either they don't call me back, or they don't deal with offices this small, or they don't deal with one-year leases, or they offer me places which are totally hopeless. Yet I know what I'm looking for exists, because I've already found it on my own (almost)!

I found some listings on Trademe but they're mostly unappealing or stale, or direct me to real estate agents who don't call me back.

Today I wandered around my target area and noted down all the buildings that seemed to contain the sort of small leased offices I'm looking for. There are lots, about 18 actually. But only a few of them had contact information on display, and despite my best Internet search efforts, I'm unable to locate building management for the others.

I wonder how this is supposed to work. Are there real estate agents who are actually helpful? Is there a listing service or classified ad network that everybody uses, that I just don't know about? Am I deluded about the availability of the sort of space I'm looking for? Or does the market have the liquidity of granite?

I'd rather be fixing frame construction bugs!

Wednesday, 24 October 2007

Abstraction Penalties, Stack Allocation And Ownership Types

Sometimes, cleaning up your code makes it slower even when it shouldn't. For example, you might extract code from a function and make it a separate function. Maybe the compiler should inline your code, but it doesn't, or it does but the result still isn't the same as the original single function. This is an abstraction penalty: you use the programming language to structure your code with additional abstractions, and you pay a performance penalty.

Grouping variables into structs or objects is a common source of abstraction penalties. For example, converting local variables nscoord x, y; to nsPoint pt; may prevent the coordinates from being placed in registers, in some compilers. Or the coordinates may need to be placed in memory if you pass an nsPoint to another function instead of two nscoords.

That aside, there's a much worse abstraction penalty associated with objects in common "safe" languages such as Java or C#: they require heap allocation and garbage collection. (I'm ignoring C# structs for now for reasons that will hopefully become obvious.) And despite what people will tell you, garbage collection is still slower than stack allocation, often by a lot. When I was doing serious Java programming a few years ago, the way to make Java code fast was still to avoid using objects or manually recycle them to avoid allocation and GC costs. And the problem is that a whole lot of language features are tied up with objects --- interfaces, casting, information hiding, code reuse. I suspect one of the biggest performance issues switching from C++ to Java or C# may be that in C++ you can stack-allocate objects, but in Java and C# you can't. Just look at all the places where Gecko uses nsAutoString or nsAutoTArray to good effect.

The problem is of course that in a safe language, you can't just hand out pointers to stack-allocated objects like candy, because someone might store one away in a dark place and then later, long after the method has returned, start using it as a dangling pointer. This cannot be allowed.

What the world needs is a type system that allows references to stack objects to be given out, but restricts their usage so that no-one can use them after the objects are destroyed. Fortunately many such systems exist. A whole family of "region calculi" have been developed, for example. However, I prefer "ownership types" as a more intuitive and generally useful framework; in particular I like Boyapati-style "explicitly parametric ownership types".

These work by adding an "ownership parameter" to each class type. These parameters are a lot like generic type parameters, but they denote the "owner" of the object. The basic rule is that an object cannot outlive its owner. This rule is enforced by the typechecker. Here's an example of how this can work:

class Vector<O@P> {
(O@P[])@this array;
int size;
Vector() { array = new (O@P[100])@this; }
void add(O@P o) { array[size++] = o; }
class Iterator<O@P, Q> where Q <= P {
Vector<O@P>@Q inner;
int i;
Iterator(Vector<O@P>@Q inner) { this.inner = inner; }
O@P next() { return inner.array[i++]; }
void f() {
C@method c1 = new C@method(), c2 = new C@method();
C@method c3 = g(c1, c2);
C@R g<R>(C@R c1, C@R c2) {
Vector@method v = new Vector@method<C@R>();
return new C@R();
void h<R,S>(Vector<C@R>@S v) where S <= R {
Iterator@method iter = new Iterator@method<C@R, S>(v);
while (C@R c = iter.next()) { ... }

This code creates two objects, puts them into a standard growable vector class, then passes the vector to another function which uses a standard iterator to iterate over the vector. What's cool is that the compiler knows a lot about the lifetimes of these objects. In particular the objects which are @method, i.e., owned by "the current method", cannot outlive the method --- even though we pass these objects to other methods and even store references to them in other objects.

These properties are enforced in two ways. The main constraint is that code can only refer to objects whose owners are in scope. For example, you cannot declare C@R glob; and set glob = c1; in g to "leak" a reference to c1 --- the correct R is not in scope at the global level.

Similarly, an object like Vector that takes ownership parameters can only take parameters that "outlive" the object itself. This guarantees that fields of the object cannot become dangling references. In some cases, such as Iterator, we need to add "outlives" constraints to its ownership parameters; Q <= P would be read "P outlives Q". The semantics of ownership mean that "P outlives method" for all in-scope ownership parameters P.

Given the code above, the compiler could, using only local reasoning,

  • Stack allocate c1, c2, v and iter
  • Allocate space for g's result in f's stack
  • Inline the array object directly into Vector --- the compiler knows that the array object cannot outlive its Vector, and because there is only one assignment, it cannot die before Vector either.

Therefore the above code requires no heap allocations at all. ("Inlining" objects into their containing objects is another important optimization, much like stack allocation, that common safe OO languages do not support.)

It's important to note that although a really good compiler could discover these optimizations in Java, for example, using escape analysis, there are major problems leaving it up to the compiler. The primary problem is that escape analysis is expensive and fragile and compilers don't do it very well. A secondary problem is that if you change your program so that escape analysis no longer works, the only way to tell is that your performance gets mysteriously worse. With explicit type constraints, you get a compile error instead. That's especially useful here because quite often when you create objects in a method you expect them to be method-local, and it would be great for the compiler to check that for you and give you an error if you're wrong (instead of generating code that's slow in Java or crashy in C++).

Another note is that although the annotation burden appears large, the code can be slimmed down considerably using sensible default ownership parameters and other shortcuts such as local type inference:

class Vector<O> { // O contains @P
O[] array; // fields default to @this
int size;
Vector() { array = new O[100]@this; }
void add(O o) { array[size++] = o; }
class Iterator<O, Q> where Q <= O@ {
Vector<O>@Q inner;
int i;
Iterator(Vector<O>@Q inner) { this.inner = inner; }
O next() { return inner.array[i++]; }
void f() {
// local objects default to @method
var c1 = new C(), c2 = new C();
var c3 = g(c1, c2);
C@R g(C@R c1, C@R c2) {
Vector v = new Vector<C@R>();
return new C@R();
// implicit type and ownership parameters
void h(Vector v) {
var iter = new Iterator(v); // infer type parameters
while (var c = iter.next()) { ... }

There are even more tricks we can pull! For example, we know a @method object can't be accessed by other threads, so we can easily remove locking from it. In a transactional memory system we would know not to log accesses to it. When a method containing a @method object returns, we know the object must die, so we can deterministically run a destructor at that point, giving us the handy C++ pattern of using destructors for general cleanup on method exit. And when an object is about to go out of scope, we can often easily prove that the method's reference to the object is the only outstanding reference ("uniqueness"). In that case we are allowed to safely change the owner of the object to something else (which is never normally allowed)! This may require copying, depending on what optimizations have been performed. For example:

C@R f<R>() {
var c1 = new C@method(), c2 = new C@method();
var c = ... ? c1 : c2;
// may copy the object from our stack to elsewhere
return (C@R)c;

Tragically, this stuff doesn't really exist yet in a usable language, nor has anyone proven it to work programming at large scales. Sigh...

Tuesday, 23 October 2007

Backing Store #2: Dependency Tracking

I had another lovely Labour Day weekend away with my family at the beach. Naturally my thoughts turned to browser engines and again I'll take the liberty of recording them here...

One of the reasons Web layout is difficult is that you have to handle incremental layout efficiently. That means whenever your layout code accesses style data (e.g., GetStylePosition()->GetOffsets()), you need to think about what happens if 'left' changes. How do you guarantee that the layout will be updated?

In Gecko, this is mostly handled using "style change hints". When style changes, we compare the old and new styles and produce a set of bits summarizing what needs to be changed. One bit indicates that the frame (i.e. layout object) associated with the element needs to be reflowed. When a style change finds that bit set, we mark the associated frame "dirty" and eventually a reflow will happen for that frame.

The problem with this approach (and other similar approaches that we use for incremental layout and rendering) is that it decouples the code that consumes style data from the code that causes updates when that style data changes. This is fragile. For example, if you add new code that reads style data during layout, you'd better make sure that the style change hints are set correctly for when that style changes (and this code will be in an entirely different place), otherwise you will have subtle incremental layout bugs. It's also inflexible because it forces a strict pattern of dependencies; the layout of a frame can depend only on the style of its own element; if you want to look at the style of other elements, that's just not supported and you'll have to hack around it with more fragile code.

It would be helpful to have a technique that let us write only on the code that consumes style data, automatically ensuring that sufficient updates are issued when that data changes. Actually we would want this technique to extend beyond style data to other data sources too.

(What I'm asking for is really a form of reactive programming. I unfortunately don't know much about research in that area, except that it seems to be mostly conducted in specialized domains with exotic languages such as Haskell and Esterel, so probably not of much use to us.)

Of course the hard part is performance; we'd want a solution that's as least as efficient as what we currently do. For example, a solution that creates a pile of explicit one-way constraints, one per dependence, would be hopeless.

So here's one idea. Define an Updater class representing a no-arg function that will cause a computation (say layout of an element) to be re-executed in response to a change in underlying data. Make style data getters take an Updater parameter:

class Updater {
virtual void Update() = 0;
class nsIFrame {
nsMargin GetStylePositionOffsets(Updater* updater);

The contract is that when the offsets change, the updater will be invoked. I've moved style data access from a style object directly to the frame, because changes to the frame's style need to be tracked, not changes to the style context itself (it's basically immutable in Gecko). GetStylePositionOffsets will record a set of updaters that depend on the offsets of this frame. The caller will do something like:
class ReflowUpdater {
ReflowUpdater(nsIFrame* frame) : mFrame(frame) {}
void Update() { mFrame->PresShell()->FrameNeedsReflow(mFrame); }
nsIFrame* Frame();
nsIFrame* frame;
nsBlockFrame::Reflow(...) {
ReflowUpdater* updater = new ReflowUpdater(this);
nsMargin offsets = GetStylePositionOffsets(updater);

Now it's impossible, or at least hard, to depend on style data without the dependency being recorded.

This would of course be appallingly inefficient, so how can we make it efficient? We can adapt Gecko's current approach. Let's make a rule that changing certain style properties, such as the offsets, will always force the frame to be reflowed. In that case we don't need to record the dynamic dependency. If we're a little bit clever we can avoid almost all run-time overhead. First let's index all the style properties with integers and then have a table that tells us if we need to reflow when that property changes:

enum nsStylePropID { ..., POSITION_OFFSETS, ... };
const PRPackedBool styleChangeForcesReflow[];

We delay realization of dynamic Updater objects until they're really necessary:
class Updater {
class Realized {
virtual void Updater() = 0;
virtual Realized* Realize() = 0;

Then we refactor the style getters:
class nsIFrame {
template <class U> nsMargin GetStylePositionOffsets(const U& updater) {
TrackDependency(updater, POSITION_OFFSETS);
return GetStylePosition()->GetOffsets();
void TrackDependency(const Updater& updater, nsStylePropID style) {
... add updater->Realize() to dynamic dependency storage ...
void TrackDependency(const ReflowUpdater& updater, nsStylePropID style) {
if (styleChangeForcesReflow[style] && updater->Frame() == this)
TrackDependency(static_cast<const Updater&>(updater), style);

Assuming the compiler can constant-fold the load of styleChangeForcesReflow[style], this becomes very low overhead. When properties are accessed in a scope where the compiler knows updater->Frame() == this, there is no overhead.

So what's gained? It's now much harder to screw up, and we support more complex dependencies (such as dynamically conditional dependencies, and dependence of one frame on another's style) along with static dependencies in a uniform API. The hard-coded dependencies are now just an optimization that we can easily meter and tune. Similar tricks can be used to track dependencies of painting and other activities.)

Other things you'd want to do when implementing this: support revocation of outstanding ReflowUpdater::Realized objects when a frame is actually reflowed; have the style change processing actually use the same styleChangeForcesReflow array to decide whether to reflow frames.

There is extra complexity in some places, so it's not a no-brainer. Just something to think about.

Saturday, 20 October 2007

A Tale Of Two Zooms

Daniel Glazman is concerned about the behaviour of our zoom implementation. In particular he's bothered by the fact that zooming can change the layout of a page.

This is intentional. There are actually two ways you can implement zooming:

  1. Just scale everything by the zoom factor when drawing, and leave the page layout unchanged. This is what Daniel wants.
  2. Try make the page look as good as possible when zooming, by changing the layout as necessary. This is what we have implemented.

One key difference between the approaches is the way text hinting is handled. When you draw text at a certain size, a good font engine will "hint" the glyphs, e.g. make sure that the vertical lines of an "n" are aligned on pixel boundaries, and this can change glyph metrics. For example, the width of a hinted "n" in 20pt is not necessarily twice the width of "n" in 10pt. So, if we want text to look its best, we must allow the advance width of text to change non-linearly when we change the zoom factor, and that can change layout in arbitrary ways. If we insist on preserving layout, we have to have to use unnatural glyph advance widths, which can't look as good.

Another difference is the handling of native widgets. They may not look good or work well when scaled to odd sizes, so we might not scale them or scale them differently, if we're free to modify the layout.

Another difference is how the available width is handled. If you insist on preserving layout when zooming in, you will almost always force the user to scroll horizontally to see all the content, or when zooming out in a desktop browser, the full width of the window will not be used.

Furthermore as a practical matter we currently have no foolproof way to scale rendering of plugins without actually changing their size, so we can't implement approach 1 for plugins.

For all these reasons I think exposing the second zooming approach in the Firefox 3 UI is the right way to go.

However, Daniel is absolutely right to point out that this is not what you want for a "zoom out and pan" UI in a mobile browser --- or a "magnifying glass" UI, for that matter! For those, it is important to preserve layout. Fortunately, implementing them is easy in principle, now that we're based on cairo; just scale the transform when drawing. There are a few problems, such as handling plugin rendering as I mentioned, but those can be hacked around or fixed. Some work would need to be done to expose this zoom mode as an additional prescontext/presshell API. I'm sure that in the future we'll offer both modes (to embedders and XUL apps, if not to Firefox users). (You can actually mock up this approach today by embedding an <iframe> in an SVG <foreignobject> and scaling that, although there are some problems.)

Monday, 15 October 2007

Exploring Auckland

I've developed a habit of taking our family off on a whim to explore some park or reserve in Auckland because it looks interesting on the map. Yesterday we explored parts of Greenhithe. The primary goal was to check out Taihinu Historical Reserve, which should have a view across Hellyer's Creek to the shores of Beachhaven where I grew up. I found the head of the track down to the reserve (which someone has obscured with a makeshift wire fence, grrr), but the track was a bit steep, overgrown and too wet to tackle yesterday. We'll come back at a more suitable time.

Then we visited Marae Reserve at the end of Marae Road. My map suggests it has tracks heading north and south along the beach but as far as I can tell they don't exist, it's just a clifftop lookout with a nice view over Herald Island. Next stop, a shoreline reserve at the end of Kingfisher Grove. This has a nice view up Lucas Creek, but again, tracks shown on my map don't exist.

So we went back along Roland Road and found a walkway that cuts across a small creek to Churchouse Road. It's in a reserve used to graze pet sheep, and a few were gamboling around. Also backing on to it was someone's paddock with a large pig and some chickens. The kids enjoyed watching the animals and there were a lot of blooming lilies to enjoy too. At Churchouse Road the walkway ends at a playground in Wainoni Park. This playground is quite interesting --- including a maypole, a tractor, a very wide slide, a huge roundabout, and an installation of channels and gates for playing with flowing water. Most intriguing is a set of large plastic cups on stems set in the ground at an angle of about thirty degrees from the vertical. You can sit in them and spin around. The intriguing part is that if you just set yourself spinning gently then through no apparent effort, the spinning seems to speed up over time! It's a very convincing illusion of violating conservation of energy. I think that the secret is that because of the incline, as you spin you unconsciously try to straighten yourself vertically, and this motion drives the spinning somehow. Anyway, definitely worth checking out! Be careful with your kids though, because the spinning can be a runaway process...

Friday, 12 October 2007

Welcome To My Backing Store

In an earlier post I raised the question of whether a truly radical redesign of the Gecko frame system could be developed incrementally (which is almost the same as asking whether it can be developed at all). I think I figured out how that could be done.

What I would do is bolt on the new layout subsystem in parallel with the old one. So basically as well as running the frame constructor to build and update frame trees in response to DOM changes, and restyling and reflowing those trees, we could also have another DOM listener that builds and maintains some other kind of structure, and hooks to run layout in that structure. Initially all rendering and other uses of layout would continue using the frame tree. With this approach, we'd always have a working browser, and it would be possible to gradually extend the second layout engine to handle more and more element and styles, and to measure its space and time performance and compare them to the existing layout, and even to check its results for consistency with the existing layout. Then at some point we'd make it possible to render using the new layout, possibly even selectable at run time. Then we'd convert over other users of the frame tree and finally stop buildling it. This approach also has the advantage that the riskiest and most uncertain work --- how easy is it to fit the requirements of real-world Web layout into the new model, and how does it perform --- is tackled first.

I'm not saying we should do anything like this. I'm mainly leaving a note for myself in case this becomes important later.

Thursday, 11 October 2007

One More Reason Why We Need Audio (And Video) In The Browser

Via Slashdot

But the content experience on the Web is crap. Go to Aquarium Drunkard, click an MP3. If you don’t get a 404, you’ll get a Save As… dialog or the SAME GOD DAMN QUICKTIME BAR FROM 1995. OMFG. ARE YOU KIDDING ME? THIS IS ALL WE’VE ACCOMPLISHED IN 15 YEARS ON THE WEB? It makes me insane.

So we have media consumption experiences with no context (desktop media players) and an incredible, endless, emergent contextual experience where media consumption is a pain in the ass, illegal, or non-existent (the Web). FIX IT. Your fans are pouring their music-loving hearts into blogs, Wikipedia, etc and what tools have you given them to work with? Not much, unfortunately.

Tuesday, 9 October 2007

Fish, Water

Surprisingly, I will be giving a talk tomorrow at the Auckland University Bioengineering Institute, titled "Small, Hot And Parallel: The Future Of Web Browsing (And Computing)".

Since I don't know anything about biology, I'll talk about Web browsers instead. I'm planning to borrow shamelessly from Ras & Co's slides (with permission) to speculate about how one might build an awesome browser for small devices given the physical constraints of both humans and computers. Should be fun.

Monday, 8 October 2007

Tabula Fracta

When considering massive changes to existing code, to cope with fundamental new requirements, it's appropriate to think about whether there is an opportunity to do something completely new. Joel Spolsky isn't always right.

As I mentioned before I think there is an opportunity for someone to create a new browser engine and possibly win. Here's what they could do:

  • Target 8 cores or more --- i.e., pay no attention to performance on fewer cores. Existing engines can't afford to do this, so there's a market niche that they can't enter that will eventually grow to become the entire market --- a classic disruptive play.
  • Leverage the incredible work of the WHATWG on documenting HTML parsing, DOM APIs, etc, to really design code based on specifications that are close to Web-compatible, instead of designing something and having to hack it a hundred different ways to get Web compatibility. Where the WHATWG hasn't covered a topic yet, help them out. Existing engines couldn't do this because they predate the WHATWG work.
  • Use a new programming language with the following features:

    • Memory safety
    • Non-sucky exceptions
    • Real modules
    • UTF8 strings with an iterator API
    • Real global optimization with profile-driven, whole-program, ahead-of-time compilation
    • Immutable types
    • Generators and other syntactic sugar du jour
    • Garbage collection, of course
    • Safe manual memory management using ownership types (I'll write more about this later)
    • Copy-on-write cloning
    • Transactional memory and/or other modern parallel programming primitives
    • Typed message passing (messages should be typed for the same reasons as functions; see Microsoft's Singularity)
    • No crappy, hard-to-optimize, but inexplicably popular features such as identity and synchronization primitives on every object
    • Pluggable runtime, so at compile time you can choose what ABI, GC, etc you want to use ... so you can build components that will play nice in a variety of environments

    This is worth doing because if you're going to make a clean break, you should take the opportunity to make things so much better than C++. Memory safety alone would be a giant step forward for security (and many of the other items are ways to claw back the performance hit of memory safety).

While I think this approach might be a winner, it's still an extremely hard and risky road. Beyond the Joel risks, the biggest problem is the Web compatibility catch-22: you won't really be Web-compatible until you have enough market share that Web developers test against your engine, but you can't get that market share without being Web-compatible. (I suspect this is almost the entire reason why Opera hasn't been a runaway success.) To break out, you either have to work incredibly hard and be astonishingly compelling, or you have to be backed by some organization with great market power. Having said that, WHATWG and Mozilla are working to erode this catch-22, with some success, so things might be easier now than they used to be.

Clearly this approach would be much too risky for an established player to bother with. I would love to see a startup working in this area, though.

Update I should add to the above language requirements list:

    • Natural and efficient mapping to commodity hardware (sorry Haskell fans)

Changing Horses

Matt at Allpeers posted about the pros and cons of Mozilla switching from Gecko to Webkit.

This is something that people in Mozilla have discussed, multiple times, because the pros are quite clear. We get feedback from many channels that Webkit is cleaner and a lot easier to hack on, and I believe that; its code footprint is also clearly better. Beyond that, open source Web browsers and the Web itself could make progress much faster if we were pulling together on the same code base instead of doing everything twice. It would also reduce compatibility pain for Web developers.

The cons are also quite clear, however. We really really really don't want to leave XUL developers --- that is, developers of Firefox addons and XULrunner applications, not to mention our own Firefox and Thunderbird developers --- in the lurch. So XUL and attendant technologies (XBL(2), some XPCOM glue) would have to be ported. Generally any Gecko features that Webkit doesn't have would have to be ported. Considerably bug-compatibility would be inevitably lost.

It's not clear how project management would work. Apple probably doesn't want all that stuff in their tree. Anyway, it would be unsafe for the Web to put all our eggs in their basket.

Some people argue that having two open source Web engines is good because it avoids the problems of monoculture. It can also help in standards bodies to have multiple interoperating implementations of a feature. There's some truth to it, although I think I'd rather have one awesome open source implementation than two less awesome ones. (I think in the long run, a single dominant open source system would be a healthier, more monopoly-resistant software infrastructure than several competing implementations unified via fractious standards bodies. I also think that the split between KDE and GNOME is about the worst thing that ever happened to the free software world.)

The biggest problem, however, is getting from here to there. Mozilla can't leave the market for a few years to attempt a risky transition plan. That would be bad for us and bad for the Web. We have to keep improving Gecko to get maximum leverage from our existing market share, mindshare, and developer share. Also, as I suggested in my second-to-last post, Webkit is not the epitome of engines. It would suck to spend all our resources getting there and then discover we should have gone in another direction.

None of this rules out Mozilla --- or a third party --- doing a XUL-on-Webkit project on the side! I actually think it could be terrific territory for a startup ... far more worthwhile than yet another buzzy social networking site. Or, dare I say it, Flock.

I also think we should keep working on converging the XUL universe with the Web. That means projects like standardizing flex-box layout as a CSS3 module, implementing XBL2, implementing WHATWG drag-drop and bringing it to XUL, ditto for WHATWG SQL APIs, getting missing XUL widgets into HTML5, perhaps creating better standardized ways to bind native code to Javascript, WHATWG APIs for Webrunner-style desktop application management, and more. This would be good for the Web and eventually make it easier to port XUL apps to another engine.

Lastly, let me deflect the charge of "Not Invented Here"ism by stating for the record that I have no great love for Gecko. I did not invent it, I think it has major flaws, and I would be most happy if it vanished and was replaced with something superior in every way. For the forseeable future, that's not an option.

Tragic Vindication

Sadly, I was right; the All Blacks lost to France. The details of my prediction were a little off, but the irrational reactions I expected are appearing now.

The core irrationality is the belief that if you lose a close rugby game, there must be some fundamental reason, some fundamental failure that could have and should have been prevented. This ignores role of chance and circumstances beyond the team's control, which is clearly huge. Refereeing decisions alone can account for a large margin in almost every game. If success requires eliminating the possibility of losing, then a truly successful team would have to win almost every game by 20 points. No team in rugby has ever been that good.

I hope that NZ rugby fans put this in perspective and satisfy themselves with crushing all and sundry between World Cups. Some people also need to get it into perspective with life in general.

A more general lesson: sometimes you can do everything right and you still don't win. Corollary: if you don't win, that doesn't mean you did something wrong ... and you shouldn't panic and make changes unnecessarily. I hear that this is a lesson that poker players need to learn early.

Sunday, 7 October 2007

If I Did It

I think Gecko's frame model is a big mistake.

For those just tuning in, Gecko builds a tree of nsIFrame objects for each document. Each frame represents one CSS box, a rectangular area which renders some content. A single DOM node can generate multiple CSS boxes when it breaks. For example, an inline element or text node can break across multiple lines, creating multiple boxes, and hence multiple nsIFrames in Gecko. The frames for a single DOM node are linked into a doubly-linked list, the "continuation chain".

Continuations are the root of a lot of problems. Changes in layout require changes to the frame tree; frames must be created and destroyed, pointers updated, etc. This requires a lot of code, and it's fragile, lots of bugs leading to memory-safety violations. It also consumes quite a lot of memory. For example suppose you have K nested spans wrapping a large chunk of text, which get laid out across N lines. We create K*N frames, and each one's about 50 bytes. Creating and destroying these frames is probably slow --- it's hard to measure the performance cost of
the continuation model, though.

Continuations are, in principle, good for complicated layout like pagination, columns and advanced layouts like "fill text into this box here, and then when that's full, this other box over here". But over the years, as we've worked out the details, it's become clear that in a CSS context there are grave problems. For example, with vertical breaking and elements with a specified CSS height whose children overflow that height, you run into situations where an element's children have more continuations than the element itself, and so you have no natural parent for the children's continuations. All the options are bad; lately in Gecko fantasai has created "overflow continuations", ghostly additional continuations for the element just to host its overflowing children. It works surprisingly well but nevertheless, the continuation model is leading us down an increasingly complex and disturbing road. And we will need to add still more complexity to fully handle breaking of positioned elements.

Let's take a step back and ask the question, "what's the minimal information that we really need to compute and store during layout, that will let us efficiently paint any section of a page (or handle events or compute offsets requested by script)?" For discussion's sake let's focus on blocks containing inlines.

Here's one answer: all we really need are the line breaks and some information about the y geometry of each line. A line break can be stored as a pair (DOM node, (optional) text offset). For each line we could store the y-offset of its baseline, its ascent and descent, and occasionally an "overflow area" containing the area rendered by all its descendant content if that sticks out beyond the normal line bounds.

In most cases we wouldn't need any kind of layout object for the inlines themselves! From the line information we can compute everything we need easily enough, except for complex stuff like inline-block/inline-table and replaced elements. This could really slash memory usage. It could also speed up layout and even simplify code. (By the way, Webkit doesn't use continuations like ours --- a good
move --- but they do have one render-object per element, so I think it can be improved on.)

Of course lots of code (e.g. painting) needs to inspect the geometry of actual CSS boxes. I'd provide for that by exposing a "box iterator" API that lets clients walk a
hypothetical CSS box tree.

One nifty thing this approach would allow is fairly easy implementation of vertical
writing: the box iterator could transform its output boxes to the correct orientation.
Relative positioning could also be done in the iterator.

Vertical breaking (e.g. columns and pages) could be handled in a similar way, perhaps, by storing the DOM offsets of vertical breaks. So if you have a block that breaks across pages you only have one layout object for the block, but when you ask for the CSS box subtree for the block, you specify the DOM boundaries of the page you're interested in and you get just the boxes for the content in that page.

This isn't a real proposal. There are huge questions about how all the rest of layout could be expressed in this model, huge questions about what happens to frame construction, how to cache things so that painting isn't slow, and so on. There are even huger questions about how even if this was designed, it could be implemented incrementally in Gecko or any other context. It's mainly my daydreaming. But I think there is a point here worth taking, which is that there's probably considerable headroom to improve the design of Gecko, and Webkit too.

Friday, 5 October 2007


David Baron and I tried a new approach to layout triage, using a Google Docs spreadsheet to record our decisions collaboratively. I added some automation steps so the overall workflow goes like this:

  • Use the Bugzilla query to locate bugs which have been requested as 1.9 blockers.
  • Download the query results in CSV format.
  • Use a trivial Perl script to trim off columns I don't want and add a new column which is the Bugzilla URI for each bug.
  • Import the CSV into a new Google Docs spreadsheet.
  • Add columns in which David and I can each write our evaluations.
  • Invite David and Damon as collaborators.
  • David and I read the bugs (with the help of the URIs, which Google Docs makes clickable) and add evaluations to the spreadsheet (the Real Work).
  • Discuss any differing evaluations until we converge on a decision. Mark decisions by bolding the evaluation we settle on.
  • Export the spreadsheet to HTML and save it locally.
  • Run another Perl script to identify the bolded decision for each bug and generate a Coscripter script to update all the Bugzilla bugs!
  • Run the Coscripter script.
  • Go to lunch.
  • Return from lunch, discover that my laptop went to sleep while running the script, and restart it from the last not-updated bug.

I'm not sure whether this saved much time, in the end, compared to just exchanging comments via email and marking bugs manually, but it was a lot more fun! (And marking bugs manually is very very tedious, at least when you're on a different continent to Bugzilla.)

Also, during this exercise I played around with Google Docs a little bit and discovered that their charting tools use SVG! Excellent!

Here's the Coscripter-generation-script, by the way:

#!/usr/bin/perl -w
sub resolve {
my ($bug, $resolution) = @_;
if ($resolution eq "blocking") {
print "* go to \"$bug\"
* select \"+\" from the \"blocking1.9\"'s \"blocking1.9\" listbox
* click the \"Commit\" button\n";
} elsif ($resolution eq "wanted") {
print "* go to \"$bug\"
* select \"-\" from the \"blocking1.9\"'s \"blocking1.9\" listbox
* enter \"[wanted-1.9]\" into the \"W\" textbox
* click the \"Commit\" button\n";
} elsif ($resolution eq "minus") {
print "* go to \"$bug\"
* select \"-\" from the \"blocking1.9\"'s \"blocking1.9\" listbox
* click the \"Commit\" button\n";
my $bug;
while (<>) {
foreach my $cell (split(/<td/)) {
if ($cell =~ m!(http://.*)!) {
$bug = $1;
if ($cell =~ m!class='g s4'>(\w+)!) {
&resolve($bug, $1);