Saturday 6 October 2007
I think Gecko's frame model is a big mistake.
For those just tuning in, Gecko builds a tree of nsIFrame objects for each document. Each frame represents one CSS box, a rectangular area which renders some content. A single DOM node can generate multiple CSS boxes when it breaks. For example, an inline element or text node can break across multiple lines, creating multiple boxes, and hence multiple nsIFrames in Gecko. The frames for a single DOM node are linked into a doubly-linked list, the "continuation chain".
Continuations are the root of a lot of problems. Changes in layout require changes to the frame tree; frames must be created and destroyed, pointers updated, etc. This requires a lot of code, and it's fragile, lots of bugs leading to memory-safety violations. It also consumes quite a lot of memory. For example suppose you have K nested spans wrapping a large chunk of text, which get laid out across N lines. We create K*N frames, and each one's about 50 bytes. Creating and destroying these frames is probably slow --- it's hard to measure the performance cost of
the continuation model, though.
Continuations are, in principle, good for complicated layout like pagination, columns and advanced layouts like "fill text into this box here, and then when that's full, this other box over here". But over the years, as we've worked out the details, it's become clear that in a CSS context there are grave problems. For example, with vertical breaking and elements with a specified CSS height whose children overflow that height, you run into situations where an element's children have more continuations than the element itself, and so you have no natural parent for the children's continuations. All the options are bad; lately in Gecko fantasai has created "overflow continuations", ghostly additional continuations for the element just to host its overflowing children. It works surprisingly well but nevertheless, the continuation model is leading us down an increasingly complex and disturbing road. And we will need to add still more complexity to fully handle breaking of positioned elements.
Let's take a step back and ask the question, "what's the minimal information that we really need to compute and store during layout, that will let us efficiently paint any section of a page (or handle events or compute offsets requested by script)?" For discussion's sake let's focus on blocks containing inlines.
Here's one answer: all we really need are the line breaks and some information about the y geometry of each line. A line break can be stored as a pair (DOM node, (optional) text offset). For each line we could store the y-offset of its baseline, its ascent and descent, and occasionally an "overflow area" containing the area rendered by all its descendant content if that sticks out beyond the normal line bounds.
In most cases we wouldn't need any kind of layout object for the inlines themselves! From the line information we can compute everything we need easily enough, except for complex stuff like inline-block/inline-table and replaced elements. This could really slash memory usage. It could also speed up layout and even simplify code. (By the way, Webkit doesn't use continuations like ours --- a good
move --- but they do have one render-object per element, so I think it can be improved on.)
Of course lots of code (e.g. painting) needs to inspect the geometry of actual CSS boxes. I'd provide for that by exposing a "box iterator" API that lets clients walk a
hypothetical CSS box tree.
One nifty thing this approach would allow is fairly easy implementation of vertical
writing: the box iterator could transform its output boxes to the correct orientation.
Relative positioning could also be done in the iterator.
Vertical breaking (e.g. columns and pages) could be handled in a similar way, perhaps, by storing the DOM offsets of vertical breaks. So if you have a block that breaks across pages you only have one layout object for the block, but when you ask for the CSS box subtree for the block, you specify the DOM boundaries of the page you're interested in and you get just the boxes for the content in that page.
This isn't a real proposal. There are huge questions about how all the rest of layout could be expressed in this model, huge questions about what happens to frame construction, how to cache things so that painting isn't slow, and so on. There are even huger questions about how even if this was designed, it could be implemented incrementally in Gecko or any other context. It's mainly my daydreaming. But I think there is a point here worth taking, which is that there's probably considerable headroom to improve the design of Gecko, and Webkit too.