Eyes Above The Waves

Robert O'Callahan. Christian. Repatriate Kiwi. Hacker.

Thursday 15 November 2007

Pegs, Holes And Reflow

I already blogged about Gecko's frame continuations being a mistake. It's time to rant about another big layout design error: the uniformity of the Reflow() API.

During layout, frames (i.e. CSS boxes) call Reflow to lay out their children and then position those children in some manner. The core problem is that Reflow's signature is the same for every frame type. That means that in theory, table frames have to interact with their row(group) frames in exactly the same way that inline frames have to interact with their inline children, or XUL boxes have to interact with their box children, etc. But this is nonsense, since the invariants, dependencies and data that need to cross the parent/child boundary are entirely different in each of these cases.

So in practice we have various kinds of magic probing and/or magic fields of structures that are only used in certain cases. This inevitably leads to breakage and overcomplex code. Problems are particularly acute in line layout, where we really would like to work with a flattened list of inline leaf items, but thanks to Reflow we can't. The textrun work has moved us to a situation where we construct a flattened representation of at least the text of a paragraph for line breaking and text shaping, and then we pull data from that in a hokey way as we Reflow text frames. It's somewhat painful as the two traversal models fight each other in nsLineLayout. Of course the aforementioned frame continuations crash the party too.

What we really should do is admit that parent-child frame type pairs are actually tightly constrained, i.e. an inline frame can only have an inline parent or a block parent, a table row frame can only have a table rowgroup parent which can only have a table parent, etc. (These invariants already exist actually, although the code sometimes doesn't want to admit it.) Then we should toss Reflow over the side and give each frame type its own specialized layout interface. Only a few frame types (e.g. blocks and tables) would be able to occur in "unknown" contexts and offer a generic layout interface.

In fact, we could go further and specialize further the types of child frames or even get rid of a uniform nsIFrame tree altogether. There's no particular reason why it makes sense to have tables, table rowgroups, table rows and table cells all be the same kind of object as text frames or SVG frames. Sure, we want to be able to iterate over CSS box tree geometry and style somehow, but there's a whole lot of generality exposed in nsIFrame that doesn't really make sense everywhere.

As an aside, the conflict between textrun layout and inline layout is an example of a general problem I've seen with cleaning up code. You rework module A to make it nice and clean, but because modules B and C are quirky, you have to add complexity to module A to keep things working and shippable. Eventually you rework B and C too but A's already been contaminated, so B and C inherit some of the sins of their fathers. You can reduce the effect by enlarging the scope of rework, but that adds risk and schedule issues of course --- in the limit you rewrite everything and sink your entire project.


I sometimes wonder whether XUL's approach of having the layout algorithm live outside the boxes is the way to go. If we did it right, for example, that would allow this markup to work per spec:
<div style="display: table-row">
<span style="display: table-cell">
An image:
<img style="diplay: table-cell">
Right now it doesn't work because table rows (and tables in general) assume they can cast the relevant frames to nsTableCellFrame...
Robert O'Callahan
That's a slightly different problem I think --- the problem that frame types are overloaded for different uses. We use them to control external layout behaviour (CSS3 display-role) and also internal layout behaviour (CSS3 display-model), and to control replaced-element behaviour (which is akin to display-model). And of course we also tie in other things like abs-pos-container-ness (to nsBlockFrame). When we need extra behaviours we create wrapper frame types like nsHTMLScrollFrame and nsColumnSetFrame, but wrapper frames have their own problems (like insertion point confusion).
Okay now I'm confused. Is the following bug 367576 due to the original problem in this post, or the one about nsTableCellFrame in the first comment?
<table rules="rows">
<caption>This table has rules="rows":</caption>
<tr><td>Normal cell</td><td rowspan="2">This cell has rowspan="2"</td></tr>
<tr><td>Normal cell</td></tr>
<tr><td style="border-bottom: 16px solid black">border-bottom: 16px solid black</td><td style="border-top: 16px solid black" rowspan="3">This cell has rowspan="3" and<br>border-top: 16px solid black</td></tr>
tr><td>Normal cell</td>
<td style="border-bottom: 2px solid black" rowspan="2">This cell is rowspan="2" and<br>border-bottom: 2px solid black</td>
<td>Normal cell</td>
The cell border from the first column escapes into the inner part of the second column cell.
I ran the above, with the typo correction and want to let you know that I believe it to be a Redraw/Reflow problem. Slide something over it and when you uncover it, it's corrected. I checked the bug and found no mention of this, so I hope it helps.