Tuesday, 24 June 2008

Some 1.9.1 Updates

Over the last couple of weeks I've done a variety of fun things. I've got patches for the remaining Acid3 layout bugs (excluding CSS @font-face, which isn't really a layout issue). There were actually only four layout bugs affecting the rendering of Acid3. One was to support text-shadow, which landed a while ago (I'll post Michael Ventnor's guest blog about that soon). Another one was to allow floats that aren't at the start of a line to be placed on that line instead of always being pushed to the next line; I took this one on a few weeks ago because I thought it would be scary, but it turned out to be easy now thanks to the big changes David Baron made in 1.9, so the fix for that landed a couple of weeks ago.

Another Acid3 bug is that absolutely positioned elements with no positioned ancestor element should be positioned relative to the initial containing block, but in Gecko 1.9 they're positioned relative to the padding edge of the root element. I have a nice fix for this that also enables positioning of the root element. It hasn't landed yet, parts still need review.

The last Acid3 layout bug is that Gecko 1.9 follows the CSS2 restrictions on generated content (::before/::after), which were lifted in CSS2.1. In particular we don't allow generated content to be positioned. We also don't allow it to be floated, to use any 'display' type other than block or inline, to use overflow:auto, to use columns, etc. So I have a patch that reworks generated content to support all of those things by treating it much more like normal content. It actually simplifies the code overall, IMHO. This still needs review before it can land.

I've started cutting up my SVG-CSS-integration branch and submitting pieces for review. One part that's in the process of landing is infrastructure for tracking which element is associated with a given ID. This is important in SVG; if, say, an SVG <use> element references URI "#foo", it will need to be updated every time the canonical "element with ID 'foo'" changes, and currently we fail to do that. For example, that would happen if an element with id='foo' is inserted into the document before the element that is currently the canonical 'foo' element, or if the element with id='foo' gets its ID changed to 'bar', or if the element with id='foo' is removed from the document, or if an element earlier in the document gets its 'id' set to 'foo', etc. This isn't actually that hard, since to make getElementById fast we already have a table that maps IDs to elements, so we just need to extend that so if an entry for a given ID changes, we notify observers associated with that ID.

However, when I implemented this I discovered that we actually have three implementations of getElementById! One is in nsHTMLDocument (used for HTML/XHTML documents), and is reasonably clean and does what you expect. One is in nsXMLDocument, and is completely stupid and uses no hashtable at all. The other is in nsXULDocument, and is rather insane. It uses its own very ugly hashtable-ish structure and stores not only elements by 'id', but also elements by 'ref'.

So in preliminary patches, which I actually landed today, I hoist the good implementation of getElementById from nsHTMLDocument to nsDocument and get rid of the dumb one in nsXMLDocument. I also force nsXULDocument to use the shared table for elements indexed by 'id', and give it a separate table of elements indexed by 'ref' so that getElementById can continue to do its crazy thing for XUL documents. Maybe one day we can get rid of that. Anyway, at least things are getting better.
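A simplified sketch of the observer scheme described above might look like this in JavaScript (the real code is C++ inside nsDocument; all names here are invented for illustration):

```javascript
// Hypothetical sketch of an ID-to-element table with observers.
// Gecko's actual implementation is C++; this just illustrates the idea:
// the table that makes getElementById fast also notifies anyone who
// cares when the canonical element for an ID changes.
class IdTable {
  constructor() {
    this.entries = new Map();   // id -> canonical element
    this.observers = new Map(); // id -> Set of callbacks
  }
  // Called whenever the canonical element for 'id' changes: an element
  // is inserted/removed, or its 'id' attribute is rewritten.
  setElement(id, element) {
    const old = this.entries.get(id);
    if (old === element) return;
    if (element === undefined) this.entries.delete(id);
    else this.entries.set(id, element);
    for (const cb of this.observers.get(id) || []) cb(old, element);
  }
  getElementById(id) {
    return this.entries.get(id);
  }
  // e.g. an SVG <use> referencing "#foo" registers here so it can
  // re-render when the canonical 'foo' element changes.
  addObserver(id, callback) {
    if (!this.observers.has(id)) this.observers.set(id, new Set());
    this.observers.get(id).add(callback);
  }
}
```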

In the last couple of days I decided to implement white-space:pre-line since it's one of the last bits of CSS 2.1 we don't implement and it's easy to do on top of the improved text subsystem we have in 1.9. The patch for that is up for review too.
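Roughly, pre-line preserves forced line breaks but collapses other runs of whitespace. A toy JavaScript model of that processing (a simplification of the CSS 2.1 rules, not the actual Gecko implementation):

```javascript
// Toy model of white-space:pre-line text processing: newlines are
// preserved as forced breaks, other whitespace runs collapse to a
// single space, and spaces adjacent to a preserved newline are removed.
// This is a simplification of the CSS 2.1 white-space rules.
function preLine(text) {
  return text
    .split("\n")
    .map(line => line.replace(/[ \t]+/g, " ").trim())
    .join("\n");
}
```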

I'd quite like to implement text-overflow, but it turns out to be complicated to spec --- there's a reason why it's not in the CSS3 Text draft :-). There are several incompatibilities in the way browsers have implemented it today; for example, only IE7 makes an attempt to handle bidi in a reasonable way. There are several other issues that need to be tested and resolved. The bug contains gory details.

Thursday, 19 June 2008

Hits And Misses

It feels great to bask in the somewhat erratic but generally glowing coverage of the Firefox 3 release. This Forbes article captures a big part of the reason why I do this work:

Firefox has become one of the most important pieces of software around today as consumers shift from using their PCs to run applications living on their hard drives to a communications device able to connect with applications living on distant servers.

I do have to laugh at this Fox News tidbit though:

... you could say the several hundred engineers working on Firefox have been busy. And their work has paid off.

I'm glad it looks like the work of several hundred engineers, and I guess if you count up every single person who contributed the tiniest patch over the last few years it might be several hundred, but that would be a very misleading picture. Mozilla Corporation's Gecko team is currently fewer than forty full-time developers, and that's after rapid recent growth --- over most of the last five years it was more like ten to twenty. Non-MoCo contributors are great but their efforts combined add no more than 50%. I don't know the size of the front-end team but it's smaller than the Gecko team.

Normalizing for project scope, I think we're about the same size as Apple's Webkit team. I get the impression that the IE team is much larger than us at this point. (An officemate suggests that the reason they send us cakes is because we keep them in their jobs.) Not sure what Opera's development teams are like but Opera has about three times the total full-time employees as Mozilla today, and the ratio must have been much greater a few years ago.

I wanted to say that because I think a lot of people see Mozilla as a behemoth --- I hope because we punch above our weight in the market --- but we really aren't, far from it.

Wednesday, 18 June 2008

Eager Backouts

David Baron has argued that if someone checks in a patch and we later discover a serious regression from it that was not caught by tests, we should behave the same as if there had been tests for that regression before the patch was checked in.

The logic is compelling and I think I mostly agree. My only caveat is that once a patch has been landed for a while, the cost of backing it out --- due to other changes that have built on the patch --- may rise to make this impractical. But for most changes, big and small, there's a reasonably large window where that won't be the case.

In practice it would mean backing out big patches a lot more often. That's probably a good thing, if we take seriously the goal of having the trunk remain in a shippable state at all times. I think it also lowers risk quite a bit ... compare these two histories:

  • Check in A
  • Back out A
  • Check in A+B
  • Back out A+B
  • Check in A+B+C

versus:

  • Check in A
  • Check in B
  • Check in C

With the backout approach we would be much less likely to get into sticky situations where we don't know what has to be backed out to get to a good place. And because we periodically get the trunk back into a good state, it would be easier to distinguish the effects of A+B+C development from other ongoing trunk development.

Distributed version control makes this approach more practical because it's now much easier to maintain A+B+C on a local branch until it finally sticks.

People may argue that with long-lived experimental branches, tryserver builds, and improving test infrastructure, we won't have belatedly-discovered regressions, but I'm not that optimistic :-).

Advanced Topics In Computer Science #2: Rectangles

A while ago I wrote about the deep computer science problems of strings. Now I need to talk about another poorly understood data structure: rectangles. There are a couple of issues.

x1,y1,x2,y2 vs x,y,width,height

Windows uses the former; most other libraries that I know of use the latter. If you mostly work with coordinates relative to the rectangle origin, then you'll probably prefer width+height, but otherwise I actually think Windows got this right! A lot of rectangle operations, such as union, intersection, and point containment, are simpler in the x2+y2 format.
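To illustrate, here's a hypothetical sketch in JavaScript (helper names invented): in the x1,y1,x2,y2 form, intersection and containment are just min/max comparisons, while the x,y,width,height form keeps recomputing x + width and y + height:

```javascript
// x1,y1,x2,y2 form: intersection and containment are pure min/max.
function intersect(a, b) {
  return {
    x1: Math.max(a.x1, b.x1), y1: Math.max(a.y1, b.y1),
    x2: Math.min(a.x2, b.x2), y2: Math.min(a.y2, b.y2),
  };
}

function contains(r, px, py) {
  return px >= r.x1 && px < r.x2 && py >= r.y1 && py < r.y2;
}

// x,y,width,height form: the same operation must keep converting
// to edges (x + width, y + height) and back.
function intersectWH(a, b) {
  const x = Math.max(a.x, b.x), y = Math.max(a.y, b.y);
  return {
    x, y,
    width: Math.min(a.x + a.width, b.x + b.width) - x,
    height: Math.min(a.y + a.height, b.y + b.height) - y,
  };
}
```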

Union of empty rectangles

Should the 'union' operation ignore empty rectangles or include them in the calculations? For example, what is the union of (100,100)-(100,200) with (100,100)-(200,100)? It depends on the use-case and is quite subtle. When the rectangle represents a set of pixels, for example a damage area that needs to be repainted, then you want 'union' to ignore empty rectangles, so the result in my example could be any empty rectangle. But when the rectangle represents the bounds of a set of points, then you want to return a non-empty rectangle in this case --- (100,100)-(200,200). (The difference between a pixel and a point is that a pixel has area, usually a 1x1 square, but a point is a mathematical point of zero size.) An example of this latter usage is when you're computing the bounds of a set of CSS boxes in order to determine how big a container must be to include them all.

So we want two forms of 'union' operator, or perhaps better still we should have two entirely different data structures for these two tasks. For pixel-set rectangles, there is only one logical "empty rectangle" --- all rectangles with zero width or height should be considered equal. For point-set rectangles, even if the width is zero the height and top-left can still be significant, and vice versa for height. This suggests we should separate the concepts.
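A sketch of the two 'union' flavors (hypothetical helper names, using the x1,y1,x2,y2 form):

```javascript
// Hypothetical sketch of the two 'union' flavors discussed above.
function isEmpty(r) {
  return r.x2 <= r.x1 || r.y2 <= r.y1;
}

// Pixel-set union: an empty rectangle contributes no pixels, so skip it.
function unionPixels(a, b) {
  if (isEmpty(a)) return b;
  if (isEmpty(b)) return a;
  return {
    x1: Math.min(a.x1, b.x1), y1: Math.min(a.y1, b.y1),
    x2: Math.max(a.x2, b.x2), y2: Math.max(a.y2, b.y2),
  };
}

// Point-set union: a zero-width rectangle still bounds points, so
// never ignore its edges.
function unionPoints(a, b) {
  return {
    x1: Math.min(a.x1, b.x1), y1: Math.min(a.y1, b.y1),
    x2: Math.max(a.x2, b.x2), y2: Math.max(a.y2, b.y2),
  };
}

// The example from the text: a zero-width and a zero-height rectangle.
const vertical = { x1: 100, y1: 100, x2: 100, y2: 200 };
const horizontal = { x1: 100, y1: 100, x2: 200, y2: 100 };
// unionPixels gives an empty rectangle; unionPoints gives (100,100)-(200,200).
```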

The problem with having two data structures is that very often we need to convert one to the other --- to find the set of pixels that cover a point-set rectangle. (Off the top of my head, I can't think of situations where we need to convert pixel-sets to point-sets.) It would also be rather confusing for developers since I'm not aware of any libraries that make this distinction. Also, in Gecko, rectangles like the "overflow areas" are used to mean both pixel-sets and point-sets. For now, we'll just stick with different 'union' operators.

Friday, 13 June 2008

Working Overtime

In the 18 months since I became a contractor for Mozilla Corporation, I've started dreaming about work --- I never did before. This is probably significant.

Last night I dreamed I was in a meeting with people who were showing me stacks of SIGGRAPH papers and trying to convince me to implement an absurdly complex specification that would let them do all graphical effects known to man. I was pushing back, trying to explain that it was too complex and most of the effects could be done in other simpler ways. No surprise that I'm dreaming about that.

The cool part is that they showed me some awesome demos --- and I remember the demos! I remember three:

  • Chinese wall: Chinese text was rendered as if it was carved into a stone wall and aged for a thousand years, eroded and dusty. Actually in the dream there was an animation with multiple layers of text being carved in and eroded away --- a palimpsest.
  • Persian carpet: This was taking an image and rendering it as if it was a pattern woven into a carpet --- a really old, worn, tattered carpet with loose threads hanging out in a few places.
  • Kaleidoscope: Generic kaleidoscope effect, random-ish patterns with sixfold symmetry.

The Chinese wall can probably be done with clever use of SVG filters. Possibly the Persian carpet could be done with SVG filters but it would be tricky and might require auxiliary images. I don't think we have any way to tessellate non-rectangular shapes over the plane.

I never knew I was that creative.

Just around the time I woke up from this dream I also figured out how we can use SVG filters with <canvas>. We just need a drawElement method that lets you use any element as a source to draw into the canvas. Then you can set up an offscreen canvas with 'filter' set on it, draw the stuff you want filtered into the offscreen canvas, then call drawElement to draw the filtered offscreen canvas into a target canvas. Note that 'drawImage' (or an extension thereof) isn't exactly what you want here since that renders the content of a canvas ignoring CSS effects applied to the canvas itself.
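A sketch of how such an API might be used; note that drawElement doesn't exist anywhere yet, so everything below is illustrative:

```javascript
// Pseudocode sketch of the hypothetical drawElement API.
// 'offscreen' has an SVG filter applied via CSS; drawElement would
// draw it *including* that CSS effect, unlike drawImage.
const offscreen = document.createElement("canvas");
offscreen.style.filter = "url(#f1)";  // an SVG filter in the document

const octx = offscreen.getContext("2d");
octx.fillRect(10, 10, 100, 100);      // the content we want filtered

const target = document.getElementById("target");
const ctx = target.getContext("2d");
// Hypothetical: draw 'offscreen' with its CSS filter applied.
ctx.drawElement(offscreen, 0, 0);
```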

Thursday, 5 June 2008


Applying SVG Effects To HTML Content


One of the problems with the standards-based Web is that it's hard to use SVG's features to enhance HTML content. For example, there is no reasonable way to clip an HTML element to a non-rectangular region, or to apply an alpha mask to an HTML element, or to apply image processing effects such as color channel manipulation to HTML elements. SVG has these features but they can only be applied to SVG elements. You can embed the HTML in a <foreignObject>, but that has major limitations; in particular, you can't apply effects to HTML elements via stylesheets without changing the markup --- ripping out content and putting it inside ugly SVG fragments --- and breaking most CSS layouts in the process.

One approach to solving this is to create new CSS properties for the desired effects. Dave Hyatt's been doing this a bit in Webkit, and the approach has its place. However for complex effects such as clipping to arbitrary shapes and custom image processing, CSS isn't really up to the task. One problem is that CSS isn't good at manipulating structured values like shapes and filter processing stacks; they're cumbersome to write in CSS expression syntax, or else they require new custom CSS syntax (e.g. @-rules), and there's no standard DOM to let scripts manipulate components of these structured values. Another issue is that we should
try to avoid duplicating specification and implementation of complex features.

So I've been experimenting with better ways to apply SVG effects to HTML content. The first step is to make SVG's 'clip-path',
'mask' and
'filter' properties work when applied to HTML content.


Here's some XHTML markup that clips elements to a shape composed of a circle next to
a rectangle.

<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg">
<body style="background:#ccc; font-size:30px;">
<style>
p { width:300px; border:1px solid black; display:inline-block; margin:1em; }
iframe { width:300px; height:300px; border:none; }
b { outline:1px dotted blue; }
</style>
<p class="target" style="background:lime;">
Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt
ut labore et dolore magna aliqua. Ut enim ad minim veniam.</p>
<iframe class="target" src="http://mozilla.org"/>
<p>Lorem ipsum dolor sit amet, consectetur adipisicing
<b class="target">elit, sed do eiusmod tempor incididunt
ut labore et dolore magna aliqua.</b> Ut enim ad minim veniam.</p>

<style>.target { clip-path: url(#c1); }</style>
<svg:svg height="0">
<svg:clipPath id="c1" clipPathUnits="objectBoundingBox">
<svg:circle cx="0.25" cy="0.25" r="0.25" id="circle"/>
<svg:rect x="0.5" y="0.2" width="0.5" height="0.8"/>
</svg:clipPath>
</svg:svg>
</body>
</html>

So we have a block element, an <iframe>, and an inline element all clipped by the same clipPath. The clipPath shape coordinates are relative to the bounding-box of the clipped object.

Here's the rendering:

screenshot of elements clipped by a circle and rectangle

You can't see it in this example, but hit-testing is affected by clip-path as you'd expect; mouse events in the clipped-out area pass through to the element(s) underneath.

One of the tricky integration points is handling elements which generate multiple CSS boxes, such as the inline element here. I don't expect clipping and masking to find much use there, but filters can be useful for inlines. The approach that makes most sense is to apply the effects to the whole element at once, and make the SVG "object bounding box" be the rectangle union of all border-boxes for the element and its geometric descendants, plus any outlines on the element or descendants. This is useful and also seems in accord with the spirit of the SVG spec.

clipPath, mask and filter content can also use "userSpaceOnUse" units. In SVG,
"user space" is established by the SVG viewport containing the affected content. We don't have such a viewport for non-SVG content, so I make "user space" be the rectangle which is the union of all border-boxes for the affected element. User space is in CSS pixel units, so clipPath, mask and filter content can specify lengths in CSS pixels as well as percentages relative to the size of the affected element.


It's easy to add some dynamism to the above example:

  <button onclick="toggleRadius()">Toggle radius</button><br/>
<script>
function toggleRadius() {
  var circle = document.getElementById("circle");
  circle.r.baseVal.value = 0.40 - circle.r.baseVal.value;
}
</script>
The SVG DOM isn't as clean as it could be, but it's adequate.


Replace the 'clip-path' chunk above with 'mask':

  <style>.target { mask: url(#m1); }</style>
<svg:svg height="0">
<svg:mask id="m1" maskUnits="objectBoundingBox" maskContentUnits="objectBoundingBox">
<svg:linearGradient id="g" gradientUnits="objectBoundingBox" x2="0" y2="1">
<svg:stop stop-color="white" offset="0"/>
<svg:stop stop-color="white" stop-opacity="0" offset="1"/>
</svg:linearGradient>
<svg:circle cx="0.25" cy="0.25" r="0.25" id="circle" fill="white"/>
<svg:rect x="0.5" y="0.2" width="0.5" height="0.8" fill="url(#g)"/>
</svg:mask>
</svg:svg>

Now the rectangle is being painted with a translucent gradient and the shape is being used
as an alpha mask instead of just clipping.

Here's the rendering:

screenshot of elements masked by a circle and rectangle with gradient

As per the SVG spec, masks do not affect hit-testing, so events will be received even where content is masked out. It would be smart to implement the SVG pointer-events property for non-SVG content to give authors a way to control this.


Replace the 'mask' chunk with some filters:

<style>
p.target { filter:url(#f3); }
body:hover p.target { filter:url(#f5); }
b.target { filter:url(#f1); }
body:hover b.target { filter:url(#f4); }
iframe.target { filter:url(#f2); }
body:hover iframe.target { filter:url(#f3); }
</style>
<svg:svg height="0">
<svg:filter id="f1">
<svg:feGaussianBlur stdDeviation="3"/>
</svg:filter>
<svg:filter id="f2">
<svg:feColorMatrix values="0.3333 0.3333 0.3333 0 0
                           0.3333 0.3333 0.3333 0 0
                           0.3333 0.3333 0.3333 0 0
                           0      0      0      1 0"/>
</svg:filter>
<svg:filter id="f3">
<svg:feConvolveMatrix filterRes="100 100" style="color-interpolation-filters:sRGB"
order="3" kernelMatrix="0 -1 0 -1 4 -1 0 -1 0" preserveAlpha="true"/>
</svg:filter>
<svg:filter id="f4">
<svg:feSpecularLighting surfaceScale="5" specularConstant="1"
specularExponent="10" lighting-color="white">
<svg:fePointLight x="-5000" y="-10000" z="20000"/>
</svg:feSpecularLighting>
</svg:filter>
<svg:filter id="f5">
<svg:feColorMatrix values="1 0 0 0 0
                           0 1 0 0 0
                           0 0 1 0 0
                           0 1 0 0 0" style="color-interpolation-filters:sRGB"/>
</svg:filter>
</svg:svg>

Filters are cool so I decided to go a bit over the top here. The block element gets an edge-detection convolution filter. The <iframe> gets converted to grayscale.
The inline element is blurred. When you hover over the body, the filters change; the <iframe> gets the edge detection filter, the block gets a color channel transformation that copies the green channel into the alpha channel, creating a punch-out effect, and the inline element gets a 3D lighting effect.

Here are the renderings:

screenshot of elements with filters applied

screenshot of elements with more filters applied

Filters do not affect hit-testing.


All these effects are fully live. You can select, zoom, change the DOM, invoke contenteditable, etc, and everything works fine. Effects are applied to the drawing of the selection, and even to the drawing of the caret while it's in an affected element.

The effects follow the SVG composition model; if an element has a filter, clipping, and masking, then the filter is applied first, followed by the clipping and/or masking. They behave like CSS 'opacity' in that they induce a pseudo-stacking-context for the element. They do not affect layout, although 'filter' can produce drawing outside the bounds of the element (e.g., shadows). Currently, in my implementation, 'filter' can actually affect layout because when 'filter' paints outside the element we may show scrollbars so that the user can scroll to see the overflowing paint, but I think this should be considered a bug.

Since we don't yet support remote content references in Gecko, you have to put the effects fragments inline in your document. This is ugly and also means you can't use these effects in non-XML HTML. Once we implement remote content references, these problems will go away; author CSS style sheets will be able to reference effects in auxiliary SVG documents. Many effects can be stored in a single SVG document, amortizing the cost of an extra resource. Authors will even be able to use tools like
Inkscape to generate SVG effects and reference them from CSS.

At a spec level, there's very little that needs to be said. No new syntax is required.
A specification needs to be written documenting decisions on the issues I've mentioned above (not necessarily the same decisions I made :-) ).

In lieu of a spec, we have to decide how much of this to take in Gecko and when. It's nice that there's no new syntax, but that also means there's no convenient way to use -moz prefixing to isolate the non-standard features. We could create new
'-moz-filter'/'-moz-clip-path'/'-moz-mask' properties that behave like the existing properties except they also apply to HTML. Maybe it's not worth it. Something for discussion.

I'm making tryserver builds right now, and I'll update this post with a link when they're ready. Here's a link to my Mercurial repository. (Unfortunately, I accidentally pushed copies of NSPR and NSS to that repo, so don't blindly pull from it unless you're OK with that.) Some demos:

At some point I'll post a followup talking about the other work I've done on this branch that leads up to these features, with some more details about the implementation. I'm also planning to add some more features soon.


A nice side effect of providing better SVG-HTML integration is that it gives SVG a leg up on the Web. You can't do these effects using Flash or Silverlight, and since they're not standards they probably won't ever be invited to this party.

Update: Mac build, Linux build, Windows build.

Belorussian translation