Thursday, 31 July 2008

Mixed News

Whistler seems to have been cut off from Vancouver. If this had happened on Thursday night, we'd have been rescheduling 400 flights. As it is, it's still unclear whether the road will be open by the time people have to leave on Friday. Apparently the alternative ways out are float plane or a six-hour road trip. This could be interesting.

In other news, Mike Shaver announced that we intend to ship Ogg Vorbis and Theora codecs for the HTML5 <audio> and <video> elements in Firefox 3.1. This is a huge step and I'm very proud that Mozilla is willing to take this on. I had the privilege of checking in and enabling Chris Double's Ogg patch last night, so this is enabled in the nightly builds. Check it out!

Update: Apparently the road will be closed for five days. So we'll be bussing out the long way around. I'll probably have to leave at 11am for my 9pm flight...

Update #2: Apparently I'm leaving at 8am. 8 hours on the bus, 5 hours at the airport, 13 hours on the plane...



Thursday, 24 July 2008

SVG Filter Performance Improvements In Gecko 1.9.1

The first batch of work from my bling-branch to land on trunk is improvements to SVG filter performance. I didn't want to make filters apply to HTML content only to have them totally suck performance-wise.

I chose to focus on testcases that use filters to make drop shadows, since that's a very common usage pattern. In particular I wanted to test scrolling of those pages, since people tend to notice slow update on scrolling more than an initial slow paint. I created a simple benchmark for this.

The first major piece of work was to micro-optimize the Gaussian blur inner loop. I tried a lot of experiments, some of which paid off and some of which didn't. I ended up speeding it up by about 10%, not as much as I'd hoped, but I did eliminate the use of a huge lookup table, which should save memory.

The next approach was to optimize Gaussian blur so that when the input surface only has an alpha channel (i.e. the color channels are all 0), we don't do any work for the color channels. This happens when the source is "sourceAlpha", as it is for typical shadow effects. First I did some major refactoring of the filters code so that various bits of metadata can be propagated around the SSA-converted filter primitive graph, instead of having a dynamic "image dictionary". Then the actual optimization was easy. This made us another 25% faster.
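As a rough illustration of the alpha-only idea (not the actual Gecko code), here is a minimal sketch. It uses a horizontal box blur as a stand-in for the Gaussian, and the function name and the `alphaOnly` flag are assumptions made for this example. When the flag is set, the color channels are never touched, because blurring zeros just produces zeros.

```javascript
// Horizontal box blur over an RGBA buffer (4 bytes per pixel).
// A box blur stands in for the Gaussian; the channel-skipping idea is the same.
// Pixels outside the image are treated as transparent black.
function boxBlurHorizontal(src, width, height, radius, alphaOnly) {
  const dst = new Uint8ClampedArray(src.length); // starts zero-filled
  // If the input is known to be alpha-only, the color channels are all 0
  // and stay 0 after blurring, so only channel 3 (alpha) is processed.
  const firstChannel = alphaOnly ? 3 : 0;
  const windowSize = 2 * radius + 1;
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      for (let c = firstChannel; c < 4; c++) {
        let sum = 0;
        for (let dx = -radius; dx <= radius; dx++) {
          const sx = x + dx;
          if (sx >= 0 && sx < width) sum += src[(y * width + sx) * 4 + c];
        }
        dst[(y * width + x) * 4 + c] = Math.round(sum / windowSize);
      }
    }
  }
  return dst;
}
```

For an alpha-only input both settings of the flag give the same output; the flagged path simply does a quarter of the inner-loop work in this sketch.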

As part of the refactoring I reduced the usage of intermediate surfaces --- we free a filter primitive output image as soon as we finish processing the last filter primitive that uses it as an input. This wasn't intended to improve performance but it did, by about 5%.
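The early-release scheme can be sketched roughly as follows. This is an assumption-laden illustration rather than the real implementation: `runFilterGraph`, the `inputs` index lists and the `render` callback are all hypothetical names. Each primitive produces exactly one output, and an output is dropped as soon as the last primitive that reads it has run.

```javascript
// primitives[i].inputs lists the indices of earlier primitives whose
// outputs primitive i reads. render(p, inputImages) produces p's output.
function runFilterGraph(primitives, render) {
  // lastUse[i] = index of the last primitive that consumes output i.
  const lastUse = primitives.map((_, i) => i);
  primitives.forEach((p, i) => {
    p.inputs.forEach((input) => { lastUse[input] = i; });
  });

  const images = new Array(primitives.length).fill(null);
  let live = 0;
  let peakLive = 0;
  primitives.forEach((p, i) => {
    images[i] = render(p, p.inputs.map((input) => images[input]));
    live++;
    peakLive = Math.max(peakLive, live);
    // Release every input whose final consumer was this primitive.
    for (const input of new Set(p.inputs)) {
      if (lastUse[input] === i) {
        images[input] = null;
        live--;
      }
    }
  });
  return { result: images[primitives.length - 1], peakLive };
}
```

For a simple chain of primitives this keeps at most two images alive at once, however long the chain is.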

The next idea was to only run filter computation over the minimum area needed to correctly repaint the damage area, when only part of the window needs to be repainted --- important for scrolling, since when scrolling typically only a small sliver of the window is repainted. This is a bit tricky since filter primitives may need to consume a larger area of their input than their output, e.g., a blur may require the output area to be inflated by the blur radius to find the input area required. But I'd already implemented this knowledge for Firefox 3, to limit the size of the temporary surfaces we were allocating when a poor filter region was given by the author. It was just a matter of introducing damage area information into the mix. This gave us a 140% speedup! (By "speedup" here I mean the increase in the number of iterations of the test you can run in a given time limit.) In general this is a really good optimization because it means, for most filters, the time required to draw the filter is proportional to the size of the visible part of the filter, not proportional to the size of the filtered SVG objects. At this point I declared victory on the initial use case...
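To make the backward propagation concrete, here is a minimal sketch under some assumptions: rectangles are plain `{x, y, w, h}` objects, only hypothetical `blur` and `offset` primitives are modeled, and the last primitive in the list produces the final output. Starting from the damage area, each primitive asks its inputs for whatever larger (or shifted) area it needs.

```javascript
function inflate(r, n) {
  return { x: r.x - n, y: r.y - n, w: r.w + 2 * n, h: r.h + 2 * n };
}

function union(a, b) {
  if (!a) return b;
  if (!b) return a;
  const x = Math.min(a.x, b.x);
  const y = Math.min(a.y, b.y);
  return {
    x, y,
    w: Math.max(a.x + a.w, b.x + b.w) - x,
    h: Math.max(a.y + a.h, b.y + b.h) - y,
  };
}

// Area of the input a primitive must read to produce `out` correctly.
function inputRectNeeded(p, out) {
  switch (p.kind) {
    case "blur": return inflate(out, p.radius);
    case "offset": return { ...out, x: out.x - p.dx, y: out.y - p.dy };
    default: return out; // pointwise primitives need exactly the output area
  }
}

// Walk the graph from the final output back towards the sources.
function computeNeededRects(primitives, damage) {
  const needed = new Array(primitives.length).fill(null);
  needed[primitives.length - 1] = damage;
  for (let i = primitives.length - 1; i >= 0; i--) {
    if (!needed[i]) continue; // output not needed at all for this repaint
    const inRect = inputRectNeeded(primitives[i], needed[i]);
    for (const input of primitives[i].inputs) {
      needed[input] = union(needed[input], inRect);
    }
  }
  return needed;
}
```

A primitive whose needed rectangle stays `null` can be skipped entirely, and the others only have to compute the rectangle recorded for them.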

The final idea was to address a slightly different testcase. When only a small part of an image changes, but there's a filter applying to the whole image, we'd like to only have to recompute a small part of the filter. This is similar to the previous paragraph, and requires forward propagation along the filter primitive graph of bounding boxes of changed pixels. My fix here improved performance on that testcase by 70%.
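The forward direction can be sketched the same way. Again this is only an illustration with assumed names and `{x, y, w, h}` rectangles: given the bounding box of changed source pixels, each primitive reports the bounding box of output pixels that could differ, and anything outside that box can be reused from the previous result.

```javascript
function inflate(r, n) {
  return { x: r.x - n, y: r.y - n, w: r.w + 2 * n, h: r.h + 2 * n };
}

function union(a, b) {
  if (!a) return b;
  if (!b) return a;
  const x = Math.min(a.x, b.x);
  const y = Math.min(a.y, b.y);
  return {
    x, y,
    w: Math.max(a.x + a.w, b.x + b.w) - x,
    h: Math.max(a.y + a.h, b.y + b.h) - y,
  };
}

// Area of the output that may change when `changedIn` changes in the input.
function outputRectChanged(p, changedIn) {
  switch (p.kind) {
    case "blur": return inflate(changedIn, p.radius);
    case "offset": return { ...changedIn, x: changedIn.x + p.dx, y: changedIn.y + p.dy };
    default: return changedIn;
  }
}

// Primitives with no inputs are assumed to read the source image directly.
function computeChangedRects(primitives, sourceChange) {
  const changed = [];
  primitives.forEach((p, i) => {
    let inChange = p.inputs.length === 0 ? sourceChange : null;
    for (const input of p.inputs) inChange = union(inChange, changed[input]);
    changed[i] = inChange ? outputRectChanged(p, inChange) : null;
  });
  return changed;
}
```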

There's still a lot more that could be done to improve filter performance. There are three obvious approaches:


  • Use CPU vector instructions such as SSE2
  • Perform run-time code generation to generate optimized code for particular filter instances
  • Use the GPU

You really want to support all three. You definitely need some sort of RTCG to perform loop fusion, so that instead of doing each filter primitive as a separate pass, you can minimize the number of passes over memory. If your code generator supports vector types and intrinsics, then it's easy to give it vector code as input and generate de-vectorized code if the CPU doesn't have the right vector instructions. And if you're super-cool you would allow the code generator to target the GPU for filter fragments where that makes sense.
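As a toy illustration of run-time code generation for loop fusion, the sketch below builds a single loop from a list of per-value primitives and compiles it with `new Function`. Everything here is hypothetical: the primitive kinds, the `SNIPPETS` table and `compileFused` are invented for the example, and only pointwise operations are covered (a blur reads neighbouring pixels and could not be fused this simply).

```javascript
// Each hypothetical primitive contributes a source snippet that updates
// the local variable `v` (one channel value in the range 0..255).
const SNIPPETS = {
  invert: () => "v = 255 - v;",
  scale: (p) => `v = v * ${Number(p.factor)};`,
  threshold: (p) => `v = v >= ${Number(p.level)} ? 255 : 0;`,
};

function compileFused(primitives) {
  const body = primitives.map((p) => SNIPPETS[p.kind](p)).join("\n    ");
  // The generated loop applies every primitive to a value before moving on,
  // so memory is traversed once rather than once per primitive.
  return new Function("src", `
  const dst = new Uint8ClampedArray(src.length);
  for (let i = 0; i < src.length; i++) {
    let v = src[i];
    ${body}
    dst[i] = v;
  }
  return dst;`);
}
```

A call such as `compileFused([{ kind: "invert" }, { kind: "scale", factor: 0.5 }])` returns one function that does both steps in a single pass over the buffer.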

However, at least as far as Gecko is concerned, this additional work will have to wait until filter performance rises in priority. (At that point hopefully we'll be able to reuse the JIT infrastructure being developed for JS.)



Tuesday, 22 July 2008

Wellington

Last week, during the school holidays, our family took a trip to Wellington. It's a pretty good winter destination, with plenty of indoor activities, especially the Te Papa museum. The weather wasn't too bad so we also got outside; notably we completed the "Southern Walkway" from Oriental Bay to Island Bay --- a significant achievement to have two small children walking for five hours. They're definitely following in my footsteps (so to speak).

Te Papa was good, but I have to say (speaking as a devotee of geological spectacle) that their Earthquake Room isn't quite as good as the Auckland Museum's Volcano Room. We went on the tour of the Beehive and other government buildings, which was considerably more interesting than I expected (and led by a man with a strong US accent, curiously). Overall Wellington was lots of fun.

We flew down but took the Overlander train back to Auckland. The train is slow --- took us over 12 hours --- but a great experience for all. The views were magnificent even though it rained much of the time and we did not encounter any snow, and the great volcanoes of the plateau were shrouded in cloud. That's two out of two times our family's been to the plateau and failed to see them, but we'll keep trying! Probably the thing to do is to wait for a big snowfall and clear conditions and jump on the train the very next day.

Here's a picture of Houghton Bay near the end of the Southern Walkway. It got a bit wet and windy near the end of the day but --- thank the Lord --- it wasn't cold.


Houghton Bay


Wednesday, 16 July 2008

ROC Scheduled Maintenance - Wednesday 16/7/2008 to Sunday 20/7/2008

It's the school holidays and our family's going down to Wellington for a few days for fun. We plan to fly down and take the train back, hopefully scoring great views both ways of the wintry volcanic plateau. (One of my more enduring early memories is seeing that area from the train, covered in snow.) In Wellington we plan to visit the Te Papa museum, which none of us have ever seen, and there should be plenty of other fun things to do.

I am going up to visit Victoria University briefly to talk to some people there. That should be a lot of fun too. Technically it might count as work, but it's the only work I'll be doing, since I will not be taking my laptop nor any other device I would use for Internet communication, so don't expect any response from me along the usual channels. I believe I'm pretty much on top of things at the moment, so hopefully no-one will be inconvenienced.

OK I probably lied in the last paragraph --- I won't be able to completely stop thinking about browser engines. Sigh. Definitely a sign of spiritual weakness.



Wednesday, 9 July 2008

Using Arbitrary Elements As Paint Servers

The latest feature in my bling branch is the ability to use any element as a paint-server CSS background for another element.

There are a few motivations for this feature. Probably the biggest usage of the canvas "drawWindow" API extension in Mozilla is to create thumbnails of Web content. The problem is, drawWindow necessarily creates "snapshot" thumbnails. Wouldn't it be cooler if there was an easy way to create live thumbnails --- essentially, extra live viewports into Web content? Now there is. It should be pretty easy to add this to S5 to get slide thumbnails, for example.

Another feature that's popular these days is reflections. With element-as-background, plus a small dose of transforms and masks, reflections are easy. A while ago Hyatt introduced a feature in Webkit to use a <canvas> as a CSS background; that falls out as a special case of element-as-background.

So here's an example:


An HTML element with a rotating canvas background

And here's the markup:

<!DOCTYPE HTML>
<html>
<head>
<style>
p { background:url(#d); }
</style>
</head>
<body style="background:yellow;">
<p style="width:60%; border:1px solid black; margin-left:100px;">
"Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor
incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud
exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute
irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla
pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia
deserunt mollit anim id est laborum."
</p>
<canvas id="d" width="50" height="50"></canvas>
<script>
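var d = document.getElementById("d");
var iteration = 0;
// Redraw the canvas with the square rotated one more degree on each call.
function iterate() {
  ++iteration;
  var ctx = d.getContext("2d");
  ctx.save();
  ctx.clearRect(0, 0, 50, 50);
  // Rotate about the canvas center: iteration degrees, converted to radians.
  ctx.translate(25, 25);
  ctx.rotate(Math.PI * iteration / 180);
  ctx.fillStyle = "lime";
  ctx.fillRect(-10, -10, 20, 20);
  ctx.restore();
  // Schedule the next redraw in 10ms.
  setTimeout(iterate, 10);
}
iterate();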
</script>
</body>
</html>

Unlike SVG paint servers, elements-as-backgrounds have an intrinsic size. Staying consistent with my earlier work on SVG effects for HTML, I define the intrinsic size as the bounding box of the border-boxes for the element.
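To spell out that definition, here is a small sketch. The `{x, y, w, h}` box representation and the function name are assumptions for illustration. An element can have several border-boxes (an inline element broken across lines, say), and the intrinsic size is the size of the smallest rectangle containing all of them.

```javascript
// Intrinsic size = size of the bounding box of all the element's border-boxes.
function intrinsicSize(borderBoxes) {
  if (borderBoxes.length === 0) return { width: 0, height: 0 };
  let left = Infinity, top = Infinity, right = -Infinity, bottom = -Infinity;
  for (const b of borderBoxes) {
    left = Math.min(left, b.x);
    top = Math.min(top, b.y);
    right = Math.max(right, b.x + b.w);
    bottom = Math.max(bottom, b.y + b.h);
  }
  return { width: right - left, height: bottom - top };
}
```

For the 50x50 canvas in the example above there is a single box (assuming no border or padding on the canvas), so the background tile is simply 50x50.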

As with SVG paint servers, an element-as-background is subject to all CSS background effects, including 'background-repeat' as in the example above.

Of course, the first thing any self-respecting computer scientist will think of when they read about this feature is, "can I exploit infinite recursion to create a hall-of-mirrors effect or sell exploits to gangsters?" No. We refuse to render an element-as-background if we're already in the middle of rendering the same element-as-background.
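The guard amounts to tracking which elements are currently being rendered as backgrounds. A minimal sketch, with hypothetical names and a `paint` callback standing in for the real rendering code:

```javascript
// Elements whose element-as-background rendering is currently in progress.
const inProgress = new Set();

// Returns false (and paints nothing) if `element` is already being
// rendered as a background further up the call stack.
function paintElementAsBackground(element, paint) {
  if (inProgress.has(element)) return false;
  inProgress.add(element);
  try {
    paint(element);
  } finally {
    inProgress.delete(element); // always clear the mark, even if paint throws
  }
  return true;
}
```

If painting an element ends up asking for that same element as a background again, the inner request is refused and the recursion stops after one level.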

The builds I linked to in my previous post contain this feature. I've uploaded the above example and the reflection demo.

The next thing I have to do is to write up a spec proposal for all this work and get it discussed by the CSS and SVG working groups. Based on that feedback we'll figure out the best way to deliver this functionality in Gecko. Unfortunately, the approach of "make existing syntax applicable in more situations" is not amenable to using vendor prefixes to isolate experimental features.



SVG Paint Servers For HTML

I've done some more work on applying SVG effects to HTML. This time I've made SVG paint servers (i.e., gradients and patterns) usable as CSS 'background' images.


HTML with SVG gradients and patterns

Here's the markup:
<html xmlns="http://www.w3.org/1999/xhtml"
xmlns:svg="http://www.w3.org/2000/svg">
<head>
<style>
h1 { background:url(#h); }
p { background:url(#p); }
span { background:url(#h); }
</style>
</head>
<body>
<h1 style="width:95%;">Heading</h1>
<p style="width:90%; border:1px solid black; margin-left:2em;">
"Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor
incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud
exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute
irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla
pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia
deserunt mollit anim id est laborum."
</p>
<div style="width:200px;">"Lorem ipsum dolor sit amet,
<span>consectetur adipisicing elit, sed do eiusmod</span>
tempor incididunt ut labore et dolore magna aliqua.</div>
<svg:svg style="height:0">
<svg:linearGradient id="h" x2="1" y2="0">
<svg:stop stop-color="yellow" offset="0"/>
<svg:stop stop-color="yellow" stop-opacity="0" offset="1"/>
</svg:linearGradient>
<svg:pattern id="p" patternUnits="userSpaceOnUse"
x="0" y="0" width="50" height="50"
viewBox="-1 -1 9 5.5" >
<svg:path d="M 0 0 L 7 0 L 3.5 7 z" fill="red" stroke="blue" opacity="0.3"/>
</svg:pattern>
</svg:svg>
</body>
</html>

This is very straightforward. The gradient or pattern is painted over the CSS "background-origin" area. All CSS background features are supported, such as "background-attachment:fixed" and "background-repeat". Backgrounds are striped over inline elements just like normal background images, as shown in the example. Repeating backgrounds can be a little confusing: the pattern or gradient is rendered to the background-origin area and then repeated according to CSS, so you can have repetition at the pattern or gradient level (including repetition in non-axis-aligned directions) followed by repetition of the rendered rectangle. But this won't matter much, since unless the author is being tricky, the gradient or pattern will fill the background-origin area and only one tile will be visible.

"userSpaceOnUse" units in the paint server are defined to be CSS pixels with the background-origin area as the viewport. "objectBoundingBox" units are just the background-origin area.
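Here is a small sketch of how the two unit systems map a coordinate onto the page. The function name and the `{x, y, width, height}` area object are assumptions, and it further assumes that user space has its origin at the top-left corner of the background-origin area.

```javascript
// Map a paint-server coordinate to page coordinates (in CSS pixels).
// `area` is the element's background-origin area.
function resolvePoint(px, py, units, area) {
  if (units === "objectBoundingBox") {
    // Fractions of the background-origin area: (0,0) is its top-left
    // corner and (1,1) its bottom-right corner.
    return { x: area.x + px * area.width, y: area.y + py * area.height };
  }
  // userSpaceOnUse: CSS pixels measured from the area's top-left corner.
  return { x: area.x + px, y: area.y + py };
}
```

So the gradient's `x2="1"` in the example (gradients use objectBoundingBox units by default) lands on the right edge of the area, while the pattern's `width="50"` with userSpaceOnUse units is a 50 CSS pixel tile regardless of the element's size.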

I've pushed my changes to my public Mercurial repository and prepared try-server builds for Mac, Linux and Windows installer/Windows ZIP. I didn't update the branch to trunk so the rest of the code is a few weeks old. I've also published the above example. (As I mentioned a while ago the changes in this branch are gradually trickling into the trunk.)

These builds have another very cool feature which deserves a post of its own. Stay tuned!

Update: The Linux build had a pretty bad bug in it (actually an existing bug that my code exposed). I've updated the link to point to a build that should work without displaying garbage much of the time.