Thursday, 25 June 2009

Native Widgets Begone

Recently I've been working on removing the use of native widgets (Windows HWNDs, Cocoa NSViews, GdkWindows) in various places in Gecko. Native widgets get in the way of painting and event handling, and break things due to various platform limitations.

The key difficulty of getting rid of native widgets is ensuring we still handle windowed-mode plugins adequately. For example, currently if a windowed-mode plugin is contained in an overflow:auto element we rely on a native widget associated with the element to scroll and clip the plugin. In situations where we should clip but there is no native widget present, like 'clip' on an absolutely positioned element, we actually don't currently clip the plugin, which is pretty bad.

My new strategy for clipping plugins is to explicitly compute a clip region for each plugin and use native platform APIs to clip the plugin's widget to that region. On Windows we can use SetWindowRgn; on X we can use the XShape extension via gdk_window_shape_combine_region. On Mac we don't have to do either of these things, because plugins work a bit differently there.

One issue of course is figuring out what the clip region should be. For that, I'm reusing our display list machinery. We build a display list for the region of the page that contains plugins, and then traverse the display list to locate each visible windowed-mode plugin --- its position, and what clipping is affecting it. Plugins that aren't in the display list aren't visible, so we hide them. This works really well, and means that everywhere we clip Web content, plugins are automatically affected as well.
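To make the traversal concrete, here is a minimal sketch in Python. It is not Gecko's code: the `Item` type, its `kind` values and the function names are all hypothetical, and the clip is simplified to a single rectangle where a real implementation would use a region.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Rect:
    x: int
    y: int
    w: int
    h: int

    def intersect(self, other: "Rect") -> Optional["Rect"]:
        """Return the overlap of two rectangles, or None if they are disjoint."""
        x1, y1 = max(self.x, other.x), max(self.y, other.y)
        x2 = min(self.x + self.w, other.x + other.w)
        y2 = min(self.y + self.h, other.y + other.h)
        if x2 <= x1 or y2 <= y1:
            return None
        return Rect(x1, y1, x2 - x1, y2 - y1)


@dataclass
class Item:
    """Hypothetical display-list item: a clipping container or a plugin leaf."""
    kind: str                      # "clip" or "plugin" (assumed names)
    bounds: Rect
    name: str = ""
    children: List["Item"] = field(default_factory=list)


def visible_plugins(items, clip, found=None) -> Dict[str, Rect]:
    """Walk the list, narrowing the clip at each container, and record
    the visible rectangle of every plugin that survives the clipping."""
    if found is None:
        found = {}
    for item in items:
        if item.kind == "plugin":
            visible = item.bounds.intersect(clip)
            if visible is not None:
                found[item.name] = visible
        elif item.kind == "clip":
            inner = item.bounds.intersect(clip)
            if inner is not None:
                visible_plugins(item.children, inner, found)
    return found


def plugin_configuration(all_plugins, display_list, viewport):
    """Map each plugin to its clip rectangle, or to None if it should be hidden."""
    found = visible_plugins(display_list, viewport)
    return {name: found.get(name) for name in all_plugins}
```

A plugin that never shows up during the walk maps to `None`, which corresponds to hiding its widget.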

A nice bonus is that display lists contain information about which elements are opaque. We already have the ability to determine that part of an element is covered by opaque content and thus doesn't need to be drawn. We can leverage this for plugins too: areas of a plugin we know are covered by opaque content (e.g. a DIV with a solid background color) are clipped out. This means that all the crazy hacks people have been using to get content to appear above a windowed-mode plugin today (e.g., positioned overflow:auto elements and IFRAMEs) will no longer be needed with Gecko; you can just position a DIV or table with a solid background color. The hacks still work of course, as long as the hacked element has a solid background. (Credit where it's due: IIRC clipping holes for opaque Web content out of windowed plugins was first suggested by a KHTML developer long ago.)

This also fixes relative z-ordering of plugins with other plugins and Web content. We don't have to touch native widget z-ordering because windowed-mode plugins are opaque. So if plugin A is supposed to be drawn in front of plugin B, but its widget is behind plugin B's widget, then we effectively punch a hole in plugin B's widget so you can see plugin A through it. It sounds a little crazy but it works just fine.
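The hole punching described above boils down to subtracting rectangles from a plugin's visible area. The following Python sketch illustrates just that arithmetic; the function names are made up for illustration, rectangles are `(left, top, right, bottom)` tuples, and a region is simply a list of non-overlapping rectangles. The real code would work with platform region types instead.

```python
from typing import List, Tuple

Rect = Tuple[int, int, int, int]  # left, top, right, bottom


def subtract(a: Rect, b: Rect) -> List[Rect]:
    """Return rectangles covering the part of `a` that `b` does not cover."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    # Overlap of a and b.
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    if ix1 >= ix2 or iy1 >= iy2:
        return [a]  # no overlap, nothing to cut out
    pieces = []
    if ay1 < iy1:
        pieces.append((ax1, ay1, ax2, iy1))  # strip above the hole
    if iy2 < ay2:
        pieces.append((ax1, iy2, ax2, ay2))  # strip below the hole
    if ax1 < ix1:
        pieces.append((ax1, iy1, ix1, iy2))  # strip left of the hole
    if ix2 < ax2:
        pieces.append((ix2, iy1, ax2, iy2))  # strip right of the hole
    return pieces


def visible_region(plugin: Rect, opaque_in_front: List[Rect]) -> List[Rect]:
    """Clip region for a plugin after removing everything opaque drawn in
    front of it (solid-background content or another plugin)."""
    region = [plugin]
    for cover in opaque_in_front:
        region = [piece for r in region for piece in subtract(r, cover)]
    return region
```

An empty result means the plugin is completely covered, so its widget has nothing left to show.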

Perhaps the greatest challenge here for us is getting scrolling to work smoothly on all platforms. In general we may have an IFRAME or overflow:auto element in the page, which contains one or more plugins which need to move smoothly with the scrolled content. A naive implementation would just bitblit the Web content and then move and clip the plugin widgets, creating a variety of exciting and unwanted visual artifacts. I designed a new cross-platform scrolling API which is quite rich; you pass in the rectangle you want to scroll, the amount you want to scroll it, and a list of commands to reconfigure child widgets that should be performed with the scroll "as atomically as possible". These commands let the caller change the position, size and clip region of some or all child widgets. This gives the platform-specific code great freedom to reorder operations. I described some of the implementation issues on X in a previous post.
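As a rough illustration of what a caller might hand to such an API, here is a Python sketch that builds the list of child-widget commands for one scroll. The `Configuration` type and `build_scroll_request` function are hypothetical names, not the real interface. The sketch assumes every listed child lives inside the scrolled content (so all of them move) and that each clip is a single rectangle.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

Rect = Tuple[int, int, int, int]  # left, top, right, bottom


@dataclass(frozen=True)
class Configuration:
    """Hypothetical command: the new geometry for one child widget."""
    child: str
    bounds: Rect
    clip: Optional[Rect]  # None means nothing is visible, so hide the widget


def intersect(a: Rect, b: Rect) -> Optional[Rect]:
    left, top = max(a[0], b[0]), max(a[1], b[1])
    right, bottom = min(a[2], b[2]), min(a[3], b[3])
    if left >= right or top >= bottom:
        return None
    return (left, top, right, bottom)


def build_scroll_request(scroll_rect: Rect,
                         delta: Tuple[int, int],
                         children: Dict[str, Rect]):
    """Return (rectangle, delta, commands) for one scroll operation.

    Each child is moved by the scroll delta and clipped to the scrolled
    rectangle; the platform code is then free to apply the commands in
    whatever order avoids artifacts."""
    dx, dy = delta
    commands: List[Configuration] = []
    for name, (left, top, right, bottom) in children.items():
        moved = (left + dx, top + dy, right + dx, bottom + dy)
        commands.append(Configuration(name, moved, intersect(moved, scroll_rect)))
    return scroll_rect, delta, commands
```

Keeping the commands as plain data, rather than applying them immediately, is what lets the platform-specific code reorder them around the blit.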

I think I've got things working pretty well. My patches pass tests on all platforms on the try servers. There are 24 patches in my patch queue, many of them small preparatory cleanup patches. The big important patches enable region-based plugin clipping, modify a ton of code to remove the assumption that every document has a native widget as its root, actually remove the native widgets associated with content IFRAMEs, and remove the native widgets used for scrolling and implement the new scrolling system.

In order to not change too much at once, I've left some things undone. In particular "chrome documents" like the Firefox UI still use native widgets in quite a few places. There are a lot of follow-up simplifications that we can do, like eliminating the "MozDrawingArea" abstraction that we currently use to support guffaw scrolling. In fact, apart from fixing a ton of plugin and visual bugs, the main point of this work is to enable further changes.

For people who are interested, I've got some try-server builds available for testing. I'm particularly interested in IME and accessibility testing, since although the code passes tests I suspect external IME and AT code may be making assumptions about our native widget hierarchy that I'm breaking. General scrolling and plugin performance is also worth looking at.

Whitespace Begone

I just landed on trunk a small improvement to page-load performance. Typical HTML documents contain a lot of markup like

<b>Hello World</b>

In the DOM, there are four text nodes which contain only whitespace. Up until now, in Gecko we created one "text frame" to render each text node. However, this is a waste, since whitespace-only text nodes before or after a block boundary don't affect the rendering; they are collapsed away (unless CSS white-space:pre or certain other unusual conditions apply).

So I did some measurements of our page-load-performance test suite (basically, versions of the front pages of popular Web sites) with an instrumented Gecko build to see how much could be saved if we avoid creating frames for those whitespace DOM nodes. It turns out that in those pages, 65% of all whitespace-only text frames were adjacent to a block boundary (i.e., the first child of a block, the last child of a block, or the next or previous sibling is not inline) and don't need a frame. That's 30% of all text frames, and 11% of all layout frames!
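The block-boundary test in parentheses above can be written down as a small predicate. This is only a Python sketch under simplifying assumptions: the `Node` type is hypothetical, the parent is assumed to be a block, every sibling is classified as "block", "inline" or "text", and the white-space:pre and other unusual cases are ignored.

```python
from dataclasses import dataclass
from typing import List

HTML_WHITESPACE = " \t\n\r\f"


@dataclass
class Node:
    """Hypothetical child of a block: kind is "block", "inline" or "text"."""
    kind: str
    text: str = ""


def is_whitespace_only(node: Node) -> bool:
    return node.kind == "text" and node.text.strip(HTML_WHITESPACE) == ""


def needs_frame(siblings: List[Node], i: int) -> bool:
    """Return False for whitespace-only text adjacent to a block boundary."""
    node = siblings[i]
    if not is_whitespace_only(node):
        return True
    if i == 0 or i == len(siblings) - 1:
        return False  # first or last child of the block
    # Text nodes count as inline, so "not inline" here means a block sibling.
    if siblings[i - 1].kind == "block" or siblings[i + 1].kind == "block":
        return False
    return True  # whitespace between two inlines still affects rendering
```

The last case matters: the space between two inline elements separates words, so that text node keeps its frame.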

Actually suppressing the frames was pretty easy to implement --- it helped that Boris Zbarsky has recently done some magnificent hacking to make nsCSSFrameConstructor more sane. Handling dynamic changes correctly was a little tricky and actually took most of the development time. Our test suite, which is pretty huge now, caught some subtle bugs. But the patch is on trunk and seems to give a 1-2% improvement in average page load time. It also reduces memory usage slightly, and makes dumps of the layout frame tree much easier to read because a lot of pointless frames are gone.

What's probably more important than all that is that this makes it easier to separate our implementation of blocks into "blocks that contain only blocks" and "blocks (possibly anonymous) that contain only inlines" without regressing performance, because we won't have to create anonymous blocks just to contain irrelevant whitespace.

Tuesday, 23 June 2009

DirectShow And Platform Media Frameworks

People keep asking why we don't integrate support for Windows' DirectShow into Firefox so the <video> element can play any media that the user has a DirectShow codec for. Even if a volunteer produces a patch, I would not want to ship it in Firefox in the near future; let me try to explain why.

  1. Probably most important: we want to focus our energy on promoting open unencumbered codecs at this time.
  2. Only a very small fraction of Windows users have a DirectShow codec for the most important encumbered codec, H.264. Windows 7 will be the first version of Windows to ship with H.264 by default. Even if millions of people have downloaded H.264 codecs themselves, that's a very small fraction of our users.
  3. DirectShow is underspecified and codecs are of highly variable quality. Many codecs probably will not work with Web sites that use all the rich APIs of <video>, and those bugs will be filed against us. We probably will not be able to fix them. (Note that the problem is bad enough that in Windows 7, Microsoft isn't even going to allow unknown third parties to install DirectShow codecs.) We could avoid some of this problem by white-listing codecs, but then a lot of the people who want DirectShow support wouldn't be satisfied.
  4. Many DirectShow codecs are actually malware. ("Download codec XYZ to play free porn!")
  5. DirectShow codecs are quite likely to have security holes. As those holes are uncovered, we will have to track the issues and often our only possible response will be to blacklist insecure codecs, since we can't fix them ourselves. If we blacklist enough codecs, DirectShow support becomes worthless.
  6. Each new video backend creates additional maintenance headaches as we evolve our internal video code.

So even if we didn't care about promoting unencumbered codecs and someone gave us a working patch, shipping DirectShow support in Firefox is of limited value and creates tons of maintenance work for us.

Some (but not all) of the same arguments apply to using QuickTime. But if we're going to ship our own video infrastructure for Windows, we save ourselves a lot of trouble by using that infrastructure across all platforms. It saves authors a lot of trouble too, due to less variability across Firefox versions.

Currently no browser or browser plugin vendor is using codec implementations they don't control. Apple does allow third-party codecs to be used with QuickTime in Safari --- but at least they control the framework and the important codecs, and apparently they have made some changes for Web video.

Friday, 19 June 2009

The Price Of Freedom

With the imminent release of Firefox 3.5 and the big step forward for unencumbered video and audio that this represents, there's been a lot of discussion about the merits of the free Ogg codecs vs the flagship encumbered codecs, especially H.264. Proponents of the encumbered codecs tend to focus on the technical advantages of patented techniques used in H.264 but not in Theora. But the real question that matters is this: at comparable bit rates, in real-world situations, do normal people perceive a significant quality advantage for H.264 over Theora? Because if they don't, theoretical technical advantages are worthless.

(It's not helpful to ask a video compression expert if they can see quality differences. Of course they can; they're trained to. That doesn't tell you what matters for the other 99.9999% of the population.)

One issue that has made a comparison difficult is that encoders are so tunable. People always say things like "oh, your H.264 would look much better if you set the J-frame quantization flux capacitor rank to 8". So Greg Maxwell had a stroke of genius and did a comparison using YouTube to do the encoding. Now, people who think the H.264 encoding is sub-par are placed in the difficult position of arguing that YouTube's engineers don't know how to encode H.264 properly.

In fact YouTube deliberately doesn't squeeze the most out of H.264 encoding. That's because in real life there are tradeoffs to consider beyond just quality and bit-rate, such as encoding cost and bit-rate smoothing. But that's fine, because these are the real-life situations we care about.

Maik Merten recently repeated Greg's experiment with different video in larger formats. In these tests, it seems pretty clear that there is no real advantage for H.264, or even that Theora is doing better.

We really need someone to do a scientific blind test with a wide pool of subjects and video clips (hello academic researchers!). But H.264 proponents have to demonstrate real-world advantages that justify surrendering software freedoms and submitting to MPEG-LA client and server license fees (not to mention the hassle of actually negotiating licenses and ensuring ongoing compliance).

Tuesday, 16 June 2009

Stupid X Tricks

Platforms like X and Win32 support trees of native windows. For example,
an application might have a "top-level" window containing child windows
representing controls such as buttons. In browsers, child windows are required
for many kinds of plugins (e.g., most uses of Flash). Managing these child
windows is quite tricky; they add performance overhead and the native
platform's support for event handling, z-ordering, clipping, graphical effects,
etc, often is not a good match for the behaviours required for Web content,
especially as the Web platform evolves. So I'm working hard to get rid of
the use of child windows and I've made a lot of progress --- more about that
later. However, we're stuck with using child windows for plugins, and recently
I've been grappling with one of the hardest problems involving child windows:

It's very hard to make scrolling smooth in the presence of child windows
for plugins, when all other child windows have been eliminated. In general
you want to scroll (using accelerated blitting) some sub-rectangle of the
document's window (e.g., an overflow:auto element), and some of the plugin
windows will move while others won't. You may want to change the clip region
of some of the plugins to reflect the fact that they're being clipped to the
bounds of the overflow:auto element. The big problem is that to avoid flicker
you want to move the widgets, change their clip regions, and blit the contents
of the scrolled area in one atomic operation. If these things happen
separately the window is likely to display artifacts as it passes through
undesired intermediate states. For example, if you blit the scrolled area
first, then move the plugins, the plugins will appear to lag slightly
behind as you scroll the window.

On Windows, ScrollWindowEx with the SW_SCROLLCHILDREN flag gets you a long
way. But on X, the situation is dire. X simply has no API to scroll the
contents of a window including child windows! Toolkits like GTK resort to
heroic efforts such as guffaw scrolling to achieve the effect, but those
techniques don't let you scroll a sub-rectangle of a window, only the entire
window. So for Gecko I have had to come up with something new.

Basically I want to use XCopyArea and have it copy the contents of the
window *and* the contents of some of the child windows in the scrolled area,
and then move the child windows into their new locations and change their
clip regions (I'm using XShape for this) without doing any extra repainting.
I tried a few things that didn't work, before I hit a solution:

  1. Hide (unmap) all child windows that are going to be moved during the scroll
  2. Do the XCopyArea
  3. Set the clip region and position of all relevant child windows
  4. Show (map) all child windows that we hid in step 1

By hiding the child windows during the XCopyArea, we trick the X server into
treating the pixels currently on-screen for them as part of the parent window,
so they are copied. By moving and setting the clip region while each child
window is hidden, we also avoid artifacts due to a child window briefly
appearing outside the area it's supposed to be clipped to. It *does* mean that when we scroll a window containing plugins, expose events are generated to repaint the contents of the plugins' child windows, so scrolling is slower than it could be. We might be able to avoid that by enabling backing store for the plugin windows.
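The four steps can be summarised as a sketch in Python. Nothing here talks to a real X server: `RecordingBackend` is a hypothetical stand-in that only logs what it is asked to do, and the comments name the Xlib calls each method would roughly correspond to. The point is purely the ordering.

```python
from typing import Dict, List, Tuple

Rect = Tuple[int, int, int, int]   # left, top, right, bottom
Point = Tuple[int, int]


class RecordingBackend:
    """Hypothetical stand-in for the X calls; it just records each request."""

    def __init__(self):
        self.calls = []

    def unmap(self, window):                 # roughly XUnmapWindow
        self.calls.append(("unmap", window))

    def copy_area(self, rect, delta):        # roughly XCopyArea
        self.calls.append(("copy_area", rect, delta))

    def set_clip(self, window, region):      # roughly an XShape call
        self.calls.append(("set_clip", window, tuple(region)))

    def move(self, window, position):        # roughly XMoveWindow
        self.calls.append(("move", window, position))

    def map(self, window):                   # roughly XMapWindow
        self.calls.append(("map", window))


def scroll_with_children(backend,
                         rect: Rect,
                         delta: Point,
                         configs: Dict[str, Tuple[Point, List[Rect]]]):
    """Scroll `rect` by `delta`; `configs` maps each moving child window
    to its new position and clip region."""
    # 1. Hide every child that will move, so its pixels count as the parent's.
    for window in configs:
        backend.unmap(window)
    # 2. Blit the scrolled area, child pixels included.
    backend.copy_area(rect, delta)
    # 3. Reconfigure the children while they are still hidden.
    for window, (position, region) in configs.items():
        backend.set_clip(window, region)
        backend.move(window, position)
    # 4. Show the children again at their new locations.
    for window in configs:
        backend.map(window)
```

Because the children are reconfigured only while unmapped, no intermediate state with a wrongly placed or wrongly clipped child ever reaches the screen.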

It's somewhat tricky to implement this. We have to compute the desired
child window locations and clip regions, store the results in a collection,
and pass that collection into our platform window abstraction's Scroll() API
so that the scrolling and widget configuration can happen as a unified
operation. But I've done it, and it works great on my Ubuntu VM. I'm not
100% sure that it will work well on all X servers and configurations; that's
something we'll need to get good testing of.

I do wonder why X has never felt the need to grow a genuinely useful
scrolling API. Oh well.