Thursday, 31 May 2012

The Canvas getContext() Mistake

The HTML <canvas> element has a getContext(type) method that takes a string parameter identifying a context type (e.g. "2d" or "webgl") and returns a context object of that type. The string parameter was introduced as an extension point to make it easier to add new context types.
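
For example, feature detection with the string-based API looks something like this (a minimal sketch; "experimental-webgl" was the prefixed name some browsers used before "webgl" was finalized):

    // Ask for each context by name and check whether you got one back.
    var canvas = document.createElement("canvas");
    var gl = canvas.getContext("webgl") ||
             canvas.getContext("experimental-webgl");  // older prefixed name
    var ctx2d = canvas.getContext("2d");
    if (gl) {
      // WebGL is available.
    } else if (ctx2d) {
      // Fall back to 2D canvas.
    }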

This was a mistake. It would have been simpler, and just as extensible, to define a new attribute on the <canvas> element for each new kind of context. E.g. canvasElement.context2d, canvasElement.contextWebGL, and even experimental vendor-specific contexts such as canvasElement.mozContext3d. Less code to write, slightly more efficient, and easier feature detection. (If some contexts need parameters, or we consider getting a context to have more side-effects than is seemly for an attribute getter, we could have used independent methods instead, e.g. canvasElement.getContext2d().)
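
For comparison, here's a rough sketch of what the attribute-based design could have looked like. None of these property names exist in any browser; context2d, contextWebGL, render2d and render3d are all hypothetical:

    // Hypothetical API: one attribute per context type, so feature detection
    // is just a property check, with no magic strings involved.
    if (canvasElement.contextWebGL) {
      render3d(canvasElement.contextWebGL);   // render3d: hypothetical app code
    } else if (canvasElement.context2d) {
      render2d(canvasElement.context2d);      // render2d: hypothetical app code
    }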

Of course we're stuck with getContext now for Web compatibility. The reason I mention this is because from time to time I see people trying to use getContext() as a model for extensibility in other Web APIs. Don't do that.

Actually, most of the time when I see someone trying to use getContext() as a model, they're using it because they think it gives them an escape hatch from the world of Web standards. They seem to think that it would be OK to pass in vendor-specific strings, get back vendor-specific objects, and never specify or even document the behavior of those objects. This is incorrect. It's no more acceptable to have permanently non-standard Web APIs accessed through vendor-specific strings than it would be to have them accessible through vendor-specific attributes.

Thursday, 24 May 2012

Firefox Vs The New York Times

For several months I've noticed that when my Firefox session gets bloated, about:memory shows zombie compartments associated with the nytimes.com site. I don't visit that site very often, but over a week or three a small subset of the pages that I visit there gets leaked. I worried about it but didn't know how to track it down. It's hard to debug problems that arise over a period of weeks. (It's a testament to the reliability of our nightly builds, though!)

Yesterday and today I discovered that by clicking around a large set of nytimes.com pages I could reproduce the problem in a reasonably short amount of time, with a fresh profile and a small set of tabs open. Furthermore, with about:ccdump and some help from Olli Pettay, I was able to identify the particular element which forced the cycle collector to keep everything else alive. The next step was to figure out where we added references to that element that the cycle collector didn't know about. It's a hard debugging problem because we don't know which element is the problematic element until long after the references have been added.

I tried a few approaches but tonight I finally found a successful one. I made a Windows full memory dump of the errant Firefox process, and attached to it with WinDbg. In the still-running Firefox process I then used about:ccdump to identify the address of the leaking DOM object. I was then able to use WinDbg's "s" command to search the entire memory space for occurrences of that address. Then I had to identify each occurrence. For the references the cycle collector knows about, this is easy: about:ccdump reveals the addresses of the referencing objects so you know those memory areas will contain the address of the leaking object. For other occurrences I had to dig around in nearby memory to figure out what sort of object contained the occurrence. Sometimes it's easy because there's a nearby vtable pointer and WinDbg can tell you which class the vtable pointer belongs to (using the "ln" command). Other cases were trickier. For example I found a reference to the leaking object next to a reference to an XPCNativeWrapper, in what looked like an array of similar pairs, and guessed that this was part of the hash table that maps DOM objects to their JS wrappers.

Anyway, I finally identified two references to the leaking DOM object from inside an nsFrameSelection object. These references were buried inside a copy of an nsMouseEvent that nsFrameSelection keeps for obscure reasons --- and doesn't tell the cycle collector about. Once I'd identified the problem, the fix was easy since we don't really need to keep that copy around at all. Hopefully it'll land soon.

WinDbg has awful usability but is quite powerful and can do useful things that Visual Studio's debugger can't. I'm a bit fond of it as the distant descendant of DEBUG.COM and SYMDEB.EXE, which I spent a lot of time with while reverse-engineering and patching MS-DOS binaries (i.e., stripping copy protection from games) in my misspent youth. Too bad the syntax hasn't improved over the years!

Sunday, 20 May 2012

Crosbie's Hut

Around this time last year I took my children to the Pinnacles hut and since then I've wanted to take them on a similar trip again. The Auckland region doesn't have many huts suitable for a one-night overnight tramp --- the Waitakere and Hunua Ranges, and Gulf islands, have many good campsites but no huts --- so I set my sights on Crosbie's Hut, in the same general area as the Pinnacles, near Thames on the Coromandel Peninsula. Unfortunately it's been difficult to get away --- weekends have been very busy, or the weather forecast has been poor, or the hut's been fully booked! This weekend we finally made it.

We entered at the Tapu-Coroglen Road, a bit more than two hours' drive from Auckland, where a track heads south to meet up with tracks from the coast at Te Puru and Waimoa and then to Crosbie's Clearing and the hut. I chose this route because I thought it might be a little easier to start higher above sea level, and because I thought walking across ridges might be less muddy than going up the valleys. It was still rather muddy! I sank in up to my knee at one point, which caused some anxiety in the group. The nominal time from the road to the hut was 4 1/2 hours and we took almost exactly that each way. It's very much a tramping track, not a walking track, and there's a lot of up-and-down.

We had a wonderful time. The weather was great, mostly very clear, and the hut is quite new and very well designed. The trampers arriving before us had started up the hut's wood-burning heating stove which was much appreciated. The view from the hut is superb and omnidirectional. To the west you can see the Firth Of Thames, to the east Whitianga and the Pacific Ocean. The night sky was scintillating.

One delight of bringing the kids is the interest and approval shown by other trampers. I guess it's not very common for kids as young as mine to be taken tramping, but the older trampers think it's great that I'm passing it on! (Though I'm not really passing it on, I'm learning myself!)

The other trampers at Crosbie's were all very experienced so I asked them a few questions. Most huts collect rainwater in tanks, and the Department of Conservation officially encourages people to boil or treat water used for drinking, but I haven't seen anyone do that yet. One guy I asked said he's never treated water at any of the DoC huts he's stayed at (even when taking it from a stream) and "nothing's happened to me yet". Hmm.

Thursday, 17 May 2012

Accelerated Scrolling In Firefox: Past, Present And Future

Scrolling performance is always important and we've made some pretty huge strides since Firefox 4. It may be helpful for me to explain some of what's happened.

In Firefox 3.6 and earlier, we had some pretty complicated code to analyze every scroll operation and figure out if we could safely scroll by just blitting some region of the window to another place and repainting the rest of the affected area. (In the very old days this was done in nsViewManager; later it was done using frame display lists.) This was quite difficult to do; for example an HTML <textarea> has a solid white background behind the scrolled text, so to prove that you can just blit the text, you have to determine that there is a solid-color background underneath the entire scrolled area. But the real problem was that it would quite often fail and we'd end up repainting the entire scrolled area, or most of it. For example, a large position:fixed element would require us to repaint the area around it. A background-attachment:fixed background would require us to repaint the entire window. This was bad because it meant scrolling sometimes "fell off a performance cliff".
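
To make that concrete, here's a small illustrative snippet (just test content sketched in JavaScript, not Gecko code) showing the two patterns mentioned above:

    // Two patterns that defeated blit-scrolling in Firefox 3.6 and earlier.
    // 1. A fixed-attachment background: scrolling meant repainting the window.
    document.body.style.background = "url(big-texture.png) fixed";
    // 2. A large position:fixed element: the area around it had to be repainted.
    var banner = document.createElement("div");
    banner.style.cssText =
        "position: fixed; top: 0; left: 0; right: 0; height: 80px; background: white;";
    banner.textContent = "I stay put while the page scrolls";
    document.body.appendChild(banner);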

In Firefox 4 we overhauled scrolling to use the new layer system. The basic idea is very simple: while scrolling, you put the content that's moving into one set of layers, and the content that's not moving into other layers. To scroll, add a translation to the transform matrix for the moving layers, repaint any parts of layers that have scrolled into view, and recomposite the window. If we do that properly, we'll never have to fall back to the "repaint everything" path; scrolling will just "always be fast". It also integrates cleanly when scrolled content uses accelerated layers for <video>, <canvas>, etc.
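
In pseudocode the idea is roughly this (a conceptual sketch only, not our actual compositor code; the helper names are made up):

    // Conceptual sketch of layer-based scrolling.
    function scrollContentBy(dy, movingLayers, viewport) {
      for (var i = 0; i < movingLayers.length; i++) {
        var layer = movingLayers[i];
        // The retained pixels are reused; only the translation changes.
        layer.transform.translateY += dy;
        // Repaint just the strip that has scrolled into view.
        layer.repaint(newlyExposedRegion(layer, viewport, dy));  // hypothetical helper
      }
      // Non-moving layers (position:fixed content etc.) are untouched.
      recompositeWindow();  // hypothetical helper
    }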

This being the Web, there are of course a lot of complications. Separating moving from non-moving content is not easy. The infamously complex CSS z-ordering rules mean that even when scrolling a single element, you can have moving content interleaved with non-moving content so that you have to have two or more layers for each.

When you place content in these separate layers, the layers can become difficult to render. In particular, text using subpixel antialiasing that's placed into a moving layer while its background is not moving needs to be stored with a separate alpha channel for each color channel ("component alpha"). This is difficult to implement efficiently. With GPU-based compositing we use shader tricks, but with CPU compositing it would be too expensive, so when we encounter this situation we back off, stop caching the text in a retained layer, and just draw it every time we composite the window. However, even with CPU compositing, we still win on a lot of pages that use position:fixed or background-attachment:fixed.
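
Conceptually, "component alpha" just means the usual over-compositing step runs with a per-channel coverage value instead of a single alpha. A sketch of the per-pixel math (not our actual compositing code; src is assumed to be the text color already multiplied by its per-channel coverage):

    // Blend subpixel-antialiased text over a destination pixel using
    // per-channel coverage ("component alpha").
    function blendComponentAlpha(src, coverage, dst) {
      return {
        r: src.r + dst.r * (1 - coverage.r),
        g: src.g + dst.g * (1 - coverage.g),
        b: src.b + dst.b * (1 - coverage.b)
      };
    }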

Another thing that makes layer-accelerated scrolling difficult is when scrolled content is subject to some effect from its ancestors. The most common difficult case is when an element combines non-'visible' 'overflow' with 'border-radius' and must clip its descendants to a rounded-corner rectangle. The best way to handle this is to add support to the layer system so a container layer can clip its descendants to the rounded-rectangle border-box of an element. Until recently we didn't have that, so scrolling inside such elements was forced to repaint everything, but Nick Cameron just landed a large project to add that capability to layers and to use it in all kinds of rendering situations, including when scrolling. That means in a scrolling element that's clipping to its border-radius, the scrolled content is rendered to retained layer buffer(s) as if there was no border-radius, and then we clip to the border-radius during compositing (very efficient with a GPU since we cache the border-radius shape in a mask texture). (Nick's project provides other benefits such as accelerated video and canvas inside containers clipping to their 'border-radius' without hitting nasty fallback paths.) Summary: in Firefox 15, scrolling inside an overflow:hidden/auto element with border-radius gets a lot faster.
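
The kind of content that benefits is simply a scrollable element with rounded corners, e.g. (illustrative only; makeLongContent is a hypothetical helper returning a DOM node):

    // Before Firefox 15 scrolling inside this element forced repaints; now the
    // scrolled content stays in retained layers and the rounded-rect clip is
    // applied at composite time.
    var panel = document.createElement("div");
    panel.style.cssText =
        "overflow: auto; border-radius: 12px; width: 320px; height: 240px;";
    panel.appendChild(makeLongContent());
    document.body.appendChild(panel);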

There is of course more to do. There are still CSS and SVG container effects we don't handle: namely filters, masks, and clip-path. Masks and filters need more layer-system support (especially filters!). Once those are done, then at least with GPU acceleration enabled it will be very difficult to hit any slow "repaint everything" paths when scrolling. (Although it's already very rare to see scrolling content inside a clip-path, mask or filter.)

The other pending important scrolling improvement is asynchronous scrolling, i.e., the ability to trigger a smooth scroll and have it happen immediately at a constant frame rate without jerkiness, even if the thread running the Web content is blocked due to JavaScript execution or whatever. We've already developed most of this for Mobile Firefox (and B2G), but it needs to be made to work on desktop as well, which is not trivial. It requires enabling asynchronous compositing on all our platforms, and teaching the compositor a bit about scrolling. Once that's done, because we're able to layer-ize scrolling in almost every situation, we'll be in extremely good shape.

Tuesday, 15 May 2012

Sad And Pathetic Machines

On Saturday I visited a friend’s house to see if I could help them with slowness problems on their home computer. This was a six-year-old machine running XP with 448MB RAM. I observed that on startup Windows Update would run, and while it was running pretty much all the RAM in the system was consumed by Windows, wuauclt.exe and svchost.exe (which assists Windows Update). During this time, starting Firefox or IE took minutes; the machine would thrash itself senseless. This state lasted for quite a long time, about half an hour, probably exacerbated by my attempts to get stuff done. Once it subsided, Firefox started quickly and ran well.

This is apparently a known problem and some kind of Microsoft regression.

Under these conditions, Firefox startup time and other metrics are bound to be awful.

Update: I forgot to mention that the Microsoft malware checker was also running at the same time as Windows Update and contributing significantly to resource usage. I guess it checks the downloaded and installed updates for malware...