Monday, 31 May 2010

Visiting Toronto

On Friday afternoon I'm flying to Toronto to attend PLDI and the associated LFX workshop, where I'll be giving a talk on "A Browser Developer's Wish List":

Web browser development is extremely challenging: compatibility with billions of legacy Web pages, specification and implementation of new Web-standard platform features, a wide range of devices and platforms to target, hundreds of millions of users to support, and severe security and performance challenges, all in an environment of white-hot competition. At Mozilla we use a variety of tools and processes to try to cope. This talk will analyze where they have been successful and where they fall down, with particular focus on major pain points such as nondeterministic test failures, mismatch between conditions "in the lab" and "in the field", and grappling with legacy code. We have bright hopes for the future, such as static array bounds checking and record-and-replay debugging, but other areas look dire, so I'll present my research wish-list --- including better performance analysis tools, verified refactorings, and static analysis packaged as assistance for code reviewers.

I arrive on Friday night, plan to attend workshops on Saturday and Sunday, PLDI itself Monday through Wednesday, and then on Thursday and Friday I plan to be in the Mozilla Toronto office, possibly with a side trip to the University of Toronto. I fly out on Friday afternoon, hopefully arriving home on Sunday morning. It'll be a great chance for me to reconnect with the research community I used to be part of, and catch up with a lot of old friends and colleagues. I'm really looking forward to it!



Dear Conservatives

One of the things that bothers me about "conservative" political movements is that in spite of their advocacy of self-reliance, private enterprise and limited government (which I agree with!), in times of trouble their members tend to look to the government to solve problems. It happens in economic crises and natural disasters, it happens with trade and farm subsidies, and now it's happening again with the BP oil spill. Private enterprise created the problem; surely private enterprise can solve it? Why are conservative pundits blaming the federal government for not solving it?

Yeah, I know why.



Tuesday, 25 May 2010

Nero Vs MPEG-LA

Nero's complaint against the MPEG-LA is interesting. It seems to be primarily about the MPEG-LA deciding that Nero needs to pay H.264 license fees for "free trial" copies ... an interesting case study of the risks of murky license agreements. But there's a side antitrust issue. They dug up a Department of Justice memo that OK'ed the formation of the MPEG-LA on the condition that the MPEG-2 pool was limited to "essential" patents, with an independent expert verifying that each patent really was essential. The MPEG-LA chose as the independent expert their own co-founder and general counsel. Oops! Also, after telling the DoJ that 27 patents constituted "most" of the essential patents in the MPEG-2 pool, they proceeded to add 800 more. (The H.264 pool has even more.)



Saturday, 22 May 2010

Ip Man Fail

I just finished watching Ip Man on DVD. It was a good movie, but the subtitles were literally the worst I have ever seen in 20 years of watching many, many subtitled Chinese movies. I'm accustomed to confusing, silly, stilted, and absurd subtitles, but these were simply gibberish. My wife provided a running translation of the Cantonese but the Japanese left us both bemused.

It's hard to convey the awfulness ... None of the characters' names were transliterated; instead they were translated literally into English and rendered without capitals, so the protagonist is referred to variously as "leaf teacher" and "leaf ask", his son is rendered as "leaf quasi", another character is just "wood", a villain is "three general", but some characters get names like "LIU2 BAO3". Word order is basically completely random. Martial arts challenges are described as "exchanging views" or "vexing". People kept saying "are you dry" --- I have no idea what that was about. Forget about punctuation.

I'm actually curious about the process that could lead to such terrible results. There must be fifty million people who could do a better translation, not to mention Google Translate, so how did they end up with someone doing it this badly? Maybe the DVD I borrowed is actually an illegal copy, but why would they make their own subtitles? And surely there are Laotian street kids who can do a better job than this.



Friday, 21 May 2010

WebM

It's a relief to be able to talk about the new WebM video format. Development of WebM support in Firefox has all happened in our Auckland office (with some last-minute build support from Nick down in Timaru, and some quick hacking by Justin Dolske to make the context menu useful on Youtube). It's been crazy and stressful, but it's also been really good because WebM is going to be great for the open Web. There are several major advances here for unencumbered video on the Web:


  • Youtube. Youtube supporting WebM is psychologically massive. It also means we won't have to choose between our principles and being the only browser that needs plugins to play Youtube.
  • Better video quality. There's a lot of noise from H.264 proponents about Dark Shikari's analysis of VP8, but even he agrees it's better than H.264 Baseline, which is pretty much all the H.264 on the Web today (partly because that's all most current H.264 hardware supports). We will also make ongoing improvements, perhaps by merging in some of the work done for Theora.
  • Hardware support. Several companies have announced support for VP8 in their hardware.
  • Flash. Support for WebM in Flash may actually be the most important thing for authors and for the success of unencumbered video. It means authors will be able to publish in one format, WebM, and it will work everywhere (except the iPhone/iPad I guess!). Not only is that great for authors, it will put great pressure on the holdouts to add native WebM support. Thanks Adobe!

And thanks Google. Spending $120M on On2 and then releasing their IP assets under a BSD license is hugely appreciated.

It's important to remember that as exciting as this is, it's only the beginning. We expect significant improvements to the VP8 encoder and decoder. For Gecko, we need to get our current patches into mozilla-central, we need to fix some bugs, and we need to add a few desperately needed features to our <video> element support, such as the buffered attribute, plus a generic fullscreen API.

By the way, there's an important lesson here for all the people who were counseling us to give up on unencumbered video as a lost cause and embrace the MPEG-LA. It's a mistake to give in to despair when you don't know the future (and the world being an unpredictable place, you very rarely know the future).



Wednesday, 12 May 2010

Discontent On The Web

I just read Sachin Agarwal's post titled "The web sucks. Browsers need to innovate" and my head exploded. But I can see that he's not alone in his views.

His basic thesis --- at least, the part that makes my head explode --- is that standardization efforts are slowing down innovation on the Web and therefore browsers should just provide whatever APIs they want and make no effort to standardize. Web authors should target particular browsers and then market pressure will force other browsers to implement those APIs.

Well, we tried that and it didn't work. It was called IE6. There were several problems, but the main problem is that the last step --- post-facto cloning of APIs based on reverse-engineering of the dominant browser --- is absolutely horrible in every way. It's expensive, slow, error-prone, and leads to a crappy platform because the dominant browser's bugs *are* the standard so you're stuck with whatever insane behaviour the dominant browser happens to have.

It's also highly prone to creating monopolies due to network effects. Agarwal supposes that users will just install several browsers and view each site using the browser it works in. But that won't happen, due to corporate policies, platform limitations (how many browsers can you use on the iPad, again?) and sheer inconvenience. Inevitably one dominant browser will rise, the others will find it impossible to keep up with the reverse-engineering, and it will be IE6 all over again. Keep in mind that if Firefox --- and standards --- hadn't broken down the IE6 hegemony, Safari on the iPhone would be useless and the iPhone probably wouldn't be where it is today.

This sentiment is ill-timed. For a long time we had to burn a lot of energy to recover from the damage of the IE6 era. Now we are really in a new era of innovation on the Web. There is far more investment in browsers than there ever was before, and we're expanding what Web apps can do faster than ever. Agarwal complains

Web applications don't have threading, GPU acceleration, drag and drop, copy and paste of rich media, true offline access, or persistence.

but we've already shipped HTML5 offline apps, drag and drop, and Web Workers in Firefox, we've got WebGL and GPU-accelerated everything in the works (some already in nightly builds), and WebSimpleDB is coming along for persistent storage.

One area where we do have a problem is ease of development, especially tools. Agarwal is right about that.

Parting shots:

Right now browser updates fix bugs and add application features, but can't enhance the functionality of the web. This is only done by standards boards.

This is just incorrect. Browsers add functionality on their own initiative all the time --- hopefully prefixed so that developers know it's non-standard.

Browsers are forced to implement every "standard" that is agreed on, even if it's not the best decision for the platform.

Not at all true. Right now I'm explaining to people that we don't want to implement SVG fonts because they're almost entirely useless.

Browsers don't add functionality outside of standards because developers wouldn't utilize them. This means they can't innovate.

Untrue, see first point. But we get the best results when standards are developed in parallel with implementations.

Browsers don't even comply with standards well. Developing for the web is a disaster because every browser has its own quirks and issues. They can't even do one thing right.

But it's constantly getting better, especially as old Microsoft browsers go away. Plus you can't complain about cross-browser compatibility and advocate targeting one browser at the same time.

When GMail launched in 2004, it took one step forward and 10 steps backwards from the mail application I was using. Even today, the major features GMail is releasing are simply trying to match the features I've had on the desktop for years.

And yet, people have migrated to Web-based email en masse. Why is that?

I think this is the tipping point for the web. The modern web had over 10 years to reach parity with desktop applications, and it couldn't even hit that. Now it faces extinction as innovation in native applications accelerates.

The Web didn't exactly fail while it was miles behind native apps. Now it's a lot closer and moving a lot faster.


Standardizing XUL Flexible Boxes In CSS

Tab Atkins is working on a new draft spec for XUL-style "flexible boxes" in CSS. One issue that has come up is whether the XUL concept of "preferred widths" is essential or not. It would be really useful if XUL developers could contribute to the thread in www-style with feedback on that and other issues in related threads.

More detail: the children of a display:-moz-box element can have both width and -moz-box-flex CSS properties set on them. The width property sets a preferred width for each child. Any leftover space in the container is allocated to the children in proportion to their -moz-box-flex values. Likewise, if the preferred widths add up to more than the width of the container, the excess space is trimmed from the children in proportion to their -moz-box-flex values.
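To make the arithmetic concrete, here's a toy sketch of the model described above (this is C++ for illustration only, not real Gecko layout code; the Child struct, the LayoutBox helper and the numbers are all made up):

  #include <cstdio>
  #include <vector>

  // Toy model of -moz-box horizontal layout: each child has a preferred
  // width (from 'width') and a flex value (from '-moz-box-flex').
  struct Child {
    double preferredWidth;
    double flex;
  };

  // Distribute the container's leftover space (positive or negative)
  // among the children in proportion to their flex values.
  std::vector<double> LayoutBox(double containerWidth,
                                const std::vector<Child>& children) {
    double totalPreferred = 0, totalFlex = 0;
    for (const Child& c : children) {
      totalPreferred += c.preferredWidth;
      totalFlex += c.flex;
    }
    double leftover = containerWidth - totalPreferred;  // may be negative
    std::vector<double> widths;
    for (const Child& c : children) {
      double share = totalFlex > 0 ? leftover * (c.flex / totalFlex) : 0;
      widths.push_back(c.preferredWidth + share);
    }
    return widths;
  }

  int main() {
    // 300px container, preferred widths 100px and 150px, flex 1 and 2:
    // the extra 50px is split 1:2, giving 116.67px and 183.33px.
    for (double w : LayoutBox(300, {{100, 1}, {150, 2}}))
      printf("%.2f\n", w);
    return 0;
  }

(min-width and max-width clamping is omitted in this sketch.)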

An alternative, simpler model would be to not have preferred widths: just size each child proportionally to its flex value. min-width and max-width could still be honoured. It's unclear whether this simpler model is too simple.

So if you have something to say about how you use flexboxes, now's the time to contribute and ensure your needs are addressed!



Tuesday, 4 May 2010

CGLayer Performance Trap With isFlipped

For the last few days I've been working on maximising performance of my new code for scrolling with "retained layers". Basically it's all about fast blitting: I want to quickly "scroll" the pixels within a retained layer surface, and then quickly copy that surface to the screen/window. This will have many advantages in the long run, but initially I have to work hard to make it as fast as the current scrolling approach for simple cases, because in simple cases the current approach --- call a platform API to move pixels on the screen, then repaint the strip that scrolled into view --- is hard to beat.

Copying pixels around quickly within a surface is a bit of a problem in cairo. cairo currently doesn't specify what should happen if a surface is both the source and destination of a drawing operation, and at least pixman-based backends do weird things in many of these cases. I think unspecified behaviour is bad, and cairo should just define this case so that the "obvious thing" happens --- it's as if you made a copy of the surface and used that copy as the source while drawing into the surface. It's not hard to implement that way for the general case, and for specific cases we can optimize self-copies quite easily to avoid the temporary surface (e.g., when the self-copy is a simple integer translation of the surface contents). So we can fix the self-copy problem in cairo.
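As a sketch of what I mean for the general case --- this is illustrative C++ against the public cairo API, not actual cairo or Gecko code, and the SelfCopy helper is made up --- the "copy via a temporary surface" semantics would look something like:

  #include <cairo.h>

  // Hypothetical illustration of self-copy semantics: shift the contents of
  // aSurface by (aDx, aDy) as if a snapshot of the surface were the source.
  // Width/height are passed in because cairo has no generic size getter.
  static void SelfCopy(cairo_surface_t* aSurface, int aWidth, int aHeight,
                       int aDx, int aDy) {
    // Snapshot the current contents into a temporary surface.
    cairo_surface_t* tmp = cairo_surface_create_similar(
        aSurface, cairo_surface_get_content(aSurface), aWidth, aHeight);
    cairo_t* cr = cairo_create(tmp);
    cairo_set_source_surface(cr, aSurface, 0, 0);
    cairo_paint(cr);
    cairo_destroy(cr);

    // Draw the snapshot back into the original surface, offset by (aDx, aDy).
    cr = cairo_create(aSurface);
    cairo_set_source_surface(cr, tmp, aDx, aDy);
    cairo_rectangle(cr, aDx, aDy, aWidth, aHeight);
    cairo_set_operator(cr, CAIRO_OPERATOR_SOURCE);
    cairo_fill(cr);
    cairo_destroy(cr);
    cairo_surface_destroy(tmp);
  }

The optimized integer-translation case would skip the temporary entirely and just move the pixels directly.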

My bigger problem has been wrestling with the OS X Core Graphics APIs. Currently cairo-quartz uses CGBitmapContexts for its surfaces, because they give us easy direct access to pixel data and they're easy to use as CGImages if we draw one surface to another with tiling. However, Apple docs enthusiastically recommend using CGLayers for improved performance. Indeed the QuartzCache example shows a significant performance boost from using CGLayers instead of CGBitmapContexts. So I've got a patch that adds to cairo-quartz the ability to create surfaces backed by CGLayers.
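For comparison, the basic CGLayer pattern Apple recommends looks roughly like this (a simplified sketch, not the actual cairo-quartz patch; destCtx stands in for the window or bitmap context you're ultimately drawing to):

  #include <ApplicationServices/ApplicationServices.h>

  // Create a layer "compatible with" the destination context, render the
  // retained content into the layer's own context, then blit the whole
  // layer back into the destination. That compatibility is what should let
  // CoreGraphics take its fast blit path on the final draw.
  static void DrawWithLayer(CGContextRef destCtx, CGSize size) {
    CGLayerRef layer = CGLayerCreateWithContext(destCtx, size, NULL);
    CGContextRef layerCtx = CGLayerGetContext(layer);

    // Render something into the layer (here: just a red rectangle).
    CGContextSetRGBFillColor(layerCtx, 1.0, 0.0, 0.0, 1.0);
    CGContextFillRect(layerCtx, CGRectMake(0, 0, size.width, size.height));

    // Blit the layer to the destination; ideally this hits the fast path.
    CGContextDrawLayerAtPoint(destCtx, CGPointMake(0, 0), layer);

    CGLayerRelease(layer);
  }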

Unfortunately, even these CGLayer surfaces don't make scrolling as fast as I want it to be. Shark profiling shows that we spend 20% of the time in the self-copy, moving pixels in the CGLayer, and then 60% of the time actually copying that CGLayer to the window. That sounded a bit wrong since the whole point of CGLayers is that you can efficiently blit them to the window. So I looked closer at the profile and noticed the CGLayer copy time is all in argb32_image_mark_rgb32, while if I profile the QuartzCache CGLayer example (modified to more closely emulate what we do when scrolling), copying the CGLayer to the window uses sseCGSBlendXXXX8888 (via CGSBlendRGBA8888toRGBA8888). Googling, plus inspection of the machine code of those functions, shows that argb32_image_mark_rgb32 is a fairly nasty slow fallback path, and CGSBlendRGBA8888toRGBA8888 is the really fast thing that we want to be using. So the question remains, why are we getting the slow path in my layers code while the QuartzCache example gets the fast path?

This was really painful to answer without CoreGraphics source code. I did some reverse engineering of argb32_image, but it's a huge function (20K of compiled code) and that wasn't fruitful. Instead I experimented: eventually I just wrote some code that creates a layer for the CGContext we obtain from [[NSGraphicsContext currentContext] graphicsPort] in our NSView's drawRect method, and immediately blits that layer back to the context. Still slow.

Clearly there's something wrong with the state of the CGContext of our NSView. But how does our NSView set up its context differently from the QuartzCache example? Then I recalled that we return YES for isFlipped in our NSView to put it into the coordinate system other platforms expect --- (0,0) at the top left. So I tried returning YES for isFlipped in the QuartzCache example --- bingo, it slows right down and takes the argb32_image_mark_rgb32 path. In fact it looks like returning YES for isFlipped slows down a lot of the APIs used in QuartzCache...
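As far as I can tell, returning YES from isFlipped just means the CGContext handed to drawRect has a y-flip baked into its CTM, roughly equivalent to this (my reconstruction in plain CoreGraphics calls, not code from Gecko or the QuartzCache example):

  #include <ApplicationServices/ApplicationServices.h>

  // What a flipped NSView effectively does to its context before drawing:
  // flip the y axis so (0,0) is at the top left. That extra transform seems
  // to be enough to push CGContextDrawLayerAtPoint off the fast CGSBlend
  // path and onto argb32_image_mark_rgb32.
  static void FlipThenBlit(CGContextRef ctx, CGFloat viewHeight, CGLayerRef layer) {
    CGContextSaveGState(ctx);
    CGContextTranslateCTM(ctx, 0, viewHeight);
    CGContextScaleCTM(ctx, 1.0, -1.0);
    CGContextDrawLayerAtPoint(ctx, CGPointMake(0, 0), layer);
    CGContextRestoreGState(ctx);
  }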

Conclusion: for high-performance graphics on OS X, avoid isFlipped. Or something like that. It's fairly bogus that adding such a simple transform to the CGContext would hurt performance so much, but so it goes...

I'm not quite sure how we're going to fix this in Gecko yet. I'll be brainstorming on #gfx on IRC tomorrow!