Thursday, 22 April 2010

Kingitanga Day

Today I drove down to Hamilton for their Kingitanga Day celebrations. I was honoured to be invited to give a talk as one of the many events of the day. I gave another talk on Changing The World, though quite different from last week's keynote. In this talk I focused on Mozilla's efforts to change the world, and how we work as an open source community to make changes. I went on to talk about how and why people should contribute to open source communities in general. Most of the material would have been obvious to people who are familiar with Mozilla. I had a bit less time than usual so I had to race through the second half --- I hope that didn't put people off!

My talk was followed by one from Stu Sharpe of Sidhe, a games development company in Wellington, about game development. It was pretty interesting --- nothing super surprising, but it's good to know that kind of work is being done in NZ.

Monday, 19 April 2010

Changing The World

Last Tuesday I gave a keynote at the NZCSSRC at Victoria University in Wellington. I was a bit nervous because I've never given a keynote before. From my point of view, it went well --- I felt good while I was giving it, and afterwards several people talked to me having obviously reflected on what I'd said and how it applied to their own work. However, I never trust my own feelings too much so I'm not 100% sure how it went down :-). I've got slides here.

I kicked off my talk with the parable of the talents. OK, that's not something you expect to see at a computer science conference, but this was a keynote so I took some extra latitude :-). I pointed out that the word "talent" was originally a unit of mass, particularly of precious metals, and that it acquired its current meaning through this parable. Thus, for as long as it's had its current meaning, it has been associated with the idea that with talent God also gives the responsibility to make the best use of it. I emphasized that this is as true for computer science talent as for other kinds, perhaps even more so because computer science has this wonderful property that we can deploy incredible functionality at near-zero marginal cost. I pointed out that if you make a change to Firefox that saves one second per user per day, it's like saving three thousand lives. The rest of my talk was about ways to maximise one's impact on the world.
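That "three thousand lives" figure can be reconstructed with back-of-the-envelope arithmetic. The sketch below uses my own illustrative assumptions (user count, lifespan, time horizon), not figures from the talk:

```python
# Back-of-the-envelope: how much human time does one saved second add up to?
# All inputs are illustrative assumptions, not figures from the talk.
users = 350_000_000                      # assumed daily Firefox users
seconds_saved_per_day = users * 1        # one second saved per user per day

# Person-years of attention reclaimed per year the change is deployed.
seconds_per_person_year = 86_400 * 365.25
person_years_per_year = seconds_saved_per_day * 365.25 / seconds_per_person_year
# ...which simplifies to users / 86_400, about 4,050 person-years per year.

lifespan_years = 70                      # assumed average lifespan
horizon_years = 50                       # assumed useful life of the change
lives_equivalent = person_years_per_year * horizon_years / lifespan_years
# roughly 2,900 lifetimes --- the right order of magnitude for the claim
```

Change any of the assumed inputs and the number moves, but the conclusion --- that a one-second saving across a huge user base is lifetimes of human attention --- is robust.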

I distinguished "research" from "development" by defining "development" as building something that will be used in practice, while "research" creates and transmits knowledge that will help others build something practical. (These are not mutually exclusive for a given project --- development projects almost always create and transmit knowledge too.) Obviously, only development has direct impact, but pure research can be useful because it lets you drop constraints, so you can understand problems more clearly and iterate faster on solutions. I think characterizing research based on which constraints have been dropped is a very good way to understand the place of your work in the world. Many factors influence people's choices about whether to do research or development. It's simpler to have impact with development. However, research offers more crisp intellectual problems because you can drop ugly constraints, and it also lets you publish more because you can iterate faster.

Many people think that solving problems is the hard part of research, but in my view, problem selection is the hardest part of research. Most research I see won't have an impact even if it's completely successful because of poor problem selection. The ideal problem is crisp and intellectually satisfying, yet a solution would have immediate impact. Such problems are rare but since the space of problems is so large, they do exist! Find them!

I brought up program slicing as a negative example. Since 1984 hundreds or thousands of papers have been written about program slicing, but in reality it is almost never used. I believe the research in this area has been completely pointless. Probably hundreds of millions of dollars have been wasted, not to mention enormous amounts of the time of very smart people. This is criminal misuse of resources. It's important to note that just because large numbers of smart people are working on a problem, it's not necessarily a good one!

I observed that most researchers choose problems by first having a technology or skill in mind and then looking for "applications" --- problems that their technology might solve. I call this the "solution-driven" approach to problem selection. I used it throughout my career. I believe it's a huge mistake. At the outset you restrict the problem space to those problems that look like they might be a fit for your technology, you rank those problems by how well they fit your technology instead of how important they are to the world, and even then you run the risk that your technology is not the best solution for those problems.

A better approach is "problem-driven": start by identifying important problems, and then find the best solution to them. It sounds obvious when you say it that way, but it's not what most researchers do :-). For example, we know that debugging is an incredibly important problem because most programmers spend most of their time doing it. So one could analyze the problem of debugging, and figure out how to make debugging faster or cheaper, or how to avoid those bugs being present in the first place. Do this without having preconceived ideas about what the solution should be. Use "Wizard of Oz" techniques to evaluate solutions before creating them --- e.g., use human intelligence to pretend to be a tool and see if the results are helpful, before you build the tool. You'll be able to iterate much faster and you won't be emotionally invested in proving that the tool is useful.

The downside of the problem-driven approach is that often the solutions will demand expertise you don't have. You may be required to learn something new. I think that's OK. You can also collaborate, or even hand the problem off to someone else and retry problem selection.

I went on to give some tips about publishing. If you want to do research with impact you should publish in top conferences and journals, because the lesser ones are ignored. You often see research in top conferences that actually repeats work previously published in lesser venues, because the authors didn't know about the earlier work ... and the later top-conference papers end up more widely cited. Sad but true. Publishing in top conferences is not that hard if you know how to play the game and you have selected good problems. Read lots of proceedings and journals to understand what kinds of papers are accepted at a conference and how they should be written. Don't be discouraged by rejection --- paper reviews are very random. Try to choose problems that are amenable to compelling evaluation; for certain kinds of problems it's very difficult to prove that you solved them, for others it's easy.

I ranted a bit about negative results. We're not surprised when we build something and it works --- if we hadn't expected it to work, we probably wouldn't have done it. Thus we are surprised when it doesn't work. Surprising results are more interesting, therefore negative results are more interesting, especially if the failure was for an interesting reason, rather than "it was too hard to solve" (although that can also be interesting). Unfortunately, you generally can't publish negative results; the research community is just broken this way. I would like to see a Journal Of Negative Results, but people would probably be afraid to publish there.

I ranted about corporate research. Corporate research labs are started in times of plenty as vanity projects --- "research is awesome, we're awesome, let's do it". But genuine research (by my definition above) hardly ever benefits the company doing it --- it should be thought of as corporate philanthropy --- so over time, to justify themselves, the labs do some amount of development as well. Unfortunately, artificially separating that development from the development done by "product" groups creates problems of "tech transfer". Those problems would not exist if you simply put those people into product teams. This is more or less what Google does, as I understand it. (However, there can be tactical advantages to having a separate lab; it might let you attract and keep smart people you couldn't otherwise get.)

I proceeded to the most controversial part of my talk: how to improve the quality of computer science research done in New Zealand. Top people want to work with other top people --- not only top peers in their area, but top collaborators in other areas if needed. Thus, groups of top people attract more top people --- students, researchers and faculty. This is one reason why the rankings of the best research universities are so stable. Therefore I proposed collecting the best researchers in New Zealand into a single institution; this would be more effective at attracting top researchers to New Zealand --- and at keeping them here. It's not a zero-sum game. There would also be an education benefit: the majority of students are interested in vocational training, not computer science, and they could be directed to other institutions, so this elite institution could focus on teaching computer science to the small number of students who want to learn it. This would be very good for the minority of employers, like me, who need graduates with hard-core computer science skills.

Surprisingly no-one tried to shoot me down on stage, but I got some feedback later :-). Some good issues were raised, but I still think it would be desirable. Of course it would be politically extremely difficult to make happen.

I concluded by talking more about development. The common idea that research is more intellectually challenging is false; generally, dropping constraints makes problems easier. Research simply gives you more freedom to ignore uninteresting problems. But there are development jobs that are very interesting with huge impact!

Megacorporations are horrendously inefficient. Large numbers of people work on many projects that make no sense at all. If you go to work for a megacorporation, make sure you're going to work on a specific project that you know will have impact. Otherwise, go for a small organization; they're much more efficient so your work will not be diluted. Consider contributing to open source projects; they usually have immediate impact, they let you disseminate your work widely, and you can learn a lot from them.

Other things being equal, choose projects lower on the software stack. Putting parts together has less influence and impact than designing the parts. The latter affects more users too. Other things being equal, choose projects with more users (or potentially more users), since impact scales with the number of users.

Obviously, we should be striving to have positive impact, not negative impact! If you are regularly embarrassed by your employer, leave. Computer science people are fortunate to have many opportunities to work for employers who will not embarrass us.

Yeah, it was fun. On Wednesday I'm going down to Hamilton to give a similar talk at the University of Waikato for Kingitanga Day. I'll probably talk less about how people can have impact through research and more about how I'm having impact at Mozilla and its open source community, and how other people can have impact through open source.

Saturday, 17 April 2010

Accelerating Theora On The N900

Last year Mozilla funded development of Leonora, David Schleef's port of the Theora decoder to run on the TI C64x+ DSP that's found in the Nokia N900, Palm Pre, Motorola Droid and other mobile devices. Since then Matthew Gregan has spent some time integrating it into a complete player with actual video playback, sound, and A/V synchronization. The bottom line is that using Leonora and certain graphics acceleration tricks we can play full-screen video (800x480) with sound at 26 frames per second with the CPU 80% idle. This shows that free codecs can work well on mobile devices. Read Matthew's blog post for the juicy details.

API Design For The Masses

On Monday I gave a talk at the Victoria University computer science department which I called "API Design For The Masses". I've put the slides online. The premise is that designing specifications for the Web requires you to make different decisions than in the "good old days", when a relatively small number of professionals were writing specifications and implementing both sides of those specifications. I give eight design principles for Web-scale specs, mostly based on examples.

The contents should not be surprising to anyone involved with the WHATWG. I didn't invent these principles, I just wanted to summarize my version of some of what the community already knows.

These are my principles:

  1. Try to define behaviour for all inputs. Avoid "undefined" behaviours. Simplify or eliminate "fatal" behaviours.
  2. Specs that contradict entrenched practice must be changed to match it. If following the spec harms users, that part of the spec is worthless.
  3. Standardization and implementation must happen concurrently. You need a fast feedback loop, which means you need to be able to update the spec frequently and with low latency.
  4. Provide non-standard/pre-standard extensions, but use syntax to ensure authors know they're non-standard.
  5. Favour evolution over revolution if possible.
  6. Aim for forward and backward compatibility (with graceful degradation). Avoid versioning.
  7. Encourage conditional content to use detection of features, not detection of implementations. Provide mechanisms for such conditionals.
  8. Only add features where authors can see and correct their mistakes. Invisible metadata doesn't work.
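Principle 7 is the web's "feature detection, not browser sniffing" rule, and the idiom carries over to any language. Here's a hypothetical Python sketch (the engine classes are invented for illustration) contrasting the two styles:

```python
# Feature detection vs implementation detection, in miniature.
class OldEngine:
    def draw(self, shape):
        return f"drew {shape} slowly"

class NewEngine(OldEngine):
    def draw_fast(self, shape):          # a newer, optional capability
        return f"drew {shape} quickly"

def render(engine, shape):
    # Ask about the capability itself, not which implementation this is.
    # Any engine that later grows draw_fast benefits automatically, whereas
    # `isinstance(engine, NewEngine)` would misclassify every third-party
    # engine and need hand-updating as implementations evolve.
    if hasattr(engine, "draw_fast"):
        return engine.draw_fast(shape)
    return engine.draw(shape)
```

On the web the same conditional is written against features (`if (window.someAPI)`) rather than against user-agent strings, which is why the principle asks spec authors to provide detection mechanisms for each feature.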

Friday, 16 April 2010

Fermi, Darwin, and Christians

Via Slashdot, this article attempts to explain Fermi's "paradox" but completely fails to do so. Instead it predicts that neo-Luddite Christians and other social conservatives will inherit the earth, which I'm guessing was not the writer's original thought. Bravo for publishing it anyway! Most amusing.

Saturday, 10 April 2010

More Apple Evil

Chris Double just pointed out to me the new iPhone development agreement terms that require an application's original source language to be C, C++ or Objective C (plus maybe JS running on Webkit). Basically they're outlawing compilers like Adobe's Flash-to-iPhone compiler, Innerworks' J2ME-to-iPhone compiler, the Mono iPhone compiler, etc. They're also forbidding compatibility layers that wrap Apple's APIs.

This is mind-boggling. There's no possible technical justification for this. It's pure evil. I'm not even sure why Apple would do this. Maybe to make it harder for people to write cross-platform apps that run on the iPhone?

The lesson for those companies that just got hung out to dry is simple: don't build on locked-down platforms.

Thursday, 8 April 2010

Upcoming Events

My next couple of weeks are going to be rather busy with public speaking engagements...

Tomorrow afternoon I'm going to be on a panel at the ASWEC 2010 software engineering conference at Massey in Auckland. The panel's about "Engineering Software For Economic Growth", which sounds like it could be boring but I'll try to liven it up :-). I'll claim that although most software development is repetitive and lends itself to predictable engineering-style processes, innovative projects are inherently less predictable, and the latter are more economically desirable in the long run.

On Monday I'll be down at Victoria University in Wellington to give a talk on "API Design For The Masses". This will be about what we've learned designing and implementing Web specs. Designing formats, protocols and APIs that will be used by millions of people (e.g. HTML) poses fundamentally new requirements over designing for an audience of hundreds or thousands of people (e.g. TCP/IP). Another title for this talk could be "Extending The Web Platform For Dummies".

On Tuesday through Thursday I'm going to be at the New Zealand Computer Science Student Research Conference in Wellington. I'll be mostly offline during this time. On Tuesday morning I'll give a talk about "changing the world". This will be an exhortation for people to apply their computer science skills to solve interesting problems that matter --- encouraging them to avoid the pitfalls of boring corporate software development and futile ivory-tower research. I want to demolish the idea of "academia versus industry": you can do boring work, or interesting work, in either environment. Problem selection is everything.

On Wednesday the following week I'm going down to the University of Waikato to give a talk on Kingitanga Day. This talk will be a variant of the previous talk, with more of a focus on working in open source communities to change the world, especially in the context of Mozilla.

Monday, 5 April 2010

Big Week

It's been quite a big week in Gecko-land!

Bas Schouten landed his OpenGL layers backend and we're using it to accelerate full-screen video on Windows.

David Baron landed his fixes to the :visited privacy leak.

Chris Pearce landed the new Ogg decoder to get rid of some nasty problems and make forward progress faster.

Josh Aas turned on out-of-process plugins for Mac.

Of course, lots of other people made huge contributions to these projects.


I took the kids out camping at the Craw campground near Anawhata for a couple of nights. The weather was perfect and we had a lot of fun. We walked down to Anawhata beach on Friday night, and on Saturday we walked down to and around Piha. One of the great things about camping, compared to day hikes, is that you can be out and about at dawn and dusk.

Craw campground is one of the new campgrounds for the Hillary Trail. After dark on Friday night a large number of teenage girls descended on the campground, completely exhausted after hiking all the way from Muriwai, and pitched their tents all around us. The kids and I being snug inside our tent, the girls were oblivious to our presence and there were, shall we say, some misplaced assumptions of auditory privacy. I probably should have bellowed something in my gruffest available voice, but at the time I wasn't sure what was appropriate. It was one of those weird social situations they don't tell you about in the manual.

View of White's Beach from the track down from Anawhata Road

Tasman Sea from Anawhata Road at sunset

Friday, 2 April 2010


Bas already wrote a good introduction to our new layers framework for cross-platform, GPU-assisted, retained-mode rendering. I've been meaning to blog about it myself, since it's what I've mostly been working on for the last few months, but to be honest I held back since I wasn't 100% sure whether, how or when it was all going to work. Now that fullscreen video playback is GPU-accelerated on trunk (currently only on Windows with GL), and I've made considerable progress in other areas of the work, I'm a bit more confident :-).

As Bas explained, the goal is to use the GPU for fast compositing and other effects. We also need to enable compositing on a thread other than the "main thread" where most browser operations (such as Javascript) run, so that animations and video playback can run smoothly even when the main thread is blocked on other work for significant periods of time. (Frameworks such as Core Animation --- which Webkit uses on Mac --- do this, and it rocks.) Off-main-thread compositing means we need some kind of 2D scene graph that's accessible off the main thread. To be GPU-friendly it behoves us to make the elements of this graph as large and as few in number as possible, e.g., they shouldn't be as fine-grained as individual CSS boxes. Like Core Animation, we call the nodes of this graph --- which is a tree --- layers. (Not to be confused with Netscape 4's layers!)

We've proceeded incrementally. First we defined a fairly simple API with only two kinds of layers: ThebesLayers and ContainerLayers. ThebesLayers are leaves representing one or more display items that will be rendered using Thebes (our wrapper around cairo). A ThebesLayer might be implemented using a buffer in VRAM that those display items are rendered into. ContainerLayers don't draw anything on their own but group together a set of child layers. Any layer can have various effects applied to it --- currently rectangular clipping, fractional opacity, and/or a transform matrix.

Our layers API was carefully designed so that although it admits a retained-mode implementation with off-main-thread compositing, you can also implement it in immediate mode with main-thread compositing. For the first layers landing, we created BasicLayers, an implementation that does just that. Every time we paint, we construct a new layer tree and BasicLayers traverses it to render every layer into the destination context using cairo. This isn't any faster than our old code --- we end up making almost exactly the same cairo calls the old code did --- but it was a good way to get started. BasicLayers will remain useful in the long term because it can render into any cairo context; for example, we can print layer trees. I want to avoid having separate "layers" and "non-layers" rendering paths in our code.
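The structure so far --- ThebesLayer leaves, ContainerLayer interior nodes, per-layer effects, and a BasicLayers-style immediate-mode traversal --- can be sketched in a few lines. This is an illustrative Python model, not Gecko's actual C++ API; only the opacity effect is modelled, with clipping and transforms omitted:

```python
# Toy model of the layer tree: ThebesLayers as leaves holding display items,
# ContainerLayers as interior nodes, each layer carrying an opacity effect.
from dataclasses import dataclass, field

@dataclass
class Layer:
    opacity: float = 1.0                 # fractional opacity effect

@dataclass
class ThebesLayer(Layer):
    items: list = field(default_factory=list)   # display items drawn via cairo

@dataclass
class ContainerLayer(Layer):
    children: list = field(default_factory=list)

def basic_composite(layer, inherited_opacity=1.0, out=None):
    """Immediate-mode compositing a la BasicLayers: one traversal per paint,
    rendering every leaf straight into the destination."""
    if out is None:
        out = []
    opacity = inherited_opacity * layer.opacity
    if isinstance(layer, ThebesLayer):
        for item in layer.items:
            out.append((item, opacity))  # stand-in for a cairo draw call
    else:
        for child in layer.children:
            basic_composite(child, opacity, out)
    return out
```

A ContainerLayer's effects compose with those of its descendants, so a container at opacity 0.5 halves the effective opacity of everything beneath it.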

We construct a layer tree by walking a frame display list and building layers for the display list items. Groups of display list items that don't need their own layers and are consecutive in Z-order are assigned to a single ThebesLayer. A lot of Web pages collapse into a single ThebesLayer.
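That grouping rule can be sketched as a single pass over the display list in Z-order. A hypothetical Python sketch --- item and layer names invented for illustration; the real frame-display-list code is far more involved:

```python
def assign_layers(display_items):
    """Walk display items in Z-order, giving items that demand their own
    layer (e.g. video) a dedicated layer, and coalescing each consecutive
    run of ordinary items into one shared ThebesLayer."""
    layers = []
    run = []
    def flush():
        if run:
            layers.append(("ThebesLayer", list(run)))
            run.clear()
    for item, needs_own_layer in display_items:
        if needs_own_layer:
            flush()                      # close off the current shared run
            layers.append(("OwnLayer", [item]))
        else:
            run.append(item)             # keep accumulating into one layer
    flush()
    return layers
```

A page containing no items that demand their own layer comes out of this pass as a single ThebesLayer, which is exactly the common case noted above.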

One interesting property of this approach is that we only create layers for visible content. This is very important since we need to conserve VRAM and keep layers overhead to a minimum.

The next step towards accelerated video rendering was to rework video to use layers instead of drawing pixmaps with cairo. We created a new layer type, ImageLayer, to render pixel buffers in various formats, in particular the planar YCbCr data emitted by our Theora decoder (and many other decoders). Our video decoder runs on its own thread and we eventually want to be able to play video frames without rendezvousing with the main thread, so we need to update the current frame of an ImageLayer from non-main threads. However, for simplicity we only allow the layer tree itself to be updated by the main thread. We solved this by introducing a thread-safe ImageContainer object. The main thread creates an ImageContainer and an ImageLayer which references the ImageContainer. When it's time to display a new video frame, the decoder thread creates a YCbCrImage and makes it the current image of the ImageContainer. Whenever an ImageLayer is composited into the scene, we use the ImageContainer's current Image. Images are immutable, so they're easy to use safely across threads.

Internally, the BasicLayers implementation of YCbCrImage continues to perform the same CPU-based colorspace conversion we were doing before; that conversion happens on the decoder thread when the Image is set in the ImageContainer. One other change: the transform that letterboxes video frames into the <video> element moved from being a cairo operation to being a transform on the video layer (which BasicLayers implements using cairo!). Once again, there was no real behaviour change, just a refactoring of our internal structures.
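The decoder/compositor hand-off boils down to a small pattern: the decoder thread publishes immutable frames into a lock-protected container, and whoever composites reads whichever frame is current. This is an illustrative Python rendering of that pattern --- the names follow the post, but it is not Gecko's API:

```python
# Sketch of the ImageContainer hand-off: immutable frames published by the
# decoder thread, read by whichever thread is compositing.
import threading
from typing import NamedTuple, Optional

class YCbCrImage(NamedTuple):            # immutable, so safe to share freely
    y: bytes
    cb: bytes
    cr: bytes

class ImageContainer:
    def __init__(self):
        self._lock = threading.Lock()
        self._current: Optional[YCbCrImage] = None

    def set_current_image(self, image: YCbCrImage) -> None:
        # Called from the decoder thread for each newly decoded frame.
        with self._lock:
            self._current = image

    def current_image(self) -> Optional[YCbCrImage]:
        # Called from whichever thread is compositing the ImageLayer.
        with self._lock:
            return self._current
```

Because each frame is immutable, the lock only guards the pointer swap; neither thread ever mutates a frame the other might be reading, which is what makes the main-thread-owns-the-tree, decoder-owns-the-frames split workable.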

In parallel with all this, Bas Schouten was working on a real GPU-based implementation of the layers API. First he tackled D3D10, then GL. The contents of ThebesLayers are rendered by the CPU (or maybe not, if we're using D2D on Windows) and uploaded to D3D or GL buffers in VRAM. They can then be composited together by the GPU. YCbCrImages are uploaded as texture data and converted to RGB by the GPU during composition, for a performance win. The letterboxing transform is also applied by the GPU, for a big performance win.

That GL code has landed on trunk. Right now there are some serious limitations which I've glossed over and will talk about in a future post, so we can't just enable GL for all browser rendering. However, we've created a way for toplevel chrome windows to opt-in to accelerated rendering, and we've applied this to the window created for full-screen video. Right now this only works on Windows, but soon someone will get it working on Mac and X. It's ironic since our long-term GPU-acceleration solution for Windows is D3D, not GL (due to driver issues), but that happens to be Bas' favourite platform :-).

This post is already too long, but before I go I want to mention a few approaches we didn't take. One approach to using the GPU would be to just use a GPU-based backend for cairo, like cairo-D2D and cairo-gl. We like those backends and we do plan to use them, but cairo is fundamentally an immediate-mode API and I explained above why some kind of 2D scene graph API is essential. Another obvious approach would have been to adopt an existing retained-mode, GPU-accelerated scene library such as Clutter. I actually went to Emanuele Bassi's Clutter talk at LCA in January and talked to him afterwards. Clutter is not suitable for our use for two main reasons: its API is totally GL-based and reworking it for D3D or something else would really suck, and Clutter does its compositing on the main thread (the same thread as the GLib event loop), and this would be very hard to change due to existing Clutter API.

Another approach would have been to follow Webkit and abstract over platform scene frameworks --- e.g. using Core Animation on Mac and maybe Clutter on GTK. The problem is those frameworks don't exist where we need them; the only comparable thing on Windows that I know of is WPF's milcore, which is unusable by third party apps (and isn't on WinXP). I just mentioned how Clutter's threading model isn't what we want. Even Core Animation on Mac isn't as capable as we want; for example it can't currently support SVG filters directly. If you own the platform you can perhaps assume these frameworks will evolve to fit your needs; we don't.

There's a lot more to talk about --- cross-process layers, what's currently being worked on, how we're going to overcome current limitations like the fact that we currently rebuild the layer tree on every paint. I'll try to blog again soon.