Tuesday, 15 December 2009
Why in LAX terminal 4 (at least) have the powers that be decided to forbid any standard power outlets and replace them with "charging stations" that are always occupied and that you can't work at?
Up wasn't great. Is Pixar going downhill?
In District 9, why don't the aliens use their super-weapons to rise up against their oppressors? They don't seem to have a problem using them when the plot requires it.
How many Mozillians ordered machines for record-and-replay last week?
Why isn't alligator wrestling offered as an extreme activity?
Has Eric Schmidt lost his mind, or was that just Asa and the EFF?
Why can't men like Bill Clinton and Tiger Woods keep their pants on even when there is so much at stake?
If we make the repainting of scrolled-into-view content asynchronous instead of synchronous, will anybody notice on Windows?
Saturday, 14 November 2009
Today for the first time I seriously used VMWare's record-and-replay feature for debugging. Chris Pearce had set up the mochitest harness to run test_access_controls.html over and over until it failed, to try to catch the "random orange" where the test has been intermittently failing on Tinderbox. He caught a failure after about 30 minutes of continuous test running in a recording VM a couple of days ago. (Each test run takes about 15 seconds, so this represents over a hundred test runs before we reproduced the failure.) Today I got around to debugging it.
Chris had captured an output log and had added some code to nsGlobalWindow::Dump to number the dump() messages, so we could set a conditional breakpoint in nsGlobalWindow::Dump to stop whenever a particular message is about to be printed. A sensible thing to do is to break when we output the last "TEST PASSED" message before the first test failure. It took almost an hour to replay up to that point (replay is a bit slower than recording), so another sensible thing to do is to immediately take a VM snapshot so future debugging can resume from that point relatively quickly, as often as you like. It takes about two minutes to resume replay from a snapshot.
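The numbering trick can be sketched in a few lines of JavaScript. This is an illustration only, not Chris's actual patch (which was in C++, in nsGlobalWindow::Dump itself): wrap the output function so every message carries a sequence number that a conditional breakpoint can match against.

```javascript
// Illustrative sketch: number every dump() message so a log line can be tied
// back to a conditional breakpoint ("stop when dumpCount == N"). In the real
// setup this numbering was done in C++ inside nsGlobalWindow::Dump.
let dumpCount = 0;
const rawDump = [];                 // stand-in for the real output channel
function numberedDump(msg) {
  dumpCount++;                      // breakpoint condition: dumpCount == <target>
  rawDump.push(`[${dumpCount}] ${msg}`);
}

numberedDump("TEST PASSED | test_access_controls.html");
numberedDump("TEST FAILED | test_access_controls.html");
```

With the captured log in hand, you read off the number of the last "TEST PASSED" line before the failure and use it as the breakpoint condition during replay.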
From then on it was fairly similar to a normal debugging session: setting breakpoints, looking at variables, rerunning the program as you work backwards towards the cause. Except that you don't have to worry about the execution being different and the bug not showing up. You know exactly what the log output is going to be, right down to the addresses of objects dumped in the log, so it's easy to set conditional breakpoints to stop on log events you want to look into.
The whole experience is pretty much what I expected: fabulous! It would be nice if the couple of minutes to restart execution from a snapshot could be reduced, and some of the other UI operations feel sluggish and heavyweight, but this was certainly by far the best way to debug what turned out to be a complex and very hard to reproduce failure. It's a good feeling to know that whatever happens, you will be able to go back over this execution again; it takes the fear out of debugging, and replaces it with confidence that you *will* be able to find the bug. My hat's off to E Lewis and the rest of the VMWare team.
We need to do a bit more work to optimize and document our setup, get some more experience with these features and perhaps get some more automation for running tests, catching failures and building that vital initial snapshot. But I'm pretty confident that soon this will be an essential tool for us.
Wednesday, 4 November 2009
We landed support for a form of CSS gradients on trunk a while ago, but we got considerable feedback that our syntax --- which was an incremental improvement of Webkit's syntax, which basically exposes a standard gradient API in the most direct possible way --- sucked. A bunch of people on www-style got talking and Tab Atkins produced a much better proposal. Since we haven't shipped our syntax anywhere yet, dropping it and implementing Tab's syntax instead was a no-brainer. So Zack Weinberg, David Baron and I did that (using a -moz prefix of course), and today it landed on trunk. It should land on the Firefox 3.6 branch shortly. It's unfortunate to land something new like this after the last beta, but in this case, it seems like the right thing to do instead of shipping CSS gradient syntax that we know nobody wants.
This does mean that anyone who's currently using -moz-linear-gradient or -moz-radial-gradient on pages is going to find that their syntax doesn't work anymore. Hopefully that's not too many people yet.
Tuesday, 3 November 2009
One of the greatest errors I see in computer science research is a tendency to view research as a spectrum with "applied" at one end and "basic" at the other, with attached assumptions that "applied" is incremental, boring, and engineering-oriented, but "basic" is crisp, radical, and intellectually satisfying. This is a false dichotomy, and I like to debunk false dichotomies.
I think the most principled way to construct such a spectrum is to classify research projects according to the degree of simplifying assumption they make to reach a crisp problem statement. So "applied" research adopts most of the relevant "real world constraints" into its problem statements, and "basic" research throws most of them out. The error is to assume that "real world constraints" make a problem boring, intellectually unsatisfying, and non-conducive to real breakthroughs. This may be true "on average", but when choosing a research project one is not constrained by averages. In my experience, interesting research topics are abundant, and there are problems whose solutions are immediately applicable to the world while also being intellectually satisfying and potentially breakthrough science. (The current frenzy over combined symbolic and concrete execution is a good example.)
Let me suggest several such problems in the area of software development technology that I feel aren't yet getting the attention they deserve.
Verifying Refactorings
We know that verifying real software against full specifications is prohibitively expensive in most cases. However, I hypothesize that one could cheaply verify most changes that land in mozilla-central. The secret is that most patches are refactorings --- changes that should not alter observable behaviour --- or can be split into refactorings plus a smaller change that actually alters behaviour. "Should not alter observable behaviour" is very simple to state and understand, but still very tight. It would be a huge win if we could prove that most such patches meet that specification. It's unclear how difficult this would be in general, but clearly patches that are the output of refactoring tools should be tractable to verify, and there is a lot of unexplored territory beyond that.
Improved Record And Replay Debugging
I don't have time to work on Chronomancer, but someone should be working out how to expose omniscience to the programmer to optimize the debugging experience.
Testing Fixes For Non-Reproducible Bugs
Record and replay technology is more or less here. (I'll post more about VMWare's solution later, but suffice it to say it's working quite well so far!) Suppose you have a recording of a bug which isn't reproducible. Now you can replay that recording one or more times, understand the bug, and write a patch that should fix it. The problem you immediately run into is: how do you check that the patch really fixes the bug? You somehow need to inject the fix into the code and replay; that's tricky, and the fix may disturb the replay and cause the bug not to occur on that run for reasons unrelated to the bug actually being fixed.
Performance Measurement And Variation
Measuring performance is hard. One huge problem is that you want to measure effects that are much smaller than the random noise in the tests. I don't really know where this noise comes from. I would love to see research that analyzes the sources of run-to-run performance variation and describes techniques for reducing the variation.
I believe all these problems are reasonably easy to get into, have been little investigated in the past, have lots of room for exploration by multiple groups, are intellectually stimulating, and are likely amenable to solutions that could be immediately applied to the needs of software developers.
The converse error --- assuming that "basic" research cannot be boring or incremental --- is also widely held, but more easily falsified. Consider the endless stream of esoteric type theories that are both boring and completely incremental, such as my POPL '99 paper!
Saturday, 17 October 2009
Yesterday I checked in a patch that removes support for the 'load' event on <video> and <audio> elements. We simply never fire it. Also, the networkState attribute is now never NETWORK_LOADED. When we've read to the end of the media resource, networkState changes to NETWORK_IDLE. We plan to ship this change for Firefox 3.6.
There are several reasons we're doing this, even though it will break some Web content. One reason is that the spec says that 'load' should only fire when/if we guarantee that we will never again hit the network for data for that media resource. We never guarantee this, because our cache may always discard data. For example if you play through a large media file that doesn't fit in the cache, we'll read from the network again if you seek back to the beginning. If you play through a resource that does fit in the cache, and then play another large resource we might evict blocks from the first resource to make room for the second resource. So we never followed the spec here, and if we do follow the spec then we can never fire the 'load' event.
Another reason is that 'load' is a dangerous and fairly useless event. Web authors expect that every media element will eventually receive a 'load' event, because documents and images do, right? But that expectation is in vain: in many situations a 'load' event will never be fired. Furthermore, authors cannot predict those situations, because they depend on client cache size and policy. It is quite likely that the author will always get 'load' events during testing, but some clients won't get a 'load' event and the site will be broken for those clients. So since the 'load' event won't ever fire in some browsers, and in other browsers will fail to fire in unpredictable situations, authors should never use it. In practice we see authors inappropriately using 'load' when they should be using another event like 'canplaythrough' or 'suspend'.
In fact, this argument is persuasive enough that 'load' for media elements and the NETWORK_LOADED state have been removed from the spec. There was consensus from Mozilla, Google, Apple and Opera developers that this is the right thing to do. I expect that other browsers will be following suit soon; in fact, apparently Chrome has never fired a 'load' event on media elements.
If you have been using 'load' on your <video> or <audio> elements, you should use a different event. There are several to choose from depending on what you actually want:
- If you want to do something when the video metadata is available (e.g. duration or size), use 'loadedmetadata'.
- If you want to do something when the first frame of video can be displayed, use 'loadeddata'.
- If you want to do something when the browser has stopped downloading the resource, use 'suspend'. Note however that the browser might stop before reaching the end of the resource, e.g. if its cache fills up, or the element doesn't have 'autobuffer' set.
- If you want to do something when there's enough data loaded to start playing, use 'canplay'.
- If you want to do something when there's enough data loaded to start playing and playback should be able to make it to the end without stalling on the network, use 'canplaythrough'. Note however that 'canplaythrough' might not fire because the browser's cache fills up, in which case you'll get a 'suspend'.
- If you want to do something when the video has finished playing, use 'ended'.
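A 'load' handler might be replaced along these lines. This is a sketch: the handler bodies are illustrative choices, not anything prescribed by the spec, and the function takes the element and a logger so it works on any HTMLMediaElement.

```javascript
// Sketch: wire up the replacement events on an HTMLMediaElement.
function installMediaHandlers(video, log) {
  // Metadata (duration, dimensions) is now available:
  video.addEventListener("loadedmetadata", () =>
    log(`duration ${video.duration}s, ${video.videoWidth}x${video.videoHeight}`));
  // Enough data is buffered to play to the end without stalling:
  video.addEventListener("canplaythrough", () => video.play());
  // The download stopped; this may fire instead of 'canplaythrough'
  // if the browser's cache fills up:
  video.addEventListener("suspend", () => log("download suspended"));
  video.addEventListener("ended", () => log("playback finished"));
}
```

In a page you would call something like installMediaHandlers(document.querySelector("video"), console.log).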
Monday, 5 October 2009
We had a great time visiting relatives in Melbourne. During the week we took a three-day road trip along the Great Ocean Road southwest to the Twelve Apostles rock formations, then north to Hall's Gap in the Grampian mountains before returning to Melbourne. We did a lot of short walks --- at Anglesea, Marriner's Lookout, Maits Rest, the Cape Otway lighthouse, the Gibson Steps, the Loch Ard gorge, Chautauqua Peak, and McKenzie Falls. Along the way we saw wild koalas, kangaroos and wallabies. The Great Ocean Road reminded me of Big Sur. The stories of shipwrecks, settlers, hardship and survival were inspiring. A great trip.
One tip: a brochure claimed the return trip to the base of McKenzie Falls takes one hour and twenty minutes, so I almost skipped it, but we did it in twenty-five minutes, with two five-year-old kids. It's well worth it!
Saturday, 26 September 2009
Friday, 25 September 2009
Lately I've been thinking about Luke 12, which says in part:
Then Jesus said to his disciples: "Therefore I tell you, do not worry about your life, what you will eat; or about your body, what you will wear. Life is more than food, and the body more than clothes. Consider the ravens: They do not sow or reap, they have no storeroom or barn; yet God feeds them. And how much more valuable you are than birds! Who of you by worrying can add a single hour to his life? Since you cannot do this very little thing, why do you worry about the rest?
"Consider how the lilies grow. They do not labor or spin. Yet I tell you, not even Solomon in all his splendor was dressed like one of these. If that is how God clothes the grass of the field, which is here today, and tomorrow is thrown into the fire, how much more will he clothe you, O you of little faith! And do not set your heart on what you will eat or drink; do not worry about it. For the pagan world runs after all such things, and your Father knows that you need them. But seek his kingdom, and these things will be given to you as well.
Does this advise us not to think, plan or work hard? Surely not, since so many other parts of the Bible (including words of Jesus himself) tell us to do those things. So I suppose the important thing here is primarily to get our priorities straight, and then to work hard but not worry for a moment about the results.
This is difficult for me because there are many things I care deeply about, including my work, my family, and my friends, so I worry about them. I don't think it's wrong to care about these things; Jesus certainly cared deeply about many people. Apathy is an easy alternative to worry but I don't think it's what God wants. This is a very practical issue for me because at times over the last year or two I've not been sleeping as well as usual because of work-related stress.
I think what God intends is for me to care deeply but trust him even more deeply, for me to do all I can and leave all results entirely in his hands, to be always joyful knowing that he is the ultimate benefactor.
Do not be anxious about anything, but in everything, by prayer and petition, with thanksgiving, present your requests to God. And the peace of God, which transcends all understanding, will guard your hearts and your minds in Christ Jesus.
This is hard for me to do; it's a constant spiritual exercise, but by grace I've had some success lately. I'll need to keep at it.
Wednesday, 23 September 2009
Tuesday, 22 September 2009
v1.mozLoadFrom(v2) behaves much like the load method of HTMLMediaElement. It stops playing the currently loaded resource and loads a new resource. However, it does not run the HTML5 media selection algorithm; instead it just takes currentSrc and loads that. What makes mozLoadFrom useful is that it tries to avoid re-downloading the resource data. It grabs all the cached and buffered data already associated with v2 and makes it available to v1 as well. In fact these elements carry on sharing data, so that any future data downloaded by v2 is available to the other. Typically this means that only one of the elements will actually be downloading data and the other one won't need to download anything at all. (Although if the entire resource doesn't fit in the media cache, you can end up in situations where the elements are playing different points in the stream and both reading from different offsets.)
mozLoadFrom is useful if you need to manipulate multiple views of the same video/audio stream. For example you might be playing a video and at the same time want a script to seek forwards or backwards and gather periodic snapshots into a series of <canvas>es. Or you might have a set of videos playing in the page at a smallish size and want to display a selected one in the front covering most of the window, without having to mess with the original thumbnail video. The main reason I implemented mozLoadFrom is to make it easier to display full-screen video in Firefox or extensions.
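The large-view case can be sketched like this. Purely illustrative: mozLoadFrom is Firefox-specific, and syncing currentTime is my addition, not part of the API.

```javascript
// Sketch: display a large view of an already-playing thumbnail video without
// disturbing the thumbnail. mozLoadFrom shares the thumbnail's cached and
// buffered data, so the big element typically needs no extra downloading.
function showLargeView(thumb, big) {
  big.mozLoadFrom(thumb);               // share cached/buffered data with thumb
  big.currentTime = thumb.currentTime;  // my addition: jump to the same point
  big.play();
}
```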
The tricky part of the implementation is that the media cache was originally designed so that a block in the cache could only belong to one decoder at a time. With mozLoadFrom a block can be owned by multiple decoders at a time. In the original design, a block is always in exactly one of four states: free, readahead (it's ahead of the current playback point), played (it's behind the current playback point), and metadata. Now a block can be a readahead block for one decoder and a played block for another decoder.
I was quite worried it would be hard to extend the design this way, but in fact I was able to do it in about a day of intense hacking. This probably reflects well on the original design of the media cache. In particular, basing eviction decisions on the "predicted time of next use" was easy to extend; when a block has multiple owners, its predicted time of next use is the earliest time that any of its owners predicts for it.
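The extended eviction rule can be sketched as follows. The names here are mine, not Gecko's (the real media cache is C++): a shared block's predicted time of next use is the earliest prediction among its owners, and the block predicted to be needed furthest in the future is the best eviction candidate.

```javascript
// Sketch of the multi-owner eviction rule described above. Each owner supplies
// its own prediction; a shared block takes the earliest (minimum) one.
function predictedNextUse(block) {
  return Math.min(...block.owners.map(owner => owner.predictNextUse(block)));
}
// Evict the block whose predicted next use is furthest away.
function pickBlockToEvict(blocks) {
  return blocks.reduce((worst, b) =>
    predictedNextUse(b) > predictedNextUse(worst) ? b : worst);
}
```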
Update Ian Hickson points out that a new API is not needed here, since HTML5 allows concurrent loads of the same absolute URI to share data regardless of HTTP caching instructions. So instead of mozLoadFrom we could just use v1.src = v2.currentSrc to get the same effect. I should do that...
Sunday, 20 September 2009
On Thursday night the Auckland Web Meetup was held at the Media Design School. The topic was "HTML5" and the presenters were Microsoft's "IE8 Technical Evangelist" Giorgio Sardo and me and Chris Double from Mozilla. There were about 200 people in the audience, mostly Web developers.
Giorgio presented first and gave a broad "IE8 for Web developers" presentation. He showed a funny (but somewhat pointless) promo video for IE8, went through all the Web-facing platform improvements in IE8 (with demos and showing code in Visual Studio), and demoed their Firebug clone. His basic message was "we care about standards (including HTML5), we're doing it, our release cycles are slow because we have the most users, but we will get there." What I thought was very interesting was that he made no attempt to distance Microsoft from HTML5, or even to say that they'd be selective about which HTML5 features made sense to implement. His message tacitly assumed that HTML5 is simply something they will do. He portrayed Microsoft as part of the community contributing to the development of HTML5, which isn't really true, although at least Adrian Bateman has sent a few emails to public-html recently. I was really curious about what Microsoft says about HTML5 to Web developers, so I was glad to have the chance to see this presentation.
One minor thing I should mention: Giorgio demoed a new Microsoft Expression tool that lets you compare the rendering of your Web page in different browsers. It's pretty cool; for example, you can hover over an element in one browser view and it will highlight the same element in the other browser's view. Currently it supports Firefox and IE6/7/8. However, it only supports static rendering: you can't interact with the page, and script doesn't run. Also, for some reason it doesn't actually use the IE6 and IE7 engines to render the views! Instead it uses IE8's IE7 mode to render "IE7", and --- somewhat bizarrely --- they *re-implemented* an "IE6-alike" layout engine to approximate what IE6 does. Those seem like strange technical decisions that will limit the usefulness of the tool, since Web developers will probably still need to test in real IE6 and IE7 builds.
My presentation was a short history of WHATWG and HTML5, a discussion of the HTML5 goals and why they're controversial in some quarters, a discussion of how HTML5 will affect Web developers, and a walk-through of major HTML5 features (including related features such as Workers and Sockets) that we have in Firefox or will soon have. I mentioned the database controversy to try to communicate that that area is not settled. I showed a bunch of demos which were very well received. Chris Double talked about <video> and showed some demos (mostly Paul Rouget's) which were also very well received. I've uploaded my slides. Here are the demos (they should all work in latest Firefox trunk builds):
- Michael Ventnor's accelerometer-enhanced "Mario Kart" based on Hans Schmucker's version using SVG filters
- Geolocation in Google Maps (OK, this one and the previous one aren't actually HTML5, but I wanted to make the point that HTML5 isn't the only channel bringing Web platform improvements.)
- Austin Chau's postMessage demo
- Leslie Orchard's drag-and-drop demo
- A demo I wrote of HTML5 parsing SVG and MathML without XML (need to set "html5.enabled" in about:config to true)
- 2D <canvas> demo "BlobSallad"
- A couple of WebGL (i.e. 3D <canvas>) demos that I'm not sure I can redistribute. But they're awesome!
- Paul Rouget's "Green Screen" <video> demo
- Paul Rouget's movement tracker <video> demo
- Paul Rouget's content injection <video> demo
At the end there was a panel discussion. We spent a lot of the time on the video codec issue. Chris did a great job of explaining that people shouldn't have to pay license fees to serve or consume video on the Web --- he actually got widespread spontaneous applause. The Web developer audience may not start using Theora tomorrow but they understand and sympathize with why we're doing it.
Overall I really enjoyed the evening and I think it was fairly successful at explaining and promoting HTML5. I'm encouraged that Microsoft seems to be endorsing it, and I hope that we see a massive implementation effort in IE9.
Tuesday, 8 September 2009
A lot of people have the idea that a successful "knowledge economy" requires much cheaper and faster broadband than what we have available in NZ today. I strongly disagree. If that were so, then Japan, South Korea and China would be dominating the United States in software development and Internet services, but they are not --- or at least their strength is in volume, not innovation.
This Herald blogger suggests that if only we had faster, cheaper broadband, users would be figuring out amazing things to do with it. I don't understand this argument at all. Users can't do anything with broadband except use services provided by software developers. NZ's network is adequate to build and deploy any service developers can think of, and if you want to reach a mass market you can trivially host your services in the USA or elsewhere. (Chris Double in our office runs TinyVid, hosted in the USA, as a hobby, for goodness sake.)
Major research universities in the USA often presented similar arguments to large funding agencies: give us a lot of money to build amazing infrastructure (networks, computers, etc) and cool new ideas will be spontaneously generated. I don't think that was ever really true; the great ideas seem to come from getting smart people together and giving them freedom to work on their interests with adequate supporting infrastructure to experiment on; I can't think of any examples of great ideas that were inspired by a superabundance of infrastructure. Sure, if you had a megabit network and no-one else did, or Unix workstations when everyone else had DOS PCs, you had an edge, but by the mid-90s when I entered grad school at CMU commodity PCs and networks were powerful enough to support cutting-edge research in most areas of computer science. That didn't stop the universities asking for infrastructure money though!
Perhaps I could argue that making it easier to consume software and services is actually a long-term drag on the "knowledge economy". I got into programming because my first computer had almost no software on it and no way to get any more --- so all I could do with it was write my own. How many bright kids are hooked on Facebook, Youtube and game consoles when they could be coding their own dreams into reality?
Based on my experiences I certainly have gripes about problems dragging on NZ's "knowledge economy", but broadband isn't one of them.
Sunday, 23 August 2009
I'm back in Auckland for one night before I head off again to California. The Australia trip was great. We roamed around Sydney visiting friends and relatives, but had a couple of days in the Blue Mountains on our own. The weather was great throughout. I achieved complete Internet independence for a week, no withdrawal symptoms.
Like most of our trips there was lots of walking and lots of eating. The Blue Mountains were excellent --- great views, and really interesting geology and habitats, especially the way water trickles down the cliffs forming amazing "hanging swamps". We loved the Wentworth Falls area, especially the National Pass walk that descends from the top of the escarpment down through wet gullies and then along a shale ledge halfway up great Triassic sandstone cliffs, then up hundreds of steep steps beside the waterfall. The track's graded "hard", probably due to the staircases, and the nominal time is 3 to 4 hours, but we did it in 2:45 including lots of lookout stops ... not bad for my 4 year old son! We'd already walked 2.5 hours down the Giant Staircase and along Dardanelle Pass in Katoomba that morning. The following day we walked to the bottom of Govett's Leap falls and back up, also very nice, but a bit trickier due to wet rock surfaces, especially going down.
If I'd realized before booking how much of the appeal of the Blue Mountains rests on peering over precipices and descending impossibly steep stairs, I might have had second thoughts on account of my fear of heights. Since I had a great time anyway, I'm glad I didn't know!
Just before these walks I'd been reading The Watchmen (fabulous). I was struck by the contrast between that bleak vision of fallen humanity and the glorious beauty of God's creation. It's good to look away from man for a while.
- JJ Abrams' Star Trek: pretty good
- X-Men Origins: Wolverine: better than expected, which isn't saying much
- Shaolin Temple (1982): Jet Li's first major role. Otherwise not very interesting except that there are some astounding apparently real action sequences, where for example a man does handsprings without hands, just flipping over onto his head and then back up again.
- Roman Holiday: Even better than I remembered, definitely a must-see
- Anna and the King: How can you put two of my favourite actors, Jodie Foster and Chow Yun-Fat, in starring roles and make such a terrible movie? I don't know, but it's not worth watching to try to find out. Upsetting.
Friday, 14 August 2009
From mid-tomorrow until August 22 I'm going to be on holiday in Australia. I'm not bringing my laptop --- I'll be completely offline --- so don't expect me to reply to anything.
From August 23 to 28 I'll be in California for Mozilla meetings. I'll be online during much of that time.
Monday, 10 August 2009
Seems like a few people got close to the answer for the first problem. You formulate the condition for q being on the circle as a quadratic equation in t, and then apply the quadratic formula to solve for t. However, I can't give full credit to anyone because all the answers were incomplete; many ended with a "plus or minus the square root ..." but a complete answer must say which root to pick. It's pretty easy to figure that out, though.
My answer is
t = (ax + by - √(r²(x² + y²) - (ay - bx)²))/(a² + b² - r²)
where p = (a,b) and q = (x,y). The answer from Wolfram Alpha via Neil Rashbrooke is the same except it chooses the positive root; I think that's wrong, because it will give you a negative t (the denominator is negative).
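The closed form transcribes directly into code. This is just the formula above, with the root choice baked in; as a numerical sanity check, with this t the scaled point q/t lies at distance r from p, which is consistent with the quadratic the formula solves.

```javascript
// Direct transcription of the closed form above: p = (a, b), q = (x, y), radius r.
// This picks the negative root; for the intended inputs a² + b² - r² < 0,
// so this choice is the one that yields a positive t.
function solveT(a, b, x, y, r) {
  const disc = r * r * (x * x + y * y) - (a * y - b * x) ** 2;
  return (a * x + b * y - Math.sqrt(disc)) / (a * a + b * b - r * r);
}
```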
The second problem is harder. Even though it's intuitively "obvious" that a solution is always possible, I found it quite difficult to prove, and apparently so did my readers. I think Joshua Cranmer got close but I don't fully understand his answer and he didn't get a complete algorithm for finding the order.
I start by observing that if you can find one rectangle to move safely then the solution to the whole problem follows by induction, since one moved rectangle can never overlap other moved rectangles, since they didn't overlap to start with. So the problem is reduced to finding one rectangle to move safely from the set of rectangles.
I also observe that without loss of generality you can reduce the problem to the case where the components of v are non-negative, i.e., we are moving the rectangles down and to the right by some amount, possibly zero.
Let's say rectangles R and S overlap horizontally if their projections onto the x-axis intersect. Similarly, they overlap vertically if their projections onto the y-axis intersect. Because our rectangles are non-overlapping, if R and S are different and overlap horizontally they cannot overlap vertically, and either R is entirely above S (every point in R is above (or on) the top edge of S) or R is entirely below S (every point in R is below (or on) the bottom edge of S).
Consider the subset of rectangles R such that R has no rectangle entirely below it that it overlaps horizontally. These rectangles do not overlap each other horizontally. Choose the "rightmost" rectangle from this set (the one whose right edge is furthest to the right); let's call it M. I claim no other rectangle intersects M + v. This solves the problem.
The proof is still subtle. We have to show that our subset is non-empty, i.e., that at least one rectangle has no rectangles below it that it overlaps horizontally. This is easy though, since the rectangle with the lowest bottom edge certainly has no rectangles below it. So such an M exists.
Of course the hard part is to show that no other rectangle intersects M + v. We can show that in fact no rectangles are below M's top edge and to the right of M's left edge (other than M itself). Let's call this set S. I'll show S is empty, by assuming S is nonempty and working to a contradiction.
Clearly there can be no rectangles below M's top edge that overlap M horizontally, or they would have to be below M and we wouldn't have put M in that subset. So every rectangle in S must be entirely to the right of M. Now choose from S a rectangle with the lowest bottom edge among the rectangles in S; let's call it P. I will show that P has no rectangles entirely below it that it overlaps horizontally, so it should have been part of the initial subset, which contradicts our choice of M as the rightmost rectangle in that subset, thus showing that no such P exists and S is empty.
Clearly no rectangle in S is below P and overlaps P horizontally. What about the rectangles not in S? Suppose a rectangle Q is not in S, is entirely below P and overlaps P horizontally. The left edge of Q must be at or to the left of the left edge of M, otherwise Q would be in S. The right edge of Q must be to the right of the right edge of M, since Q overlaps P horizontally. So Q also overlaps M horizontally. The bottom edge of Q is below the bottom edge of P which is below the top edge of M, so Q must be entirely below M --- but this is impossible, since we chose M to have no rectangle overlapping horizontally below it.
Phew! That's not easy for something so "obvious".
This solution should make for a reasonably efficient algorithm.
1. Sort the rectangles by right edge.
2. Set R to the rightmost rectangle.
3. If R has any rectangles overlapping horizontally below it, set R to the one of those rectangles with the rightmost right edge and go to step 3.
4. Move R.
5. Remove R from our list.
6. If there are more rectangles in the list, go to step 2.
Update: Actually I'm pretty sure you can do a direct proof of correctness of the above algorithm. At the beginning of step 3 you have the invariant that there are no rectangles to the right of the right edge of R and below the top edge of R. Using that invariant, I think you can construct a much simpler proof than the one I gave here.
Also, as at least one commenter observed, you can improve this algorithm by not throwing away all your work every time you go to step 2. That adds complexity though...
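The algorithm can be sketched in code. This is just an illustrative JavaScript sketch: the rectangle field names, and the reduction to a vector v pointing straight down, are my assumptions (the general case follows by flipping or rotating axes).

```javascript
// A sketch of the algorithm above. Rectangles are {left, right, top, bottom}
// with y increasing downward, and v is assumed to point straight down.
// Returns the rectangles in an order in which each can be safely moved.
function safeMoveOrder(rects) {
  const remaining = rects.slice().sort((a, b) => a.right - b.right); // step 1
  const overlapsHorizontally = (a, b) => a.left < b.right && b.left < a.right;
  const order = [];
  while (remaining.length > 0) {
    let r = remaining[remaining.length - 1]; // step 2: rightmost right edge
    for (;;) { // step 3: walk down while something below r overlaps it
      const below = remaining.filter(
        s => s !== r && s.top >= r.bottom && overlapsHorizontally(s, r));
      if (below.length === 0) break;
      r = below.reduce((best, s) => (s.right > best.right ? s : best));
    }
    order.push(r); // step 4: r can be moved now...
    remaining.splice(remaining.indexOf(r), 1); // step 5: ...and dropped
  }
  return order;
}
```

Restarting from the rightmost rectangle on every iteration is the naive form; as the commenter observed, carrying over the previous iteration's work would speed this up.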
Update: Fixed a typo in the answer to the first problem, pointed out by Boris.
On Saturday our family went out to the west coast with Mozilla intern Markus Stange and another friend, to enjoy a fine winter's day, get some exercise, and show off some of the spectacular scenery of the Waitakere coast.
We started at the end of Te Ahuahu Road just south of Piha. The Mercer Bay Track started with instant gratification of views over Piha and the Tasman Sea and then took us south down the hill to a lookout over towering cliffs, where we had lunch. We carried on south to the Ahu Ahu Track and then Comans Track, which winds south and downhill to Karekare, with great views north to Mercer Bay and south over Karekare Beach to Whatipu along the way. The tracks were muddy in places but otherwise in good shape.
At Karekare we walked on the beach for a while and checked out the rocks and pools at the north end of the beach. Markus distinguished himself with some inspiring rock climbing. There were some good waves rolling in, and an easterly wind was blowing the tops off the waves --- spectacular.
My wife had brought the car down, so we drove back to Piha and headed to the end of Glen Esk Road for the track to Kitekite Falls. It's an easy gravel track beside the Kitekite Stream, very relaxing. The return leg has a couple of easy stream crossings. The Falls always look great.
Then we headed back to Piha beach and wandered to the southern end. Markus and I took a quick walk further south along the Tasman Lookout track and found a way down the hill to the beach behind the small Taitomo Island, which is connected to the mainland by sand which is only covered at high tide. The way down the rock face wasn't easy to find though! We stood near "The Gap" in the island watching waves crash through the narrow passage between two parts of the island; we were surprised by a particularly big surge which flooded the area all around us, leaving us on an isolated dry spot! No wonder so many rock fishing people drown. We got away dry and walked back to Piha beach around the rocks.
After all that we had one more walk: up Lion Rock in the middle of the beach. It's steep, but easy, and a great view. Too bad you can only go halfway up these days due to instability at the summit. The sun set behind clouds over the ocean.
Altogether we were in the area from about 11:30am to 5:30pm --- a superb day. (I solved homework problem #2 somewhere along the way, too!) My kids were still running around at the end, so I'm going to have to upgrade the difficulty of these trips!
We finished the day by driving back to the city and having dinner at Sunshine. As far as I know it's still the best Chinese restaurant in Auckland, as good as I've had anywhere in Australasia and North America. And though it's not the cheapest place in town, it's not really expensive either; last night's dinner was $84 for the set menu for six.
All in all I have to thank God for letting me live here :-).
Saturday, 8 August 2009
Two interesting problems I've had to solve this week...
Given a point p and a radius r, consider the family of circles Ct, where t is a positive real number, with centres tp and radii tr. Given a point q, find the smallest value of t such that the circle Ct contains q. You may assume |p| < r. This came up while I was optimizing repeating radial gradients for Mac.
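For what it's worth, here is one way to attack this (a sketch in JavaScript; the algebra and all the names are mine, not necessarily the optimization that shipped). Ct contains q exactly when |q - tp| <= tr; squaring gives a quadratic inequality in t, and since |p| < r the t² coefficient is negative, so the smallest admissible t is the quadratic's larger root.

```javascript
// A sketch of a solution. Ct contains q iff |q - t p| <= t r. Squaring:
//   (|p|^2 - r^2) t^2 - 2 (p.q) t + |q|^2 <= 0
// Since |p| < r the t^2 coefficient is negative, so the inequality holds at
// and beyond the larger root, which is therefore the smallest such t.
function smallestT(p, r, q) {
  const dot = p.x * q.x + p.y * q.y;
  const p2 = p.x * p.x + p.y * p.y;
  const q2 = q.x * q.x + q.y * q.y;
  const denom = r * r - p2; // positive because |p| < r
  const disc = dot * dot + denom * q2; // quarter-discriminant; never negative here
  return (Math.sqrt(disc) - dot) / denom;
}
```

Sanity check: with p = 0 the circles are concentric at the origin, and the formula collapses to t = |q|/r, as it should.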
Given a set of non-overlapping axis-aligned rectangles in the plane, and a 2D vector v, order the rectangles in a sequence so that translating the rectangles one by one by v never produces overlapping rectangles in any intermediate state. Does such an order always exist? This came up while I was working on optimizing scrolling.
Tuesday, 28 July 2009
The transition to robotic warfare scares me a lot. Soon, war will *only* kill civilians. Making war cheap for the aggressor can't be good. Most troublesome of all is the obvious problem that I rarely see discussed: computers, software and people being what they are, it's inevitable that drone command and control will be compromised by enemy agents. It will be very exciting when the US army turns around and flies back to flatten Washington, and no-one has any idea who did it. People think drones accentuate US military superiority, but it's not true; they neutralize it. This is going to be a tragic example of common sense being trumped by overconfidence and dazzling technology.
The slogan products aren't for everyone, but there's definitely a place in our society for provocative humour that pushes the boundaries. Translation: if we think it's funny and it shocks people, then that in itself is a good thing --- no matter who gets hurt.
I've seen lots of T-shirts like that overseas that are a lot more out there. Translation: Because this isn't the worst possible example, it's OK.
You hear the same sort of irrational nonsense used to defend absurdly violent video games and movies, and anything else that degrades human beings for profit.
Wednesday, 15 July 2009
I hate to link and run, but I have to share this with my friends who aren't on Mozilla internal mailing lists: State Department workers beg Hillary Clinton for Firefox.
"It was approved for the entire intelligence community, so I don’t understand why State can’t use it."
I need a T-shirt that says "Approved for the intelligence community".
Tuesday, 14 July 2009
About ten years ago I happened to have lunch with Eric Schmidt. The CMU board of trustees was visiting the School of Computer Science and they had lunch with some of the CS grad students. Our table was just Mr Schmidt, me, and a couple of other students. At the time he was CEO of Novell. I was feeling cantankerous and took the opportunity to ask Mr Schmidt how he managed when the imperative to maximise shareholder returns required unethical (but legal) behaviour. His answer was that this never occurs. He suggested that ethical behaviour always, in the long run, maximises returns.
I thought (and still think) this was grossly wishful thinking. It would require extraordinary luck or divine providence, and even though I'm a theist, I think it's clear that God has not always arranged for profit and ethics to be perfectly aligned (end sarcasm). A company truly focused on shareholder returns will sometimes, probably often, be obliged to take advantage of unethical opportunities. Fortunately, Mr Schmidt's wishful thinking seems very common. People want to do the right thing, and convince themselves and each other that it's going to be best for the company too. They cheat the shareholders and do the rest of humanity a favour. Bravo!
I saw a lot more of this at IBM Research and (from outside) other labs --- publishing corporate research labs depend on corporate vanity mixed with altruism, and a desperate wish to believe against all evidence that they're an acceptable return on investment. It's a relief to work for a non-profit where we can be honest about such things.
Of course back at that lunch at CMU I had no idea that Mr Schmidt would go on to become CEO of the world's newest corporate superpower. Keep the faith, Mr Schmidt, and may the scales never fall from your eyes!
Saturday, 4 July 2009
There's a lot of press these days about the HTML5 <video> tag and the struggle for universal unencumbered video and audio codecs --- much of it associated with the Firefox 3.5 launch. I wonder how many people know that the Firefox video implementation is almost entirely due to just a few people in the Mozilla office in Newmarket --- Chris Double, Matthew Gregan, Chris Pearce, and to a lesser extent, me. (Justin Dolske did the controls UI, but I'm not sure where he lives!) I'm proud that we managed to get considerably more done for the 3.5 release than I expected.
It's a great privilege to have the opportunity to really shake up the world for the better, with a very small team, in a relatively small amount of time, and do it right here in Auckland. I'm very thankful.
Thursday, 2 July 2009
I've submitted all of my compositor phase 1 patches for review. The patches are also published here. 111 files changed, 2573 insertions, 1448 deletions, divided into 39 separate patches over multiple Bugzilla bugs. In theory every one of those steps should build and pass tests, although I haven't actually verified that for all the patches. I managed to break things up pretty well --- the largest patch is only 536 insertions and 75 deletions --- so hopefully that will make reviewing easier.
MQ has been working pretty well for me. I get into a routine of applying all the patches, doing some testing, fixing a number of bugs, and then redistributing the changes across the patches that they logically belong to. I'm not 100% sure this is the most efficient way to work --- sometimes I burn quite a bit of time putting all the changes in just the right places --- but at least it's now possible.
Now it's time to start working on something else. My immediate next task is to restructure the media tests so we can generalize tests across file types and backends; for example, right now we have one set of seeking tests for Ogg and another for Wave, but we should just have a single set of tests parametrized by test files of different types.
After that, I plan to do some cleanup that's enabled by compositor phase 1. In particular, we can move scrolling out of the view system and integrate it all directly into the scrollframes in layout.
After that I plan to work on compositor phase 2. Right now in Gecko, whenever something needs to be repainted we make platform-level invalidation requests, the platform dispatches paint events, and we paint. This often leads to over-frequent painting. For example if there's a script changing the DOM 100 times a second, we'll try to paint 100 times a second if we can keep up, which is a waste of time since most screens only refresh at 60Hz or so. Even worse, if you have that script, and an animated image painting 20 times a second, and a video playing at 25 frames per second, we invalidate them all independently and if your computer is fast enough we'll paint 145 times a second. We need to fix this, and I have a plan.
Part of that plan is to create an internal animation API that various Gecko components (animated images, video, smooth scrolling, SMIL, CSS Transitions, etc) can plug into. But we also recognize that declarative animations will never be expressive enough for all use cases, and there's a lot of existing scripted animation libraries out there, so I have an idea for tying in scripted animations as well. Basically, I would expose the following API:
- window.mozRequestAnimationFrame(): Signals that an animation is in progress, and requests that the browser schedule a repaint of the window for the next animation frame, if the window is visible.
- The browser will fire a mozBeforePaint event at the window before we repaint it. Animation libraries should register an event handler that checks the current time, and updates the DOM/CSS state to show that point in the animation. If the animation has not ended, the event handler should call window.mozRequestAnimationFrame() again to ensure another frame will be scheduled in a timely manner.
That's it! That API gives the browser control over the frame rate, while allowing JS to do anything it wants in each frame.
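Under that proposal, a script animation library would look something like the following sketch. The mozRequestAnimationFrame/mozBeforePaint names are the proposed API; the tiny window stub and fake 60Hz clock are mine, just so the pattern can run outside a browser.

```javascript
// Minimal stub standing in for the browser, so the pattern below is runnable;
// in a real page, Gecko would supply mozRequestAnimationFrame / mozBeforePaint.
const listeners = [];
const window = {
  pending: false,
  addEventListener(type, fn) { if (type === "mozBeforePaint") listeners.push(fn); },
  mozRequestAnimationFrame() { this.pending = true; },
};

// The animation library's side: on each mozBeforePaint, compute state for the
// current time, then re-request a frame if the animation hasn't finished.
let opacity = 0;
const start = 0, duration = 100; // milliseconds on the fake clock below
function onBeforePaint(now) {
  const t = Math.min((now - start) / duration, 1);
  opacity = t; // a real library would update DOM/CSS state here
  if (t < 1) window.mozRequestAnimationFrame();
}
window.addEventListener("mozBeforePaint", onBeforePaint);
window.mozRequestAnimationFrame();

// The browser's side: fire mozBeforePaint before each repaint, at whatever
// frame rate it chooses. Simulated here as a 60Hz pump.
for (let now = 0; window.pending; now += 16) {
  window.pending = false;
  listeners.forEach(fn => fn(now));
}
```

Note the key property: the library never picks a timer interval; the browser decides when frames happen, and the script just renders the state for whatever timestamp it's handed.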
Thursday, 25 June 2009
Recently I've been working removing the use of native widgets (Windows HWNDs, Cocoa NSViews, GdkWindows) in various places in Gecko. Native widgets get in the way of painting and event handling, and break things due to various platform limitations.
The key difficulty of getting rid of native widgets is ensuring we still handle windowed-mode plugins adequately. For example, currently if a windowed-mode plugin is contained in an overflow:auto element we rely on a native widget associated with the element to scroll and clip the plugin. In situations where we should clip but there is no native widget present, like 'clip' on an absolutely positioned element, we actually don't currently clip the plugin, which is pretty bad.
My new strategy for clipping plugins is to explicitly compute a clip region for each plugin and use native platform APIs to clip the plugin's widget to that region. On Windows we can use SetWindowRgn, on X we can use the XShape extension via gdk_window_shape_combine_region. On Mac we don't have to do either of these things, because plugins work a bit differently there. One issue of course is figuring out what the clip region should be. For that, I'm reusing our display list machinery. We build a display list for the region of the page that contains plugins, and then traverse the display list to locate each visible windowed-mode plugin --- its position, and the clipping that applies to it. Plugins that aren't in the display list aren't visible, so we hide them. This works really well, and means that everywhere we clip Web content, plugins are automatically affected as well.
A nice bonus is that display lists contain information about which elements are opaque. We already have the ability to determine that part of an element is covered by opaque content and thus doesn't need to be drawn. We can leverage this for plugins too: areas of a plugin we know are covered by opaque content (e.g. a DIV with a solid background color) are clipped out. This means that all the crazy hacks people have been using to get content to appear above a windowed-mode plugin today (e.g., positioned overflow:auto elements and IFRAMEs) will no longer be needed with Gecko; you can just position a DIV or table with a solid background color. The hacks still work of course, as long as the hacked element has a solid background. (Credit where it's due: IIRC clipping holes for opaque Web content out of windowed plugins was first suggested by a KHTML developer long ago.)
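The rectangle arithmetic behind punching such holes is straightforward. Here's a sketch (the names and the rectangles-as-lists representation of regions are mine, not Gecko's actual region code):

```javascript
// Subtract an opaque rectangle from a plugin's clip rectangle, returning the
// still-visible pieces (up to four bands). Rectangles are {left, top, right, bottom}.
function subtractRect(clip, opaque) {
  const ix = { // intersection of the two rectangles
    left: Math.max(clip.left, opaque.left),
    top: Math.max(clip.top, opaque.top),
    right: Math.min(clip.right, opaque.right),
    bottom: Math.min(clip.bottom, opaque.bottom),
  };
  if (ix.left >= ix.right || ix.top >= ix.bottom) return [clip]; // no overlap
  const pieces = [];
  if (clip.top < ix.top) // band above the opaque area
    pieces.push({left: clip.left, top: clip.top, right: clip.right, bottom: ix.top});
  if (ix.bottom < clip.bottom) // band below
    pieces.push({left: clip.left, top: ix.bottom, right: clip.right, bottom: clip.bottom});
  if (clip.left < ix.left) // strip to the left
    pieces.push({left: clip.left, top: ix.top, right: ix.left, bottom: ix.bottom});
  if (ix.right < clip.right) // strip to the right
    pieces.push({left: ix.right, top: ix.top, right: clip.right, bottom: ix.bottom});
  return pieces;
}
```

Subtracting each opaque item's rectangle from the plugin's clip region in turn leaves exactly the region where the plugin actually shows through.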
This also fixes relative z-ordering of plugins with other plugins and Web content. We don't have to touch native widget z-ordering because windowed-mode plugins are opaque. So if plugin A is supposed to be drawn in front of plugin B, but its widget is behind plugin B's widget, then we effectively punch a hole in plugin B's widget so you can see plugin A through it. It sounds a little crazy but it works just fine.
Perhaps the greatest challenge here for us is getting scrolling to work smoothly on all platforms. In general we may have an IFRAME or overflow:auto element in the page, which contains one or more plugins which need to move smoothly with the scrolled content. A naive implementation would just bitblit the Web content and then move and clip the plugin widgets, creating a variety of exciting and unwanted visual artifacts. I designed a new cross-platform scrolling API which is quite rich; you pass in the rectangle you want to scroll, the amount you want to scroll it, and a list of commands to reconfigure child widgets that should be performed with the scroll "as atomically as possible". These commands let the caller change the position, size and clip region of some or all child widgets. This gives the platform-specific code great freedom to reorder operations. I described some of the implementation issues on X in a previous post.
I think I've got things working pretty well. My patches pass tests on all platforms on the try servers. There are 24 patches in my patch queue, many of them small preparatory cleanup patches. The big important patches enable region-based plugin clipping, modify a ton of code to remove the assumption that every document has a native widget as its root, actually remove the native widgets associated with content IFRAMEs, and remove the native widgets used for scrolling and implement the new scrolling system.
In order to not change too much at once, I've left some things undone. In particular "chrome documents" like the Firefox UI still use native widgets in quite a few places. There's a lot of followup simplifications that we can do, like eliminate the "MozDrawingArea" abstraction that we currently use to support guffaw scrolling. In fact, apart from fixing a ton of plugin and visual bugs, the main point of this work is to enable further changes.
For people who are interested, I've got some try-server builds available for testing. I'm particularly interested in IME and accessibility testing, since although the code passes tests I suspect external IME and AT code may be making assumptions about our native widget hierarchy that I'm breaking. General scrolling and plugin performance is also worth looking at.
I just landed on trunk a small improvement to page-load performance. Typical HTML documents contain a lot of markup like

<table>
  <tr>
    <td>Hello</td>
  </tr>
</table>

In the DOM, there are four text nodes which contain only whitespace. Up until now, in Gecko we created one "text frame" to render each text node. However, this is a waste, since whitespace-only text nodes before or after a block boundary don't affect the rendering; they are collapsed away (unless CSS white-space:pre or certain other unusual conditions apply).
So I did some measurements of our page-load-performance test suite (basically, versions of the front pages of popular Web sites) with an instrumented Gecko build to see how much could be saved if we avoid creating frames for those whitespace DOM nodes. It turns out that in those pages, 65% of all whitespace-only text frames were adjacent to a block boundary (i.e., the first child of a block, the last child of a block, or the next or previous sibling is not inline) and don't need a frame. That's 30% of all text frames, and 11% of all layout frames!
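The "adjacent to a block boundary" test itself is cheap. Roughly (a sketch with invented names and a toy node shape, nothing like the real nsCSSFrameConstructor code):

```javascript
// A whitespace-only text node needs no frame when it sits at a block boundary:
// it's the first or last child of a block, or it's beside a non-inline sibling.
// Nodes here are {text, prevSibling, nextSibling}; siblings are null or {isInline}.
function needsTextFrame(node) {
  if (!/^[ \t\r\n]*$/.test(node.text)) return true; // has real content
  const prev = node.prevSibling, next = node.nextSibling;
  if (prev === null || next === null) return false; // first/last child of its block
  if (!prev.isInline || !next.isInline) return false; // beside a block boundary
  return true; // whitespace between inlines can still render (word spacing)
}
```

The interesting engineering is not this predicate but keeping it true under dynamic DOM mutations, which is where most of the development time went.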
Actually suppressing the frames was pretty easy to implement --- it helped that Boris Zbarsky has recently done some magnificent hacking to make nsCSSFrameConstructor more sane. Handling dynamic changes correctly was a little tricky and actually took most of the development time. Our test suite, which is pretty huge now, caught some subtle bugs. But the patch is on trunk and seems to be a 1-2% improvement in average page load time. It's also a small reduction in memory usage. It also makes dumps of the layout frame tree much easier to read because a lot of pointless frames are gone.
What's probably more important than all that is that this makes it easier to separate our implementation of blocks into "blocks that contain only blocks" and "blocks (possibly anonymous) that contain only inlines" without regressing performance, because we won't have to create anonymous blocks just to contain irrelevant whitespace.
Tuesday, 23 June 2009
People keep asking why we don't integrate support for Windows' DirectShow into Firefox so the <video> element can play any media that the user has a DirectShow codec for. Even if a volunteer produces a patch, I would not want to ship it in Firefox in the near future; let me try to explain why.
- Probably most important: we want to focus our energy on promoting open unencumbered codecs at this time.
- Only a very small fraction of Windows users have a DirectShow codec for the most important encumbered codec, H.264. Windows 7 will be the first version of Windows to ship with H.264 by default. Even if millions of people have downloaded H.264 codecs themselves, that's a very small fraction of our users.
- DirectShow is underspecified and codecs are of highly variable quality. Many codecs probably will not work with Web sites that use all the rich APIs of <video>, and those bugs will be filed against us. We probably will not be able to fix them. (Note that the problem is bad enough that in Windows 7, Microsoft isn't even going to allow unknown third parties to install DirectShow codecs.) We could avoid some of this problem by white-listing codecs, but then a lot of the people who want DirectShow support wouldn't be satisfied.
- Many DirectShow codecs are actually malware. ("Download codec XYZ to play free porn!")
- DirectShow codecs are quite likely to have security holes. As those holes are uncovered, we will have to track the issues and often our only possible response will be to blacklist insecure codecs, since we can't fix them ourselves. If we blacklist enough codecs, DirectShow support becomes worthless.
- Each new video backend creates additional maintenance headaches as we evolve our internal video code.
So even if we didn't care about promoting unencumbered codecs and someone gave us a working patch, shipping DirectShow support in Firefox is of limited value and creates tons of maintenance work for us.
Some (but not all) of the same arguments apply to using Quicktime. But if we're going to ship our own video infrastructure for Windows, we save ourselves a lot of trouble by using that infrastructure across all platforms. It saves authors a lot of trouble too, due to less variability across Firefox versions.
Currently no browser or browser plugin vendor is using codec implementations they don't control. Apple does allow third-party codecs to be used with Quicktime in Safari --- but at least they control the framework and the important codecs, and apparently they have made some changes for Web video.
Friday, 19 June 2009
With the imminent release of Firefox 3.5 and the big step forward for unencumbered video and audio that this represents, there's been a lot of discussion about the merits of the free Ogg codecs vs the flagship encumbered codecs, especially H.264. Proponents of the encumbered codecs tend to focus on the technical advantages of patented techniques used in H.264 but not in Theora. But the real question that matters is this: at comparable bit rates, in real-world situations, do normal people perceive a significant quality advantage for H.264 over Theora? Because if they don't, theoretical technical advantages are worthless.
(It's not helpful to ask a video compression expert if they can see quality differences. Of course they can, they're trained to. That doesn't tell you what matters for the other 99.9999% of the population.)
One issue that has made a comparison difficult is that encoders are so tunable. People always say things like "oh, your H.264 would look much better if you set the J-frame quantization flux capacitor rank to 8". So Greg Maxwell had a stroke of genius and did a comparison using Youtube to do the encoding. Now, people who think the H.264 encoding is sub-par are placed in the difficult position of arguing that Youtube's engineers don't know how to encode H.264 properly.
In fact Youtube deliberately doesn't squeeze the most out of H.264 encoding. That's because in real life there are tradeoffs to consider beyond just quality and bit-rate, such as encoding cost and bit-rate smoothing. But that's fine, because these are the real-life situations we care about.
Maik Merten recently repeated Greg's experiment with different video in larger formats. In these tests, it seems pretty clear that there is no real advantage for H.264, or even that Theora is doing better.
We really need someone to do a scientific blind test with a wide pool of subjects and video clips (hello academic researchers!). But H.264 proponents have to demonstrate real-world advantages that justify surrendering software freedoms and submitting to MPEG-LA client and server license fees (not to mention the hassle of actually negotiating licenses and ensuring ongoing compliance).
Tuesday, 16 June 2009
Platforms like X and Win32 support trees of native windows. For example, an application might have a "top-level" window containing child windows representing controls such as buttons. In browsers, child windows are required for many kinds of plugins (e.g., most uses of Flash). Managing these child windows is quite tricky; they add performance overhead, and the native platform's support for event handling, z-ordering, clipping, graphical effects, etc., often is not a good match for the behaviours required for Web content, especially as the Web platform evolves. So I'm working hard to get rid of the use of child windows and I've made a lot of progress --- more about that later. However, we're stuck with using child windows for plugins, and recently I've been grappling with one of the hardest problems involving child windows.
It's very hard to make scrolling smooth in the presence of child windows for plugins, when all other child windows have been eliminated. In general you want to scroll (using accelerated blitting) some sub-rectangle of the document's window (e.g., an overflow:auto element), and some of the plugin windows will move while others won't. You may want to change the clip region of some of the plugins to reflect the fact that they're being clipped to the bounds of the overflow:auto element. The big problem is that to avoid flicker you want to move the widgets, change their clip regions, and blit the contents of the scrolled area in one atomic operation. If these things happen separately, the window is likely to display artifacts as it passes through undesired intermediate states. For example, if you blit the scrolled area first and then move the plugins, the plugins will appear to lag slightly behind as you scroll the window.
On Windows, ScrollWindowEx (http://msdn.microsoft.com/en-us/library/bb787593%28VS.85%29.aspx) with the SW_SCROLLCHILDREN flag gets you a long way. But on X, the situation is dire. X simply has no API to scroll the contents of a window including child windows! Toolkits like GTK resort to heroic efforts such as guffaw scrolling to achieve the effect, but those techniques don't let you scroll a sub-rectangle of a window, only the entire window. So for Gecko I have had to come up with something new.
Basically I want to use XCopyArea and have it copy the contents of the window *and* the contents of some of the child windows in the scrolled area, and then move the child windows into their new locations and change their clip regions (I'm using XShape for this) without doing any extra repainting. I tried a few things that didn't work before I hit on a solution:
1. Hide (unmap) all child windows that are going to be moved during the scroll
2. Do the XCopyArea
3. Set the clip region and position of all relevant child windows
4. Show (map) all child windows that we hid in step 1
By hiding the child windows during the XCopyArea, we trick the X server into treating the pixels currently on-screen for them as part of the parent window, so they are copied. By moving and setting the clip region while each child window is hidden, we also avoid artifacts due to a child window briefly appearing outside the area it's supposed to be clipped to. It *does* mean that when we scroll a window containing plugins, expose events are generated to repaint the contents of the plugins' child windows, so scrolling is slower than it could be. We might be able to avoid that by enabling backing store for the plugin windows.
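The four steps can be sketched like this. XUnmapWindow, XCopyArea, XShape and XMapWindow are the real Xlib entry points involved, but the toy recording "server" below and every name in it are my invention, purely to show the ordering:

```javascript
// A toy "X server" that just records calls, so the sequencing is visible.
const calls = [];
const x = {
  unmapWindow: w => calls.push(["unmap", w]),
  copyArea: (dx, dy) => calls.push(["copy", dx, dy]),
  configureWindow: (w, cfg) => calls.push(["configure", w, cfg]),
  mapWindow: w => calls.push(["map", w]),
};

// Scroll by delta, applying each child's new position/clip while it is hidden.
function scrollWithChildren(delta, childConfigurations) {
  const moved = childConfigurations.map(c => c.window);
  moved.forEach(x.unmapWindow); // 1: hide children so their pixels scroll too
  x.copyArea(delta.x, delta.y); // 2: blit the scrolled rectangle
  childConfigurations.forEach( // 3: reposition and re-clip while hidden
    c => x.configureWindow(c.window, {pos: c.pos, clip: c.clip}));
  moved.forEach(x.mapWindow); // 4: show them again
}
```

The whole point is that steps 1-4 happen back-to-back with no painting in between, so no intermediate state is ever visible.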
It's somewhat tricky to implement this. We have to compute the desired child window locations and clip regions, store the results in a collection, and pass that collection into our platform window abstraction's Scroll() API so that the scrolling and widget configuration can happen as a unified operation. But I've done it, and it works great on my Ubuntu VM. I'm not 100% sure that it will work well on all X servers and configurations; that's something we'll need to get good testing of.
I do wonder why X has never felt the need to grow a genuinely useful scrolling API. Oh well.
Tuesday, 26 May 2009
Since the MoCo all-hands at the beginning of the month, I've been mostly focused on getting 3.5 out the door. And that has mostly meant triaging and fixing a variety of video/audio bugs. I'm glad to report that that has gone very well! Over a couple of weeks we fixed a large number of bugs, a lot more than I expected we'd be able to get done for 3.5. Some highlights:
- We've put video/audio decoding and playback into their own threads, so decoding keyframes for very large videos (which can take hundreds of milliseconds) does not cause any skipping
- We fixed A/V sync to follow the hardware clock on Mac and Linux, fixing sync issues on those platforms
- By default we stop loading a video after we've got enough data to display the first frame, unless overridden by 'autoplay', 'autobuffer' or script requests; we also fire the 'suspend' event when appropriate
- We implemented support for Theora's non-square pixel aspect ratios
- We implemented support for Theora's clipping of the video display area
- We made the time scale for an Ogg stream always appear to start at 0 as far as scripts are concerned, even if the stream's internal timestamps start at a nonzero time
- We made seeking within data ranges already downloaded in the media cache pretty much instantaneous (seeking into unbuffered data is still slow unfortunately)
- We implemented seeking to any intermediate frame without visual artifacts
As well as all that, we also fixed a bunch of other stability and spec-compliance bugs. We seem to be in pretty good shape for open video in 3.5.
I took the weekend off (strange phrase, that) and our family took the train down to Wellington on Saturday and flew back today (Monday) morning. The trip down was great, saw a bit of snow in the volcanic plateau and lots of great countryside. Wellington's weather forecast was "cold southerly gales" and that's exactly what we got! It was exciting to experience Wellington at its Wellingtonest, especially when we took the bus to Lyall Bay and got blown away by the wild wind and waves there. It was a great trip.
This week I should be back working on compositor stuff. I've done a lot of work to change the way native widgets are handled and now I'm working on widget-less IFRAMEs. Hopefully some stuff will be ready for checkin soon --- I'll talk more about it then.
Friday, 8 May 2009
The MoCo All-Hands was tons of fun. I'm a bit annoyed that bits of our project planning ended up on Slashdot within a few days --- the perils of being open! Why doesn't this happen to other projects? Anyway, I felt the time was quite productive beyond just feeding Slashdot. With a couple of exceptions, meetings were goal-oriented and we were able to reach decisions. Of course it was also great to meet a number of new MoCo developers and other staff.
I also took the chance to meet up with friends, as I often do. So many of my friends have gravitated to the Bay Area that I can only visit a subset of them on each trip, which is a bit awkward! Still, it's great that I get to see them.
I drove up to Berkeley a couple of times. On Saturday I met up with my friend and we visited the Asian Art Museum in San Francisco, but spent most of the time sitting in the museum cafe talking about browser security! On Tuesday I met with a number of faculty, students and post-docs to talk about their research. One thing I kept emphasizing was that too much security research is really research in *insecurity* --- finding all kinds of new attacks! We already know there are lots of problems and at this point, we need more emphasis on defense. Unfortunately defense is harder and less amenable to crisp results. But we talked about how bug-finding technology could be used to speed up the bug fix cycle, by applying the technology to improve debugging and to search for regressions that would be caused by a patch. I'm hopeful we'll see progress in those areas.
I really enjoyed the trip, although I'm finding that every time away I miss my family a little bit more. The flights were good. Movie picoreviews:
- Yes Man: OK
- Quantum of Solace: pretty good
- Valkyrie: OK (I wanted to like it more for the history and the compelling situation, but there seemed to be some unexplained important jumps)
- Taken: mediocre
- Defiance: OK
- In Bruges: brilliant!
Saturday, 25 April 2009
On Sunday I'm heading off to California for the Mozilla Corporation all-hands meeting. I'll be in the Bay Area from Sunday afternoon until the evening of May 6, arriving back in NZ on the morning of May 8 thanks to crossing the date line. It should be fun! I'll be online at times and should be able to keep up with my regular code reviewing and other duties.
Techno-futurist poster boys like Ray Kurzweil and Hans Moravec have long predicted that increasing processor capacity will inevitably lead to "strong AI", and soon. But they overlooked the truth of computer science that making software do what you want is much more difficult than building the necessary hardware. Of course, one way around this is to emulate existing software on new hardware, and it looks like there is interesting progress in that direction.
What will it mean if we are able to effectively simulate a human brain in a computer, so that the mind actually works? The practical implications are huge: immortality for the simulated minds, cloning minds, potentially much faster cognition, easy long-distance space travel, some scary consequences for the simulatee if the machine is compromised. But it's unclear if we'd be able to make fundamental improvements to the capabilities of human cognition. We are dealing with an exceptionally large and strange piece of legacy code.
Philosophically, it would be a blow against some forms of dualism. That doesn't bother me because personally I'm not much of a dualist. I grew up with the assumption that the mind is what the brain does, and becoming a Christian did not require me to change that. God knows everyone's mental state at all times, so resurrection only requires that state to be copied into a new body. Any consciousness in the interregnum can be achieved by God doing the necessary processing himself. (Better thinkers than I have speculated along these lines.) The idea of a soul independent of the body that carries on on its own is not required in the Bible, as far as I can see, although the Gnostics probably would have held it.
I doubt we're anywhere close to actually achieving the simulated brain, however. We should expect many iterations of discovering that there's some subtle but important phenomenon not being simulated. And who knows, maybe Penrose will turn out to be right and some quantum phenomena will prove important...
Sunday, 12 April 2009
I was out of the office most of last week, at Auckland University attending NZCSRSC '09. It's a conference dedicated to presenting the work of CS grad students from around New Zealand. I didn't really know what it would be like, but once I heard about it I thought it would be interesting to attend, to do some low-level recruiting/networking and to get a feel for the NZ CS research scene. Other than that I had pretty low expectations. I'm glad to report my expectations were exceeded!
I had lots of chances to talk to students, most of whom had no prior knowledge of our little Mozilla operation here.
The keynotes were actually quite good ... I've seen plenty of rubbish keynotes at top conferences (often where some extremely senior leader in the field talks about work that was interesting twenty years ago...), but at this conference Alan Blackwell and JP Lewis actually provided fresh insights, the former on collaboration across academic disciplines and the latter on collaboration between academia and industry. These are topics that get talked about a lot but too often without drawing on actual experiences, as Blackwell and Lewis were able to.
The student talks were a mixed bag, as one would expect. Veronica Liesaputra's talk on book models was one that stood out to me, probably because it's quite relevant to the design of future CSS features. Another good one was Susanne Tak's user studies of task switching interfaces (e.g. Windows' Alt-Tab) which are probably relevant to browser tab switching. A lot of the other talks seemed to be solving non-problems, or tackling problems of unreasonably large scope, or doing surveys without real synthesis, or coding stuff up with no real research agenda, or just addressing problems of little interest. A pretty good sample of most CS research...
I think I was the only "industry" person there who wasn't actually sponsoring the conference, which made me feel a little awkward at times, but then it doesn't take much to make me feel awkward. I was definitely glad I went though.
Friday, 10 April 2009
Friday, 3 April 2009
I've just checked in the media data cache to Gecko trunk. It should appear in nightly builds tonight and on the 1.9.1 branch soon-ish. If you just can't wait to try it, there are try-server builds for Linux, Mac and Windows.
Media applications want fast, "on demand" random access to media data, for pausing, seeking, etc --- something like the standard "seek/read" OS file APIs. But we are primarily interested in transporting media data using HTTP over the Internet, which has high latency to open a connection, requires a new connection for every seek, may not even support seeking on some connections (especially live streams), and uses a push model --- data comes from the server and you don't have much control over the rate. Also, transferring data over the Internet can be slow and/or unpredictable, so we want to read ahead to buffer and cache as much data as possible.
The job of the media cache is to resolve this impedance mismatch. The media cache reads data from Necko channels into file-backed storage, and offers a random-access file-like API to the stream data (nsMediaCacheStream). Along the way it solves several problems:
- The cache intelligently reads ahead to prefetch data that may be needed in the future.
- The size of the cache is bounded so that we don't fill up storage with read-ahead data.
- The cache stores played data as well as read-ahead data so that backward seeks and replay can be served without redownload.
- Cache replacement is managed globally so that the most valuable data (across all streams) is retained.
- The cache can suspend Necko channels temporarily when their data is not wanted (yet).
- The cache translates file-like seek requests to HTTP seeks, including optimizations like not triggering a new seek if it would be faster to just keep reading until we reach the seek point, and adjusting the seek destination to avoid re-downloading data already in the cache. The "seek to EOF" idiom to determine file size is also handled efficiently (seeking to EOF and then seeking back to the previous offset does not trigger any Necko activity).
- The cache also handles the case where the server does not support seeking.
- Necko can only send data to the main thread, but nsMediaCacheStream can distribute data to any thread.
- The cache exposes APIs so clients can detect what data is currently cached.
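The seek-translation point above hides a small cost model: a logical seek should only become a new HTTP byte-range request when that is actually cheaper than continuing to read on the existing connection. Here's a minimal sketch of that decision (the function name, signature, and the `seek_latency` value are illustrative assumptions, not the actual Gecko code):

```python
def seek_strategy(current_offset, target_offset, bytes_per_second,
                  seek_latency=0.5):
    """Decide whether a logical seek should trigger a new HTTP
    byte-range request, or whether it's cheaper to keep reading
    forward to the target offset on the existing connection.
    seek_latency is an assumed connection-setup cost in seconds."""
    if target_offset < current_offset:
        # Can't read backwards on the channel; must issue a new request
        # (unless the data is already cached, which is handled elsewhere).
        return "new-request"
    read_time = (target_offset - current_offset) / bytes_per_second
    return "keep-reading" if read_time <= seek_latency else "new-request"
```

A small forward seek on a fast connection stays on the existing channel, while a long jump pays the connection-setup cost instead of downloading megabytes of unwanted data.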
(Our previous strategy was incredibly naive: just read all the incoming data from the HTTP stream and store it in memory, and throw it away after it has been played once; seeks throw away all data and create a new HTTP stream.)
Although HTTP is the most important transport and we only support transport-level seeking via HTTP byte-ranges, the media cache works with any kind of Necko channels and provides random access to cached data even for, e.g., FTP streams.
The media cache is not persistent. It does not currently allow data from one load to be used by other loads, either within the same browser session or across browser sessions.
The media cache is block-based. Streams are divided into blocks of a fixed size (currently 4K) and we cache blocks. A single cache contains blocks for all streams. The cache size is controlled by the media.cache_size preference (which is in KB). The default size is 50MB.
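The block addressing is just integer arithmetic over fixed-size blocks. A minimal sketch (hypothetical helper names, not the actual nsMediaCache code), using the 4K block size and the KB-denominated cache size preference described above:

```python
BLOCK_SIZE = 4096  # the fixed block size described above (4K)

def block_index(offset):
    """Map a stream byte offset to its block index."""
    return offset // BLOCK_SIZE

def block_range(start, end):
    """Block indices covering the byte range [start, end)."""
    if end <= start:
        return range(0)
    return range(block_index(start), block_index(end - 1) + 1)

def max_blocks(cache_size_kb=50 * 1024):
    """Number of blocks in the cache; the preference is in KB
    and the default is 50MB."""
    return (cache_size_kb * 1024) // BLOCK_SIZE
```

At the default size the single global cache holds 12800 blocks, shared among all streams.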
The replacement policy predicts a "time of next use" for each block in the cache. When we need to free a block, the block with the latest "time of next use" will be evicted. Blocks are divided into different classes, each class having its own predictor:
- FREE_BLOCK: these blocks are effectively infinitely far in the future; a free block will always be chosen for replacement before other classes of blocks.
- METADATA_BLOCK: these are blocks that contain data that has been read by the decoder in "metadata mode", e.g. while the decoder is searching the stream during a seek operation. These blocks are managed with an LRU policy; the "time of next use" is predicted to be as far in the future as the last use was in the past.
- PLAYED_BLOCK: these are blocks that have not been read in "metadata mode", and contain data behind the current decoder read point. (They may not actually have been read by the decoder, if the decoder seeked forward.) These blocks are managed with an LRU policy except that we add REPLAY_DELAY seconds of penalty to their predicted "time of next use", to reflect the uncertainty about whether replay will actually happen or not.
- READAHEAD_BLOCK: these are blocks that have not been read in "metadata mode" and that are entirely ahead of the current decoder read point. (They may actually have been read by the decoder in the past if the decoder has since seeked backward.) We predict the time of next use for these blocks by assuming steady playback and dividing the number of bytes between the block and the current decoder read point by the decoder's estimate of its playback rate in bytes per second. This ensures that the blocks farthest ahead are considered least valuable.
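The per-class predictors above can be sketched in a few lines. This is an illustrative model, not the actual Gecko code; in particular the REPLAY_DELAY value here is an assumed placeholder:

```python
REPLAY_DELAY = 30.0  # assumed penalty (seconds) for played blocks

def predict_metadata(now, last_use):
    # LRU: predicted to be needed as far in the future
    # as the last use was in the past.
    return now + (now - last_use)

def predict_played(now, last_use):
    # LRU plus a penalty reflecting uncertainty about replay.
    return now + (now - last_use) + REPLAY_DELAY

def predict_readahead(now, block_offset, read_point, playback_rate):
    # Assume steady playback: bytes ahead of the read point,
    # divided by the estimated playback rate in bytes/second.
    return now + (block_offset - read_point) / playback_rate

def pick_eviction(predictions):
    # Evict the block whose predicted next use is farthest away.
    return max(predictions, key=predictions.get)
```

For example, with a readahead block only 4 seconds ahead of the read point, a recently played block carrying the replay penalty is the eviction candidate, not the readahead data.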
For efficient prediction of the "latest time of next use", we maintain linked lists of blocks in each class, ordering blocks by time of next use. READAHEAD_BLOCKS have one linked list per stream, since their time of next use depends on stream parameters, but the other lists are global.
"Time of next use" estimates are also used for flow control. When reading ahead we can predict the time of next use for the data that will be read. If the predicted time of next use is later then the prediction for all currently cached blocks, and the cache is full, then we should suspend reading from the Necko channel.
Unfortunately suspending the Necko channel can't immediately stop the flow of data from the server. First our desire to suspend has to be transmitted to the server (in practice, Necko stops reading from the socket, which causes the kernel to shrink its advertised TCP receive window size to zero). Then the server can stop sending the data, but in the meantime we will still receive roughly the product of the link bandwidth and the round-trip latency. We deal with this by letting the cache overflow temporarily and then trimming it back, moving overflowing blocks into the body of the cache to replace less valuable blocks as they become available. We try to avoid simply discarding overflowing readahead data.
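The suspend rule itself is simple; a sketch (hypothetical function, not the actual code) of the decision described above:

```python
def should_suspend(cache_full, incoming_next_use, cached_next_uses):
    """Suspend reading when the cache is full and the data we would
    read next is predicted to be needed later than everything already
    cached -- reading it would only evict more valuable blocks."""
    if not cache_full:
        return False
    return all(incoming_next_use > t for t in cached_next_uses)
```

While the cache has free space, or while the incoming readahead data would be needed sooner than some cached block, the channel keeps running.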
All changes to the actual contents of the cache happen on the main thread, since that's where Necko's notifications happen.
The media cache maintains at most one Necko channel for each stream. (In the future it might be advantageous to relax this, e.g. so that a seek to near the end of the file can happen without disturbing the loading of data from the beginning of the file.) The Necko channel is managed through nsMediaChannelStream; nsMediaCache does not depend on Necko directly.
Every time something changes that might affect whether we want to read from a Necko channel, or whether we want to seek on the Necko channel --- such as data arriving or data being consumed by the decoder --- we asynchronously trigger nsMediaCache::Update on the main thread. That method implements most cache policy. It evaluates for each stream whether we want to suspend or resume the stream and what offset we should seek to, if any. It is also responsible for trimming back the cache size to its desired limit by moving overflowing blocks into the main part of the cache. The policy mechanism is fairly simple and stateless; it examines the current state of the cache and of all consumers, and decides what each stream should do at this moment (i.e., should the stream be trying to read data and if so, at what offset should it be trying to read).
Streams can be opened in non-seekable mode. In non-seekable mode, the cache will only call nsMediaChannelStream::CacheClientSeek with a 0 offset. The cache tries hard not to discard readahead data for non-seekable streams, since that could trigger a potentially disastrous re-read of the entire stream. It's up to cache clients to try to avoid requesting seeks on such streams.
The media cache is primarily designed to support HTML5 media elements, but potentially it could be useful in other places where we want a mix of random and sequential access to large files accessed over HTTP.
Tuesday, 24 March 2009
I've finally started using mq (Mercurial Queues) routinely. It fits the way I work very well. When I have several bugs on the boil I can now easily track the individual patches. I can also easily work with big changes that are factored into multiple patches. When one piece of work depends on other pieces whose patches haven't landed yet, it's so easy to put the prerequisite patches in the queue and keep making progress without getting into patch management hell (or branch merge hell). This is increasing my productivity, and it's fun too.
Another great development is tests on the try-servers. Now you can upload a patch to the try-server farm and it will build the patch on all three tier-1 platforms, produce installation packages, and run all our automated tests on the tier-1 platforms!
What makes it all really seductive is "push-to-try". If you've already got your tree patched with some mq patches, all you have to do is
hg push -f ssh://hg.mozilla.org/try/ and eventually all the builds and test results will appear automatically! It couldn't be easier.
The only problem is that the try-servers are now so convenient and useful, they're constantly swamped with developers taking advantage of them :-). That's a good problem to have. As Jonas noted the other day, we don't need a server pool, now we need a server ocean. A success disaster in the making! :-)
Sunday, 22 February 2009
This video from PDC 2008 is interesting; Alex Mogilevsky talks about IE8's new layout engine. It was generous of him to mention that the subpixel units and zoom approach first appeared in Firefox! It's interesting to get a little glimpse under the hood. It sounds like they have something much like our frame display lists. The glimpse we got of their "layout object dump" and "display list dump" output was quite revealing if you know what you're looking for, but I'll leave that as an exercise for the reader :-).
On the flip side, the decision to make the layout code reusable for apps like Word seems dubious, and I fancy I could sense him doubting it too. And Alex seemed remarkably sanguine about supporting an IE8-mode in IE9; "putting an 'if' statement around every bug fix" is far from the whole truth and would itself be a big maintenance burden. I stand by my skepticism of their whole approach here.
Saturday, 21 February 2009
Daniel Glazman writes, somewhat in passing:
We can now apply SVG filters to all content, including HTML 4, in Firefox but I have mixed feelings about the fact you have to know HTML, CSS and SVG to apply such a filter. Seems to me overkill, and too far away from the original spirit of the Web.
He didn't enable comments on that post, so I'll reply here: you don't need to know SVG to use filters. You can just paste filter code from a cookbook into an XML file --- or use an XML file from a library --- and refer to that file from your CSS and HTML. This is not only easy and reasonable; SVG filters provide a rich vocabulary of composable operators, so they're much more flexible and powerful than providing ten canned filter keywords to handle all the author needs that the CSS WG can think of.
You could perhaps reinvent SVG filters with CSS syntax, but I don't see that as a real win for author understanding.
Generally in computer science we prefer to create a set of composable primitives with clean semantics, on top of which libraries are built to make frequent tasks easy, over trying to anticipate every possible need and encoding a feature for each one. Unfortunately CSS has often taken the latter course ... partly because the Web's mechanisms for code reuse are weak. But external document references give us a reuse mechanism that makes using a library of SVG effects quite convenient, IMHO.
Friday, 20 February 2009
Today I had a fun lunch with Web developers at Shift (not all of whom work there). It was a follow-up on a short discussion at Kiwi Foo about columns on the Web. My goal was to brainstorm with some Web developers/graphic designers/user experience people about ways to display lots of text on large screens, and in particular to tackle the difficult problem of what to do when the text can't all fit.
We talked through various options. There was quite a lot of interest in using fixed-height columns and adding columns horizontally for overflowing content, and having users scroll horizontally to see the content. This is pretty well supported by the CSS3 columns spec and the Gecko implementation, although to make it more useful we'd have to work on horizontal scrolling UI and implement viewport units so an element's height can be related to the window size.
There was also quite a lot of interest in using fixed-height columns but adding overflowing columns vertically below the first set of columns, giving the effect of a set of pages laid out top to bottom. This is currently not supported by the spec or our implementation, although it wouldn't be too hard to add. This would require more UI work, to make it easy to scroll up or down by exactly the height of one of these pages. Continuous scrolling mechanisms like mousewheels would be hard to integrate.
Either of those approaches could be modified to try to avoid scrollbars and instead offer a UI based on pagination --- "next page", "previous page" buttons or even "next column", "previous column".
Another idea that Ross Howard raised, one I hadn't thought about, is to have scrolling actually happen *within* columns. So there's a fixed number of columns of fixed height; if content overflows, then initially it's not visible beyond the last column. But there's a UI "scroll down" action which removes content from the first column and pulls the rest of the content "up" through the columns to fit.
A good point that was raised during these discussions is that smooth scrolling is important to help users not lose their place during scrolling. It's not clear how that would work with the scroll-within-columns idea, though, since moving content within columns has to change the layout at column boundaries to choose tasteful column break positions.
It would be cool to provide support for a variety of options, but I need to think about the right syntax for expressing them and how hard various options would be to implement. Adding support for viewport units would be useful yet relatively easy (a brand-new volunteer, Keith Rarick, implemented 'rem' units recently). Adding some kind of "series of pages" feature would have many uses and wouldn't be too hard to implement; it's a lot like columns that stack vertically instead of horizontally. On the other hand the "scroll within columns" feature needs a lot more thought. The scrolling UI issues need more thought too.
One person asked me why I was interested in this problem. Partly it's because Web developers want a solution. Partly it's because screens have been getting bigger, on the desktop anyway, and letting users use them effectively serves users better. It's also partly because proprietary platforms like Flash and WPF have some of these features and the open Web needs to keep up.
We had side discussions about fonts, typography and analytics. These Web developers would really love some data about how certain browser commands are used by visitors to their sites.
Thanks to Ross and co for a fun and helpful discussion, and the pizza!
I forgot to mention this earlier, but Boris Zbarsky a while ago implemented SVG external document references, so that SVG features such as filter, mask, clipPath, use, fill, stroke, marker, and textPath can refer to elements in other documents. A nice benefit is that our support for SVG effects for HTML can now be very easily applied to Plain Old HTML documents. For example, check out this example in a Firefox 3.1 trunk build. The key piece of magic is
Wednesday, 18 February 2009
This is the scariest thing I've read in a long time. If the leading lights in synthetic biology are this naive, humanity is toast. The idea that we can prevent abuse while democratising access, by bonding researchers into a loving utopian community, is laughable. The comparison to computers is completely inappropriate. Thomas Lord's comment near the bottom is a pretty good summary of the problems here.
I especially dislike the rhetorical technique that treats suppressing a technology as de facto absurd. On the contrary, if a technology poses a serious danger to humanity, then it absolutely should be suppressed, or at least tightly controlled and not "democratised". We've done it for nuclear weapons with moderate success.
God have mercy on us.
I hear a lot about online "conversations". In blogs, in message boards, in email, people seem to want to have a lot of "conversations". Around Foo Camp, there was talk about how to "keep the conversations going". Even faceless corporate entities want to have "conversations" with me.
I do not want "conversations". Conversations are part of the information overload that saps my productivity. I want to engage, communicate, resolve, and move on. If we keep needing to have conversations then we are not communicating or something else is not working.
Exception: family members and friends, this does not apply to you.