On Sunday I'm heading off to California for the Mozilla Corporation all-hands meeting. I'll be in the Bay Area from Sunday afternoon until the evening of May 6, arriving back in NZ on the morning of May 8 thanks to crossing the date line. It should be fun! I'll be online at times and should be able to keep up with my regular code reviewing and other duties.
Techno-futurist poster boys like Ray Kurzweil and Hans Moravec have long predicted that increasing processor capacity will inevitably lead to "strong AI", and soon. But they overlooked the truth of computer science that making software do what you want is much more difficult than building the necessary hardware. Of course, one way around this is to emulate existing software on new hardware, and it looks like there is interesting progress in that direction.
What will it mean if we are able to effectively simulate a human brain in a computer, so that the mind actually works? The practical implications are huge: immortality for the simulated minds, cloning minds, potentially much faster cognition, easy long-distance space travel, some scary consequences for the simulatee if the machine is compromised. But it's unclear if we'd be able to make fundamental improvements to the capabilities of human cognition. We are dealing with an exceptionally large and strange piece of legacy code.
Philosophically, it would be a blow against some forms of dualism. That doesn't bother me because personally I'm not much of a dualist. I grew up with the assumption that the mind is what the brain does, and becoming a Christian did not require me to change that. God knows everyone's mental state at all times, so resurrection only requires that state to be copied into a new body. Any consciousness in the interregnum can be achieved by God doing the necessary processing himself. (Better thinkers than I have speculated along these lines.) The idea of a soul independent of the body that carries on on its own is not required in the Bible, as far as I can see, although the Gnostics probably would held it.
I doubt we're anywhere close to actually achieving the simulated brain, however. We should expect many iterations of discovering that there's some subtle but important phenomenon not being simulated. And who knows, maybe Penrose will turn out to be right and some quantum phenomena turn out to be important...
Sunday, 12 April 2009
I was out of the office most of last week, at Auckland University attending NZCSRSC '09. It's a conference dedicated to presenting the work of CS grad students from around New Zealand. I didn't really know what it would be like, but once I heard about it I thought it would be interesting to attend, to do some low-level recruiting/networking and to get a feel for the NZ CS research scene. Other than that I had pretty low expectations. I'm glad to report my expectations were exceeded!
I had lots of chances to talk to students, most of whom had no prior knowledge of our little Mozilla operation here.
The keynotes were actually quite good ... I've seen plenty of rubbish keynotes at top conferences (often where some extremely senior leader in the field talks about work that was interesting twenty years ago...), but at this conference Alan Blackwell and JP Lewis actually provided fresh insights, the former on collaboration across academic disciplines and the latter on collaboration between academia and industry. These are topics that get talked about a lot but too often without drawing on actual experiences, as Blackwell and Lewis were able to.
The student talks were a mixed bag, as one would expect. Veronica Liesaputra's talk on book models was one that stood out to me, probably because it's quite relevant to the design of future CSS features. Another good one was Susanne Tak's user studies of task switching interfaces (e.g. Windows' Alt-Tab) which are probably relevant to browser tab switching. A lot of the other talks seemed to be solving non-problems, or tackling problems of unreasonably large scope, or doing surveys without real synthesis, or coding stuff up with no real research agenda, or just addressing problems of little interest. A pretty good sample of most CS research...
I think I was the only "industry" person there who wasn't actually sponsoring the conference, which made me feel a little awkward at times, but then it doesn't take much to make me feel awkward. I was definitely glad I went though.
Friday, 10 April 2009
Friday, 3 April 2009
I've just checked in the media data cache to Gecko trunk. It should appear in nightly builds tonight and on the 1.9.1 branch soon-ish. If you just can't wait to try it, there are try-server builds for Linux, Mac and Windows.
Media applications want fast, "on demand" random access to media data, for pausing, seeking, etc --- something like the standard "seek/read" OS file APIs. But we are primarily interested in transporting media data using HTTP over the Internet, which has high latency to open a connection, requires a new connection for every seek, may not even support seeking on some connections (especially live streams), and uses a push model --- data comes from the server and you don't have much control over the rate. Also, transferring data over the Internet can be slow and/or unpredictable, so we want to read ahead to buffer and cache as much data as possible.
The job of the media cache is to resolve this impedance mismatch. The media cache reads data from Necko channels into file-backed storage, and offers a random-access file-like API to the stream data (nsMediaCacheStream). Along the way it solves several problems:
- The cache intelligently reads ahead to prefetch data that may be needed in the future.
- The size of the cache is bounded so that we don't fill up storage with read-ahead data.
- The cache stores played data as well as read-ahead data so that backward seeks and replay can be served without redownload.
- Cache replacement is managed globally so that the most valuable data (across all streams) is retained.
- The cache can suspend Necko channels temporarily when their data is not wanted (yet).
- The cache translates file-like seek requests to HTTP seeks, including optimizations like not triggering a new seek if it would be faster to just keep reading until we reach the seek point, and adjusting the seek destination to avoid re-downloading data already in the cache. The "seek to EOF" idiom to determine file size is also handled efficiently (seeking to EOF and then seeking back to the previous offset does not trigger any Necko activity).
- The cache also handles the case where the server does not support seeking.
- Necko can only send data to the main thread, but nsMediaCacheStream can distribute data to any thread.
- The cache exposes APIs so clients can detect what data is currently cached.
(Our previous strategy was incredibly naive: just read all the incoming data from the HTTP stream and store it in memory, and throw it away after it has been played once; seeks throw away all data and create a new HTTP stream.)
Although HTTP is the most important transport and we only support transport-level seeking via HTTP byte-ranges, the media cache works with any kind of Necko channels and provides random access to cached data even for, e.g., FTP streams.
The media cache is not persistent. It does not currently allow data from one load to be used by other loads, either within the same browser session or across browser sessions.
The media cache is block-based. Streams are divided into blocks of a fixed size (currently 4K) and we cache blocks. A single cache contains blocks for all streams. The cache size is controlled by the media.cache_size preference (which is in KB). The default size is 50MB.
The replacement policy predicts a "time of next use" for each block in the cache. When we need to free a block, the block with the latest "time of next use" will be evicted. Blocks are divided into different classes, each class having its own predictor:
- FREE_BLOCK: these blocks are effectively infinitely far in the future; a free block will always be chosen for replacement before other classes of blocks.
- METADATA_BLOCK: these are blocks that contain data that has been read by the decoder in "metadata mode", e.g. while the decoder is searching the stream during a seek operation. These blocks are managed with an LRU policy; the "time of next use" is predicted to be as far in the future as the last use was in the past.
- PLAYED_BLOCK: these are blocks that have not been read in "metadata mode", and contain data behind the current decoder read point. (They may not actually have been read by the decoder, if the decoder seeked forward.) These blocks are managed with an LRU policy except that we add REPLAY_DELAY seconds of penalty to their predicted "time of next use", to reflect the uncertainty about whether replay will actually happen or not.
- READAHEAD_BLOCK: these are blocks that have not been read in "metadata mode" and that are entirely ahead of the current decoder read point. (They may actually have been read by the decoder in the past if the decoder has since seeked backward.) We predict the time of next use for these blocks by assuming steady playback and dividing the number of bytes between the block and the current decoder read point by the decoder's estimate of its playback rate in bytes per second. This ensures that the blocks farthest ahead are considered least valuable.
For efficient prediction of the "latest time of next use", we maintain linked lists of blocks in each class, ordering blocks by time of next use. READAHEAD_BLOCKS have one linked list per stream, since their time of next use depends on stream parameters, but the other lists are global.
"Time of next use" estimates are also used for flow control. When reading ahead we can predict the time of next use for the data that will be read. If the predicted time of next use is later then the prediction for all currently cached blocks, and the cache is full, then we should suspend reading from the Necko channel.
Unfortunately suspending the Necko channel can't immediately stop the flow of data from the server. First our desire to suspend has to be transmitted to the server (in practice, Necko stops reading from the socket, which causes the kernel to shrink its advertised TCP receive window size to zero). Then the server can stop sending the data, but we will receive data roughly corresponding to the product of the link bandwidth multiplied by the round-trip latency. We deal with this by letting the cache overflow temporarily and then trimming it back by moving overflowing blocks back into the body of the cache, replacing less valuable blocks as they become available. We try to avoid simply discarding overflowing readahead data.
All changes to the actual contents of the cache happen on the main thread, since that's where Necko's notifications happen.
The media cache maintains at most one Necko channel for each stream. (In the future it might be advantageous to relax this, e.g. so that a seek to near the end of the file can happen without disturbing the loading of data from the beginning of the file.) The Necko channel is managed through nsMediaChannelStream; nsMediaCache does not depend on Necko directly.
Every time something changes that might affect whether we want to read from a Necko channel, or whether we want to seek on the Necko channel --- such as data arriving or data being consumed by the decoder --- we asynchronously trigger nsMediaCache::Update on the main thread. That method implements most cache policy. It evaluates for each stream whether we want to suspend or resume the stream and what offset we should seek to, if any. It is also responsible for trimming back the cache size to its desired limit by moving overflowing blocks into the main part of the cache. The policy mechanism is fairly simple and stateless; it examines the current state of the cache and of all consumers, and decides what each stream should do at this moment (i.e., should the stream be trying to read data and if so, at what offset should it be trying to read).
Streams can be opened in non-seekable mode. In non-seekable mode, the cache will only call nsMediaChannelStream::CacheClientSeek with a 0 offset. The cache tries hard not to discard readahead data for non-seekable streams, since that could trigger a potentially disastrous re-read of the entire stream. It's up to cache clients to try to avoid requesting seeks on such streams.
The media cache is primarily designed to support HTML5 media elements, but potentially it could be useful in other places where we want a mix of random and sequential access to large files accessed over HTTP.