Sunday, 29 January 2012

Mozilla Tree Adventures

Last week we had a few overseas Mozillians stop by the office on their way home. We took the opportunity to hold our annual Mozilla Auckland outdoors event, which for only the second time ever did not involve volcanoes. Instead we went to Tree Adventures out in Woodhill Forest. Basically you strap on safety gear, climb ladders into the trees (up to 14 metres above the ground) and conquer a variety of obstacle courses on wires from tree to tree. Flying foxes are also involved. It's tons of fun, unless you have a fear of heights as I do, in which case it's tons of fun mixed with just as much sheer terror. I was glad to be able to overcome my instincts and keep going; that sort of mental discipline is worth practising. I just hope none of my colleagues noticed me whimpering.

We followed up with a nice lunch at Hallertau.

Wednesday, 18 January 2012

You Know You're In Australia When...

... you take a short walk after dinner and encounter a mob of kangaroos.

Tuesday, 17 January 2012

MediaStreams Processing Demos

I'm at a conference at the moment (until Wednesday) and yesterday I attended the browsers miniconf. It went well, better than I expected. I had a slot to talk about the MediaStreams Processing API proposal to enable advanced audio effects (and much more!) in browsers, which has been my main project for the last several months (see my earlier post here). I worked frantically up to the last minute to create demos of some of the most interesting features of the API, and to get my implementation into a state where it could run the demos. By the grace of God I was successful :-). Even more graciously, the audio in the conference room worked and even played my stereo effects properly!

I have made available experimental Windows and Mac Firefox builds with most of the MediaStreams Processing API supported. (But the Mac builds are completely untested!) The demos are here. Please try them out! I hope people view the source, modify the demos and play with the API to see what can be done. Comments on the API should go to me or to the W3C Audio Working Group.

I must apologise for the uninspired visual design and extraordinarily naive audio processing algorithms. Audio professionals who view the source of my worker code will just laugh --- and hopefully be inspired to write better replacements :-). Making that easy for anyone to do is one of my goals.
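To give a taste of what that "extraordinarily naive" worker code does, here's a deliberately simple gain effect operating on a raw buffer of float samples. This is plain JavaScript of my own invention, not part of the proposed API; the function name and buffer layout are assumptions for illustration only, but per-sample loops like this are roughly the level of processing involved:

```javascript
// Apply a simple gain (volume scale) to a buffer of float audio samples.
// This is about as naive as audio processing gets: multiply every sample
// by a constant. Real effects would do filtering, convolution, etc.
function applyGain(samples, gain) {
  var out = new Float32Array(samples.length);
  for (var i = 0; i < samples.length; i++) {
    out[i] = samples[i] * gain;
  }
  return out;
}

// Example: halve the volume of a tiny three-sample buffer.
var halved = applyGain(new Float32Array([0.5, -0.5, 1.0]), 0.5);
// halved is [0.25, -0.25, 0.5]
```

Anyone who can write a loop like this can write an audio effect, which is exactly the point of making JS processing a first-class citizen.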

Some of the things I like about this API:

  • First-class support for JS-based processing. In particular, JS processing off the main thread, using Workers. This lets people build whatever effects they want and get reasonable performance. Soon we'll have something like Intel's River Trail in browsers and then JS users will be able to get incredible performance.
  • Leverages MediaStreams. Ongoing work on WebRTC and elsewhere is introducing MediaStreams as an abstraction of real-time media, and linking them to sources and sinks to form a media graph. I don't think we need another real-time media graph in the Web platform.
  • Allows processing of various media types. MediaStreams currently carry both audio and video tracks. At the moment the API only supports processing of the audio because we don't have graphics APIs available in Workers to enable effective video processing, but that will change. Applications will definitely want to process video in real time (e.g. QR code recognizer, motion detection and other "augmented reality" applications). Soon we'll want Kinect depth data and other kinds of real-time sensor data.
  • First-class synchronization. Some sources and effects have unbounded latency. We want to make sure we maintain A/V sync in the face of latency or dynamic graph changes. This should be automatic so authors don't have to worry about it.
  • Support for streams with different audio sample rates and channel configurations in the same graph. This is important for efficient processing when you have a mix of rates and some of them are low. (All inputs to a ProcessedMediaStream are automatically resampled to the same rate and number of channels to simplify effect implementations.)
  • No explicit graph or context object. It's not needed.
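To make the "all inputs converted to a common format" point concrete, here's a sketch of the mixdown step that combining multiple inputs implies: once every input shares a sample rate and channel count, mixing is just per-sample summation. This is plain illustrative JavaScript of my own, not code from the API or my implementation:

```javascript
// Mix N equal-length audio buffers into one output buffer by summing
// corresponding samples. This only works because the inputs have
// already been resampled to the same rate and channel configuration.
function mixBuffers(buffers) {
  var length = buffers[0].length;
  var out = new Float32Array(length);
  for (var b = 0; b < buffers.length; b++) {
    for (var i = 0; i < length; i++) {
      out[i] += buffers[b][i];
    }
  }
  return out;
}

// Example: mixing two two-sample inputs.
var mixed = mixBuffers([
  new Float32Array([0.1, 0.2]),
  new Float32Array([0.3, -0.1])
]);
// mixed is approximately [0.4, 0.1]
```

Normalizing inputs up front is what lets effect implementations stay this simple instead of each one handling rate conversion itself.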

Most of the features in the proposed spec are implemented. Notable limitations:

  • "blockInput" and "blockOutput" are not implemented; there is no way for streams to opt out of being synchronized. For example, it would be nice to be able to pipe a media resource into a processing node so that if the resource pauses (e.g. due to a network delay), the processing node doesn't block but simply treats the paused input as silence. This is probably the trickiest feature not yet implemented.
  • No support for "live" streams. Similar to above, if a stream feeds into an output node that is blocked, we sometimes don't want to buffer the input stream. E.g. if the input is a live webcam you often (but not always) want to throw away buffered data so that when the output unblocks it immediately gets the latest video frames.
  • There has been very little tuning to optimize throughput and latency, especially across a range of devices. This will be a lot of work.
  • In general the API is very lightly tested. I'm sure there are lots of bugs.
  • Video elements don't play in sync with streams captured from them. In my demos I worked around this by hiding the source video elements and creating new video elements to play the video via the stream. Fixing this bug would simplify the demos a bit.
  • Canvas video sources are not implemented.
  • The built-in audio resampler is stupendously naive and needs to be replaced.
  • Multiple audio and video tracks, and the MediaStream track API, are not yet supported.
  • ProcessedMediaStreams using JS workers still need checks ensuring that all upstream media sources are same-origin.
  • The biggest limitation is that it's not shipping in Firefox yet. My giant patch is messy and a lot of cleanup needs to be done. I have a plan to split the patch up, clean up the pieces and land them piecemeal. In particular I need to get some of the infrastructure landed ASAP to help the WebRTC team make progress. (When we ship it, much or all of the API will probably be disabled by default, behind a hidden pref, until the standards situation is resolved.)
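As an illustration of just how simple a "stupendously naive" resampler can be, here's a linear-interpolation sketch in plain JavaScript. This is my own illustrative code, not the resampler in the patch; a serious resampler would apply proper low-pass filtering to avoid aliasing:

```javascript
// Resample a mono buffer from one sample rate to another using linear
// interpolation between neighbouring samples. Naive: no anti-aliasing
// filter, so downsampling in particular will sound bad.
function resampleLinear(samples, fromRate, toRate) {
  var outLength = Math.floor(samples.length * toRate / fromRate);
  var out = new Float32Array(outLength);
  var ratio = fromRate / toRate;
  for (var i = 0; i < outLength; i++) {
    var pos = i * ratio;
    var j = Math.floor(pos);
    var frac = pos - j;
    // Clamp at the end of the buffer instead of reading past it.
    var next = (j + 1 < samples.length) ? samples[j + 1] : samples[j];
    out[i] = samples[j] * (1 - frac) + next * frac;
  }
  return out;
}

// Example: upsample a 4-sample ramp from 22050 Hz to 44100 Hz.
var up = resampleLinear(new Float32Array([0, 1, 2, 3]), 22050, 44100);
// up has 8 samples: [0, 0.5, 1, 1.5, 2, 2.5, 3, 3]
```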

    Update: I've updated the build links to point to new builds with improved performance (faster JS execution in workers due to type inference being turned on, and fewer control loop wakeups due to more intelligent buffering decisions for ProcessedMediaStreams).

Wednesday, 11 January 2012

"Cut The Rope" and HTML5 Audio

Microsoft released an HTML5 version of Cut The Rope which is pretty cool. Unfortunately they use Flash audio by default for Firefox users because, they say, "some Firefox users could have run into an audio problem but will notice we fall back to a flash plugin to ensure that sound effects and music will work." They don't mention specific Firefox bugs (although they do for Chrome), and when I try the HTML5 audio version it works fine for me. So, please try the HTML5 version in Firefox (release or nightly), and if it doesn't work let me know and file bugs! Thanks!

Saturday, 7 January 2012

Risk Tolerance

Wise words:

John McGlashan principal Michael Corkery said: "He was doing what every kid should be doing, playing with his friends in a river."

Everybody had done everything right, but Dion still died.

"It was just a tragedy, and tragedies happen. Life sometimes deals out bad luck but we have to get on with it. We'll never forget him."

Too often, people respond to a child's tragic death by setting up a pressure group, foundation or new law to Make Sure This Never Happens Again.