Friday 14 September 2007
My mate Ras Bodik has just produced a blog covering his new "parallel browsing" project. It's a great read, especially the slides (which don't display correctly in Mac's Preview, but do in Acrobat Reader). There are some great observations and ideas in there, plus some neat preliminary results about parallel lexing. The bottom line is that future generations of devices will need a web browser that can efficiently utilize tens to hundreds of CPU cores.
This is extremely challenging for many reasons:
- Existing browser engines are written in C++, which offers only the classic multithreaded parallel programming model: threads, shared memory, locks, semaphores, etc., plus hand-rolled lock-free primitives. These tools are very difficult to use correctly and mostly don't scale very well. There is no direct support for optimistic concurrency.
- Amdahl's Law means that cherry-picking a few optimization opportunities won't be enough to scale to a large number of cores. We will have to weave parallelism into every stage of the browser's processing.
- It's not at all clear how much of what the browser does can be parallelised. Ras' group's work on parallel lexing is a small first step. Can HTML5 parsing be parallelised, and if so, how? How about JS/DOM manipulation? Layout? Rendering? The programming model we expose to Web developers is inherently single-threaded, and I don't think we should change that.
- Massive changes to browser code (including rewriting it in a different language) are very risky, and can damage both compatibility and credibility.
- Our debugging and profiling tools are already crappy for mostly single-threaded code. Debugging and profiling parallel code will be much harder.
- Browsers rely on a lot of platform library code. That code also needs to be refitted for scalable parallelism or it will become a bottleneck.
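To make the first point concrete, here's a toy sketch of the classic lock-based model (in Python rather than C++, purely for brevity; the names are mine, not from any browser codebase). Every thread funnels through one lock around the shared state, which is easy to get subtly wrong and, even when correct, means extra cores mostly buy you extra contention:

```python
import threading

counter = 0
lock = threading.Lock()

def worker(increments):
    """Each thread bumps a shared counter under a single global lock."""
    global counter
    for _ in range(increments):
        # All threads serialise here: adding cores adds
        # contention, not throughput.
        with lock:
            counter += 1

threads = [threading.Thread(target=worker, args=(10_000,)) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # -> 80000 (correct, but fully serialised)
```

Drop the lock and the answer becomes nondeterministic; keep it and you've serialised the hot path. That's the dilemma in miniature, and it's why models with direct support for optimistic concurrency look attractive.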
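The Amdahl's Law point is worth making numerically. The law bounds speedup by the fraction of work you've parallelised: with fraction p parallel across n cores, speedup is 1 / ((1 - p) + p/n). A quick sketch (the function name is mine):

```python
def amdahl_speedup(p, n):
    """Upper bound on speedup when fraction p of the work
    runs perfectly in parallel on n cores."""
    return 1.0 / ((1.0 - p) + p / n)

# Parallelising 90% of the work caps 100 cores at under 10x:
print(round(amdahl_speedup(0.90, 100), 2))  # -> 9.17
# Even 99% parallel only gets you halfway:
print(round(amdahl_speedup(0.99, 100), 2))  # -> 50.25
```

So a browser that parallelises only its lexer, or only its rendering, leaves most of a hundred cores idle; the serial residue dominates, which is why every stage has to participate.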
Bottom line: writing a good browser engine is already really hard. Requiring scalable parallelism adds a whole new dimension of engineering difficulty, and poses a bunch of research-level questions which we don't yet have answers for.
I plan to blog about my thoughts on these issues and how they might be solved. But my impression is that it will be very difficult for existing browsers to move into multicore territory, due to language choices, the difficulty of refitting existing code, compatibility with existing code consumers, a desire to not compromise single-core performance, and the enormous risk of a large investment in a long development cycle with no guarantee of success.
I believe there is a huge disruption opportunity here: an effort focused on building a new parallel browser, with no interest in single-core performance, which can rethink all engineering choices, could start by filling a niche and eventually dominate. It could also flame out spectacularly! It would be great territory for a startup, if VCs weren't obsessed with creating the next great social networking site. It's good territory for research --- Ras is on the money. I do hope that if such a disruptive effort wins, the code is open source. I'm not sure what can be done to help ensure that.