I've fixed a number of bugs in the new textframe code. It's now very usable; I've been dogfooding it most of this week and some other people have too. I think we're on track for turning it on in the next few weeks. Set MOZ_ENABLE_NEW_TEXT_FRAME = 1 in layout/generic/Makefile.in for a good time. There are still some problems; for example, word selection is a little bit broken.
Much of this week I was looking at white-space handling. I've been cleaning up tricky issues like where we allow line breaks in white-space:pre-wrap text. A big thing in the last two days was handling pre/pre-wrap tabs: our existing code expanded each tab to a run of space characters whose length depended on how many characters preceded it on the line. CSS 2.1 wants us to advance to tab stops instead, which works better with proportional fonts. I've implemented that, among other fixes, and we should now be very compliant with CSS 2.1 white-space.
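To make the tab-stop behaviour concrete, here's a minimal sketch of the computation; this is illustrative code of mine, not the actual Gecko implementation, and I'm assuming tab stops every eight space-advances (the conventional choice):

```
#include <cmath>

// Advance to the next tab stop instead of emitting N space characters.
// Tab stops sit at multiples of 8 space-advances, so columns line up even
// with proportional fonts, where the widths of the preceding glyphs vary.
double NextTabStop(double aCurrentX, double aSpaceAdvance)
{
  double interval = 8.0 * aSpaceAdvance;
  // Next multiple of the interval strictly past the current position;
  // note a tab landing exactly on a stop advances a full interval.
  return (std::floor(aCurrentX / interval) + 1.0) * interval;
}
```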
One remaining issue is that our white-space collapsing pass currently uses only backward context; for example, we collapse a space that has a space before it. However, we can't collapse a space that is followed by, say, a line feed, as support for white-space:pre-line would require; that needs knowledge of the forward context. Similarly, we should translate a newline between two CJK characters into a zero-width space, but in general that also requires forward context. (Right now we do this only when both CJK characters are in the same text node.) Backward context is easy to implement because you can use a little state machine as you traverse the DOM or the text frame tree or whatever (see the sketch below). Supporting both directions requires something significantly more elaborate. I haven't decided whether it's worth doing right now, though now would be the easiest time to do it.
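Here's a toy version of that backward-context state machine, to show why that direction is the easy one (hypothetical code, not what's in the tree):

```
#include <string>

// Collapse a space that has a space before it: the decision depends only
// on what we've already seen, so one forward pass with a bit of state works.
std::string CollapseWithBackwardContext(const std::string& aText)
{
  std::string result;
  bool prevWasSpace = false;
  for (char ch : aText) {
    bool isSpace = (ch == ' ');
    if (isSpace && prevWasSpace) {
      continue; // backward context alone justifies dropping it
    }
    // white-space:pre-line would want to drop a space *followed by* a
    // newline, but we haven't seen the newline yet at this point, so a
    // single pass like this can't decide; that's the missing forward context.
    result += ch;
    prevWasSpace = isSpace;
  }
  return result;
}
```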
I still need to run performance tests on all three platforms to see what the effect is. All the pieces are now in place ... in particular, the new textrun cache has landed. Even with the old text frame, that cache sped up Tinderbox pageload performance by 3-10%. Some of that win came from eliminating the old textrun cache's many PR_IntervalNow calls: get-current-time calls are particularly slow in VMs because they trap to the virtual-machine monitor. But there was real speedup beyond that too.
Vlad and Pav are arriving tomorrow to spend a couple of weeks here. This'll be exciting... First we're going to have to get some more furniture :-). I hope the weather's good! You never know in Auckland :-).
Hiring's been disappointing lately. If anyone reading this knows anyone in NZ who'd be good to work on Firefox, let me know!!!
I switched from Parallels to VMWare Fusion beta 3 because VMWare's support for Linux is so much better. Fusion's working great so far ... except that Windows builds are even slower than they were in Parallels, which was already slow. I just discovered that our "mkdepend.exe" tool, which crawls C/C++ code on Windows to find #include dependencies, is incredibly slow and actually eats most of the build time. It looks like it's doing tons of stat() calls that are really slow, perhaps because they often hit VMWare's virtual filesystem. I plan to take a quick look to see if there's some easy optimization that would help.
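If the stat() theory is right, the obvious cheap fix is a cache. A hypothetical sketch of the idea (HeaderExistsCached is my invention, not mkdepend's actual code; on Windows it would use _stat, but it's the same idea):

```
#include <map>
#include <string>
#include <sys/stat.h>

// A dependency scanner probes for the same headers over and over across
// the include path, so memoizing the stat() results means each candidate
// path costs at most one trip to the (slow, virtualized) filesystem per run.
bool HeaderExistsCached(const std::string& aPath)
{
  static std::map<std::string, bool> sCache;
  auto it = sCache.find(aPath);
  if (it != sCache.end()) {
    return it->second; // answered from memory
  }
  struct stat sb;
  bool exists = (stat(aPath.c_str(), &sb) == 0);
  sCache[aPath] = exists;
  return exists;
}
```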
On the Chronicle front, there's a steady stream of people downloading the tarball. It's up to 161 downloads; not bad for something that doesn't really do anything useful. In snatches of time at airports and at home, I've gradually been putting an Eclipse UI together; it's actually pretty easy. Writing Java code in Eclipse again is fun.
People were quite interested in Chronicle when I talked about it at the OSQ retreat and, before that, with the people gathered for the OOPSLA committee meeting. At the OSQ retreat I was amused to find myself an "industry representative" --- in fact, the primary "open source" representative. It is nominally an "Open Source Quality" meeting, after all :-). I pushed the message that although all the research on bug finding is great, the truth is that in our domain we have already found plenty of bugs, and our biggest problem is fixing them. I suggested that we therefore need more research on topics such as debugging and performance analysis. I also suggested that the worst code is not necessarily buggy code, but code that is unnecessarily complex. Detecting that would be an interesting new direction for program analysis.
There was a great deal of discussion of parallel programming, now that we have entered the multicore world. More than one person opined that multicore is a mistake we will live to regret --- "we don't know what the right way is to use the transistors, but multicore is definitely wrong". There was general agreement that the state of parallel programming models, languages and tools remains pathetic for general-purpose single-user programs, and that no breakthrough should be expected. My position is that for regular desktop software to scale to 32 cores by 2011 (as roadmaps predict), we'd have to rewrite everything above the kernel, starting today, using some parallel programming model that doesn't suck. Since that model doesn't exist, it's already too late. Probably we will scale out to a handful of cores, with some opportunistic task or data parallelism, and then hit Amdahl's law, hard. It is therefore probably more fruitful to focus on new kinds of applications that we think we have reasonable ideas for parallelizing. I think virtual worlds (which are not just "games", people!) are a prime candidate. That's a direction I think the PL/software-engineering research community should be pushing in.
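To put rough numbers on the Amdahl's law point: speedup on n cores is 1 / ((1 - p) + p/n), where p is the fraction of the work you actually parallelized. Even if you get 90% of a desktop app running in parallel (wildly optimistic for code that wasn't designed for it), 32 cores deliver only 1 / (0.1 + 0.9/32) ≈ 7.8x, and 8 cores already give you about 4.7x. The serial fraction swamps everything almost immediately.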