Sunday, 25 March 2007

Text Text Text

I'm well aware I haven't blogged for a while. Things have been extremely busy. After I checked in code to implement the new textrun abstraction on Mac and Windows, there were some performance and correctness regressions. Bad me! I've fixed the Mac performance regressions and we have a fix in hand for the Windows regressions ... thanks to some heroic wrestling of VTune by Chris Double. Another thing I broke was the display of "missing-glyph boxes" when the user has no font with glyphs for a given character. To fix that I've done a cross-platform implementation of Pango-style "hexboxes": when glyphs are missing, we draw a box and inscribe inside it, in tiny hex digits, the code of the character we're missing glyphs for. It should be very useful for debugging complex scripts (and for users who've memorized the Unicode character table).
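
To make the hexbox idea concrete, here is a minimal Python sketch of how the digits for one box could be chosen. It is not the actual gfx code: the function name `hexbox_rows` and the layout (two rows of two digits for characters up to U+FFFF, two rows of three digits otherwise) are assumptions made for illustration.

```python
def hexbox_rows(code_point: int) -> list[str]:
    """Return the two rows of hex digits to inscribe in a missing-glyph box.

    Hypothetical helper: the 2x2 / 2x3 digit layout is an assumption.
    """
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("not a Unicode code point")
    # Characters up to U+FFFF fit in 4 hex digits; the rest are padded to 6.
    width = 4 if code_point <= 0xFFFF else 6
    digits = f"{code_point:0{width}X}"
    half = width // 2
    # Top row holds the first half of the digits, bottom row the second half.
    return [digits[:half], digits[half:]]
```

For example, `hexbox_rows(ord("א"))` gives the rows `05` and `D0` for the Hebrew letter alef (U+05D0).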

With those fires waning I've spent time working on the new textframe code. The gfx side is working pretty well for now and apart from the outstanding regression fixes, what's on trunk is adequate to support the new textframe. First I polished off a number of bugs triggered by our pageload performance test suite, so I could get some performance data. Then today I fixed a bunch of bugs triggered by our "reftest" layout test suite. All these tests work correctly now as far as I know. All these new-textframe fixes have been landed on trunk so it is now possible to build and test the new-textframe code and expect to be able to actually browse. This just requires you to set MOZ_NEW_TEXTFRAME=1 in layout/generic/Makefile.in. There are still lots of bugs, I'm sure; I know there are serious bugs in caret movement. There are also areas I've barely tested yet such as complex script handling and preformatted tabs.

Performance of new-textframe is interesting. Right now it's a slight slowdown in my Mac tests. However almost all the text-related performance cost is in textrun construction, and I've looked carefully at the textruns that are being created in our test suite. The current trunk code usually creates one textrun per word and caches them; the new textframe creates one textrun per run of text with a uniform font, but doesn't use a cache. The current code creates 28K textruns while the new code creates 21K textruns, but the textruns in the new code are much longer on average. The good news is that 8K of those 21K consist of a single space! Optimizing that will be easy. Furthermore if we cache textruns for use with the new text frame (which can be done quite efficiently), then we should get a 30% hit rate which will further reduce the textrun construction cost. So I think we can quite easily turn new-textframe into a significant net performance win.
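
The caching idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the trunk implementation: `TextRun`, `TextRunCache` and the `(text, font)` cache key are hypothetical names, and constructing a `TextRun` stands in for the expensive shaping work.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class TextRun:
    """Stand-in for a shaped run of text in a single font (hypothetical)."""
    text: str
    font: str


@dataclass
class TextRunCache:
    """Cache of textruns keyed by (text, font); the key choice is an assumption."""
    runs: dict = field(default_factory=dict)
    hits: int = 0
    misses: int = 0

    def get(self, text: str, font: str) -> TextRun:
        key = (text, font)
        run = self.runs.get(key)
        if run is not None:
            self.hits += 1
            return run
        # Cache miss: this is where the costly textrun construction would happen.
        self.misses += 1
        run = TextRun(text, font)
        self.runs[key] = run
        return run

    def hit_rate(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```

With a cache like this, the single-space textruns need no special treatment: the first request for `" "` in a given font is a miss and every later one is a hit.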

Tim Rowley is working on integrating textruns into SVG text. He already has prototype code that works reasonably well. Transitioning SVG from the cairo toy text API to gfxTextRun has immediate benefits --- in particular gfxTextRuns implement font substitution, so non-complex international scripts (e.g., Chinese) will be displayed properly as we fall back to Chinese fonts when the default font does not contain glyphs for Chinese characters. It will take more work to support complex scripts with clusters, though.
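
As a rough illustration of font substitution for non-complex scripts, the sketch below picks, for each character, the first font in a preference list that has a glyph for it, and groups neighbouring characters that ended up in the same font. The function name and the representation of a font as a name plus a set of covered characters are assumptions; gfxTextRun's real fallback logic is more involved.

```python
def split_into_font_runs(text, fonts):
    """Split text into (font_name, substring) runs using per-character fallback.

    `fonts` is a list of (name, covered_characters) pairs in preference order.
    A font name of None means no font covers the character, which is where a
    missing-glyph box would be drawn.
    """
    runs = []
    for ch in text:
        # First font in preference order that has a glyph for this character.
        font = next((name for name, coverage in fonts if ch in coverage), None)
        if runs and runs[-1][0] == font:
            # Same font as the previous character: extend the current run.
            runs[-1] = (font, runs[-1][1] + ch)
        else:
            runs.append((font, ch))
    return runs
```

Given a default font covering only Latin letters and a second font covering Chinese characters, mixed text comes back as alternating runs, each tagged with the font that can actually draw it.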

One problem we've noticed is that SVG text requires bidi reordering. This means that when you have mixed-direction text (e.g. Hebrew and English in the same line) it's possible for SVG <tspan> and other elements to require "splitting" because a single <tspan> may be rendered as multiple discontiguous pieces. This is really tricky stuff. We've already implemented it for HTML+CSS and I'm not looking forward to implementing it again. According to SVG people it's "OK" for SVG to duplicate this functionality because it's done in a "compatible way" so in theory we can share code. But in practice we already have our bidi code tied into the HTML/XML/CSS code and extracting it for reuse with SVG would not be at all easy, especially given that we must avoid slowing down the existing code paths. There is another way that might work, though: have SVG create hidden HTML-style layouts of its text content, applying the HTML text layout engine including bidi processing. Then SVG could take apart the resulting HTML layout and massage it for display SVG-fashion, even including drawing text along a path. Not easy, but perhaps the best way forward. We'll see.
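
To see why a single <tspan> can end up in several pieces, here is a small Python sketch of the reordering step only (rule L2 of the Unicode Bidirectional Algorithm: working from the highest embedding level down to the lowest odd level, reverse every contiguous sequence of characters at that level or higher). The embedding levels are taken as given input, which is an assumption; computing them is the hard part and is not shown. The function names are hypothetical.

```python
def visual_order(levels):
    """Return logical character indices in left-to-right visual order."""
    order = list(range(len(levels)))
    if not levels:
        return order
    highest = max(levels)
    lowest_odd = min((lv for lv in levels if lv % 2), default=highest + 1)
    for level in range(highest, lowest_odd - 1, -1):
        i = 0
        while i < len(order):
            if levels[order[i]] >= level:
                # Find the end of this sequence at `level` or higher, then reverse it.
                j = i
                while j < len(order) and levels[order[j]] >= level:
                    j += 1
                order[i:j] = order[i:j][::-1]
                i = j
            else:
                i += 1
    return order


def visual_pieces(order, start, end):
    """Visual [begin, end) ranges covered by the logical span [start, end)."""
    pieces = []
    for pos, idx in enumerate(order):
        if start <= idx < end:
            if pieces and pieces[-1][1] == pos:
                pieces[-1] = (pieces[-1][0], pos + 1)  # extend the current piece
            else:
                pieces.append((pos, pos + 1))
    return pieces
```

Take six characters with levels `[0, 0, 1, 1, 1, 0]` (two English letters, three Hebrew letters, one more English letter). A span covering logical characters 1 and 2, that is, the last English letter before the Hebrew and the first Hebrew letter, lands at visual positions 1 and 4, so it has to be drawn as two separate pieces.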

There are other exciting things going on. Offline apps are moving ahead; we've landed significant offline app support on the trunk so a number of interesting demos are possible with Firefox nightly builds. I think other people are blogging about that. There's good discussion going on regarding a first-class <video> element for HTML, and we are preparing to implement it. I spent quite a bit of time reviewing some cool new drag-and-drop code --- download a Firefox nightly build and try selecting content and then dragging it. And a new contractor started work in our Auckland office on Monday.

Looking forward, OOPSLA submissions closed on Monday so I'll have to spend 2-3 of the next six weeks doing OOPSLA reviews, which will be challenging but fun. There are a lot of really interesting-looking submissions. I'm also working on preparing the Amber debugger code for release. Unfortunately I have to rename everything because there's some BIOS debugging tool already called Amber ... I've chosen the name "Continuum" instead. And in just over two weeks I'll be in Mountain View for nearly a week, for a Mozilla Corporation all-hands meeting. When I get the chance I'll be driving the new textframe forward and fixing textrun-related regressions. It's all on!

Update: I'm thinking maybe "Chronicle" for the debugger instead. If anyone knows of a project that's using that name, let me know! There is some blog-related project with that name, but it looks pretty dead (and pretty far away in subject, anyway).



10 comments:

  1. "Continuum" is also the name of a continuous integration server by the Apache project. I don't know whether that will cause problems, but I guess it's popular enough that you want to avoid using the same name.

  2. Robert O'Callahan, 26 March 2007 at 03:33

    I'm aware of it, but there are so many projects out there, I'm not confident I can find a good name that doesn't clash with *some* project. As long as it's not in the same area...

  3. Nice info, roc.
    On the trunk nightlies, I don't see MOZ_NEW_TEXTFRAME on in about:buildconfig, but I see bug 372629 ("missing-glyph boxes").

  4. "download a Firefox nightly build and try selecting content and then dragging it" -- doesn't make a difference on Windows, does it?

  5. are there any nightlies with MOZ_NEW_TEXTFRAME=1?
    thanks

  6. Robert O'Callahan, 27 March 2007 at 01:50

    Nightlies are not being built with MOZ_NEW_TEXTFRAME. You'll need to build your own.
    Dao: not sure what you mean. The new drag code should be working on Windows.

  7. Well, according to the bug ( https://bugzilla.mozilla.org/show_bug.cgi?id=178513 ) it was fixed for Mac. But you probably would know better, since you reviewed the patch in question.
    The bug that handles this for Windows, https://bugzilla.mozilla.org/show_bug.cgi?id=374593 , isn't fixed yet.

  8. Robert O'Callahan, 27 March 2007 at 03:37

    Oh, you're absolutely right of course. Sorry, it's fixed on Mac and Linux but not yet on Windows.

  9. Pardon my naivete but what was the core motivation behind the TextFrame changes?

  10. Hi,
    Any news about continuum/chronicle? It sounds like a handy tool...
