Friday 10 February 2006
The Joy Of Text
nsTextFrame is the core of our text layout and rendering. To the rest of the layout engine, it exposes an interface pretty much like that of any other inline frame. It consumes fairly simple platform-specific APIs for measuring and drawing strings of single-style text. It implements a whole lot of behaviour:
- Regular text measurement, layout, linebreaking and painting
- XML/HTML whitespace compression (for CSS "white-space" values other than "pre")
- Text selection painting
- IME-specific selection/conversion painting
- CSS "word-spacing" and "letter-spacing"
- CSS "text-decoration" in "quirks mode" (including "text-decoration:blink")
- CSS "text-transform"
- CSS "font-variant: small-caps"
- CSS "text-align: justify" (in conjunction with control logic in nsLineLayout)
All this is really complex for a few reasons. For starters, the functionality itself is complex, especially when you take into account weird languages and scripts: e.g., the German "ß" whose uppercase form is two characters, "SS", so uppercasing a string can actually change its length; various RTL issues; UTF-16 surrogate pairs; and glyph clustering (when multiple Unicode code points combine to form an atomic cluster of glyphs). Furthermore, we support a mix of platforms with different underlying capabilities, so some nsTextFrame code is only used on platforms that don't have a required capability. And of course text layout is a core part of Web rendering, and hence nsTextFrame is performance critical, so we have convoluted code to avoid copying text and to speed things up in other ways.
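Three of these pitfalls can be seen in a few lines of JavaScript. This is only a sketch, and it assumes a modern engine that applies Unicode special casing in `toUpperCase()`; engines contemporary with this post did not necessarily do so.

```javascript
// Uppercasing can change the number of UTF-16 code units.
const lower = "stra\u00dfe";            // "straße"
const upper = lower.toUpperCase();      // "ß" becomes "SS" under Unicode special casing
const lengthChanged = upper.length !== lower.length;

// A character outside the Basic Multilingual Plane takes two UTF-16
// code units (a surrogate pair) but is a single code point.
const clef = "\u{1D11E}";               // MUSICAL SYMBOL G CLEF
const codeUnits = clef.length;          // counts UTF-16 code units
const codePoints = [...clef].length;    // string iteration goes by code point

// A base letter plus a combining mark is two code points that a
// shaper would normally draw as one cluster.
const eAcute = "e\u0301";
const clusterCodePoints = [...eAcute].length;

console.log({ upper, lengthChanged, codeUnits, codePoints, clusterCodePoints });
```

Any code that stores offsets into the text has to decide which of these units it is counting, and has to cope with the counts changing after a transform.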
Unsurprisingly nsTextFrame has become very messy. Apart from the usual difficulties this causes us with bugfixing and performance, we also need to extend it to support text-overflow, hyphenation and other features. Furthermore, over the last several years the multilingual, multiscript text layout capabilities of our platforms have improved enormously, with the introduction and widespread availability of Uniscribe, Pango, and ATSUI. Now with the move to cairo-based rendering, it is time for a wholesale redesign of nsTextFrame.
The primary goal is to separate text handling into two abstractions, nsTextFrame and gfxTextRun. A gfxTextRun is a run of single-direction, single-style, single-language text. gfxTextRun is responsible for converting the text into a sequence of glyph clusters and rendering them. The implementation will be platform-specific and rely on platform APIs as much as possible. A new nsTextFrame implementation is being written that implements all the functionality of text frames on top of gfxTextRun. Based on our experiences with the current nsTextFrame, I have designed a gfxTextRun interface which is fairly clean and around which we can hopefully build a much leaner, cleaner implementation of text frames. The actual implementation will build the real interface and implementation incrementally, and no doubt diverge a bit from the proposal, but I believe it's a good idea to think far enough ahead to have confidence we're heading to a coherent end point.
Currently Red Hat ships Firefox builds configured to use Pango underneath the existing nsTextFrame. These builds have notoriously poor performance. In SUSE we're going to enable these builds only in certain Indic locales likely to use scripts that the non-Pango builds are completely incapable of. In the new world I hope we can achieve much better performance, possibly even better performance than non-Pango builds today, because we'll be able to construct a gfxTextRun for an entire single-style text paragraph, eating the cost of glyph conversion and placement just once and then sharing it among all the text frames for that paragraph.
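To make the "measure once, share among frames" idea concrete, here is a minimal JavaScript sketch. The class and method names (`SharedTextRun`, `FrameSlice`, `advanceWidth`) are hypothetical and are not the proposed gfxTextRun interface, and the per-character measuring callback is a stand-in for real shaping, which is not simply additive per character.

```javascript
// Hypothetical sketch; names are illustrative, not the gfxTextRun API.
class SharedTextRun {
  constructor(text, measureChar) {
    this.text = text;
    // Pay the measurement cost once for the whole paragraph.
    // prefix[i] is the total advance of the first i UTF-16 code units.
    this.prefix = new Float64Array(text.length + 1);
    for (let i = 0; i < text.length; i++) {
      this.prefix[i + 1] = this.prefix[i] + measureChar(text[i]);
    }
  }

  // Width of the half-open range [start, end), answered without re-measuring.
  advanceWidth(start, end) {
    return this.prefix[end] - this.prefix[start];
  }
}

// A frame only records which slice of the shared run it covers.
class FrameSlice {
  constructor(run, start, end) {
    this.run = run;
    this.start = start;
    this.end = end;
  }

  get width() {
    return this.run.advanceWidth(this.start, this.end);
  }
}

// Stand-in measurer (an assumption): spaces are 4 units wide, all else 8.
let measureCalls = 0;
const fakeMeasure = (ch) => {
  measureCalls++;
  return ch === " " ? 4 : 8;
};

const run = new SharedTextRun("hello world", fakeMeasure);
const first = new FrameSlice(run, 0, 6);   // "hello "
const second = new FrameSlice(run, 6, 11); // "world"
console.log(first.width, second.width, measureCalls);
```

Re-breaking the paragraph only changes the slice boundaries; the measured data in the run stays as it is.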
One interesting detail is that I think we can move text-transform and small-caps out of text frames into specialized gfxTextRun implementations, treating them as nothing more than an extra processing pass during text-to-glyph conversion. This will help simplify nsTextFrame even more. Unfortunately, nsTextFrame will still have to do whitespace compression, as this depends on information from surrounding frames.
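One way such a processing pass might look, sketched in JavaScript: uppercase the text, but keep a map from each transformed code unit back to the source offset it came from, so that selection and line breaking can still talk in terms of the original text. The function name `uppercaseWithOffsets` is hypothetical, and the sketch assumes an engine whose `toUpperCase()` applies Unicode special casing.

```javascript
// Hypothetical sketch of an uppercase pass that remembers source offsets.
function uppercaseWithOffsets(source) {
  let transformed = "";
  const sourceOffsets = []; // sourceOffsets[i] = source offset for transformed[i]
  let offset = 0;
  for (const ch of source) {       // iterate by code point, not code unit
    const up = ch.toUpperCase();   // may be longer than ch (e.g. "ß" -> "SS")
    transformed += up;
    for (let i = 0; i < up.length; i++) {
      sourceOffsets.push(offset);
    }
    offset += ch.length;           // advance by the source's UTF-16 code units
  }
  return { transformed, sourceOffsets };
}

const result = uppercaseWithOffsets("gro\u00df!"); // "groß!"
console.log(result.transformed, result.sourceOffsets);
```

Both "S" code units map back to the single source "ß", which is exactly the bookkeeping the text frame would otherwise have to do itself.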
One implication of this plan is that we will not be fixing many bugs in the existing trunk nsTextFrame, unless they are candidates for landing on a FF1.5 or FF2 branch.
Comments
Maybe even as a follow up to that post in .i18n.
1. What perf impact would there be on Windows? or Mac OS X? Would there be any gain, or is this mainly for the benefit of Pango based builds? Does this improve Cairo builds performance as well?
2. Are there any major bugs in the current implementation that this would resolve? Or is the big benefit that the code will be easier to manage in the future?
3. Is there any printing impact?
> plays into this?
I believe Thai line-breaking is tough, and we'll still be doing line-breaking in our frame code, so that will still be our problem. I believe Thai also requires special glyph positioning, shaping and/or clustering, and that will be handled by the underlying platform (Pango, Uniscribe, ATSUI).
> So is the implication for gfxTextRun in the Mac
> world that ATSUI will get to handle
> paragraph-sized chunks of text?
Yes and no. We will only be passing single-style chunks of text to ATSUI, so if you use inside your paragraph, ATSUI will only see one piece of the paragraph at a time. Also, we won't be using ATSUI for line breaking; ATSUI will just lay the text out as a really long line that we'll then chop up.
> 1. What perf impact would there be on Windows?
> or Mac OS X? Would there be any gain, or is
> this mainly for the benefit of Pango based
> builds? Does this improve Cairo builds
> performance as well?
It's dangerous to speculate in advance of the facts. Performance is certainly a paramount concern. I don't know how well Uniscribe or ATSUI perform. If they suck, then we can always implement faster paths in the platform gfxTextRun implementation (e.g., do something simple and fast for ASCII). One thing I perhaps didn't make clear is that this is all for cairo builds *only*.
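A fast path of the kind mentioned here could be as simple as the following JavaScript sketch. Everything in it is an assumption for illustration: `makeTextRun` and the two shaper callbacks are hypothetical stand-ins, and a real fast path would still have to care about kerning and ligatures in ASCII text.

```javascript
// Hypothetical dispatcher; both shaper callbacks are stand-ins, not real APIs.
function isAscii(text) {
  for (let i = 0; i < text.length; i++) {
    if (text.charCodeAt(i) > 0x7f) return false;
  }
  return true;
}

function makeTextRun(text, fastAsciiPath, platformPath) {
  // Only fall back to the (possibly slow) platform shaper when needed.
  return isAscii(text) ? fastAsciiPath(text) : platformPath(text);
}

const fast = (t) => ({ path: "fast", length: t.length });
const slow = (t) => ({ path: "platform", length: t.length });

console.log(makeTextRun("plain text", fast, slow).path);
console.log(makeTextRun("\u0e44\u0e17\u0e22", fast, slow).path); // Thai text
```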
> 2. Are there any major bugs in the current
> implementation that this would resolve? Or is
> the big benefit that the code will be easier to
> manage in the future?
Our current code completely sucks for many Indic languages, and this will fix that (as long as the platform handles such languages well; Pango does). It will probably fix a lot of other I18N bugs where the underlying platform does better than us; no doubt there will be some unfixable regressions where the underlying platform does worse (switch platforms!).
It will also make it much easier to fix "bugs" like "Gecko doesn't handle HTML soft hyphens".
> 3. Is there any printing impact?
Not particularly, unless the underlying platform APIs have problems with printer font metrics. One would hope they don't.
1. How does gfxTextRun relate to, e.g., CSS "writing-mode:tb"?
2. Do layout changes (e.g., on window resize) lead to a complete recreation of the gfxTextRun objects in a nsTextFrame?
BTW, if there's any documentation on nsTextFrame as it is or any documentation-to-emerge on how it will be, I'd be grateful for a pointer to it.
Shouldn't "ß".toUpperCase(); return "SS" then?
> "writing-mode:tb"?
It can be used by "writing-mode:tb", but writing-mode:tb is not part of this plan.
> 2. Do layout changes (e.g., on window resize)
> lead to a complete recreation of the gfxTextRun
> objects in a nsTextFrame?
No, that won't change any gfxTextRuns.
> BTW, if there's any documentation on nsTextFrame
> as it is or any documentation-to-emerge on how
> it will be, I'd be grateful for a pointer to it.
There isn't much of either.
> Shouldn't "ß".toUpperCase(); return "SS" then?
You mean in Java? I suppose it should.
> You mean in Java? I suppose it should.
That's what http://www.fileformat.info/info/unicode/char/00df/ says for Java, but FYI, in JavaScript (Tools > JavaScript Console), paste in
alert("bßß".toUpperCase())
You get "Bßß": same ß letter, same length.
> I believe Thai also requires special glyph
> positioning, shaping and/or clustering, and that
> will be handled by the underlying platform
>(Pango, Uniscribe, ATSUI).
> We will only passing single-style chunks of text
> to ATSUI, so if you use inside your paragraph,
> ATSUI will only see one piece of the paragraph
> at a time. Also, we won't be using ATSUI for
> line breaking, ATSUI will just be laying it out
> as a really long line that we'll then chop up.
> It will also make it much easier to fix "bugs"
> like "Gecko doesn't handle HTML soft hyphens".
I conclude that you expect to be able to do the line chopping and therefore hyphens on glyphs, not characters.
You might run into problems and get something not so easy to make work with complex scripts.
But then after all, I'm not sure there is any easy alternative. I read that OpenType handles glyphs and characters in parallel and knows the correspondence, but I don't know if that can be achieved with Uniscribe/ATSUI/Pango.
I cross-posted references to this on news://mozilla.dev.i18n
> line chopping and therefore hyphens on glyphs, not
> characters.
Line break positions will be selected using our existing code, which works on characters (well, UTF-16...)
Hyphenation will only work where soft hyphens have already been inserted by the author, or possibly where we have hyphenation dictionaries, so at least initially it will only work automatically on simple languages.
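For the author-inserted case, a rough JavaScript sketch of collecting break opportunities from soft hyphens (U+00AD) might look like this. The function name `softHyphenBreaks` is hypothetical, and the offsets it returns are positions in the text with the soft hyphens stripped out.

```javascript
const SOFT_HYPHEN = "\u00AD";

// Returns the text without soft hyphens, plus the offsets in that text
// where a line may be broken (with a visible hyphen drawn at the break).
function softHyphenBreaks(text) {
  let visible = "";
  const breaks = [];
  for (const ch of text) {
    if (ch === SOFT_HYPHEN) {
      breaks.push(visible.length); // break allowed before the next character
    } else {
      visible += ch;
    }
  }
  return { visible, breaks };
}

const hyphenated = softHyphenBreaks("ef\u00ADfec\u00ADti\u00ADve\u00ADment");
console.log(hyphenated.visible, hyphenated.breaks);
```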
You say that UPA (Uniscribe, Pango, ATSUI) will first be responsible for things like glyph positioning, shaping and/or clustering,
and will produce long, unbroken lines of text as a result.
Then the existing code will select line break positions.
I don't see how UPA can do that and output characters rather than glyphs.
Do you intend to insert possible line breaks in the string beforehand?
But that would certainly break the UPA clustering
code.
I just spent some time checking how this is done in Uniscribe.
The conclusion is that UPA will certainly give out both the glyphs and the info necessary for line breaking, and it's not necessarily useful to continue using the old code for that.
Based on the info on those pages :
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/intl/uniscrib_9t2d.asp
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/intl/uniscrib_7yhx.asp
(especially the description of cluster array/character array/glyph array in remarks of ScriptShape)
> string beforehand ?
> But that would certainly break the UPA clustering
> code.
Only if we decide to break between characters that UPA wants to form a glyph cluster. That would be a bug in our line breaking code.
It might make sense to eventually get line-breaking information from UPA. However, ultimate control has to stay with nsTextFrame so we can do hyphenation, CSS 'white-space', and other effects. Given that, the simplest thing to do for now, while we get everything else working, is to keep on getting line-break information the way we currently do.
Do you have any plan to revise the interface/implementation?
It's hard to implement the interface efficiently in UPA, especially BreakInBetween(), which receives two unconnected strings while UPA breakers accept only one string. You have to copy the two strings to a newly allocated buffer every time the function is called.
> that UPA wants to form a glyph cluster. That
> would be a bug in our line breaking code.
It's a bad world. The hyphenation points for the French "effectivement" are "ef-fec-ti-ve-ment", but still 'LATIN SMALL LIGATURE FF' (U+FB00) applies to this word.
But the correct method now seems clear to me.
Both Uniscribe and Pango have a word-breaking function that works on characters:
pango_break and ScriptBreak.
And both have a method to link each glyph to the corresponding initial character(s):
cluster array of ScriptShape for Uniscribe, and gint *log_clusters inside PangoGlyphString as output of pango_shape.
So you generate and measure the glyphs, see before which glyph you need to break, and then get back to the character that corresponds, and finally choose the appropriate break point using a character based breaking function.
You also reshape the string after breaking to handle cases like 'ef-fectivement' where the break can occur in the middle of what used to be a glyph.
pango_break and ScriptBreak are only informative, they don't do the break, so they can be used as a direct replacement for what exists in the current line-break algorithm.
And that way, they would bring with them Thai word breaking without any development.
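The procedure described above can be sketched in JavaScript roughly as follows. This is not the Pango or Uniscribe API: `chooseBreak`, the `glyphToChar` array (playing the role of the cluster arrays mentioned above) and the `canBreakBefore` array (playing the role of a character-based break function's output) are all hypothetical, and the reshaping step after the break is left out.

```javascript
// Hypothetical sketch: measure glyphs, find the overflowing glyph, map it
// back to a character offset, then pick a character-based break point.
function chooseBreak(glyphAdvances, glyphToChar, canBreakBefore, maxWidth) {
  // 1. Find the first glyph that no longer fits on the line.
  let width = 0;
  let overflowGlyph = -1;
  for (let g = 0; g < glyphAdvances.length; g++) {
    width += glyphAdvances[g];
    if (width > maxWidth) {
      overflowGlyph = g;
      break;
    }
  }
  if (overflowGlyph === -1) return null; // everything fits, no break needed

  // 2. Map that glyph back to the character where its cluster starts.
  const overflowChar = glyphToChar[overflowGlyph];

  // 3. Walk back to the nearest allowed break at or before that character.
  for (let c = overflowChar; c > 0; c--) {
    if (canBreakBefore[c]) return c;
  }
  return null; // no legal break opportunity; the caller must decide what to do
}

// "office ok" where "ffi" is shaped as one ligature glyph (made-up advances).
const text = "office ok";
const glyphAdvances = [10, 18, 10, 10, 5, 10, 10]; // o, ffi, c, e, space, o, k
const glyphToChar = [0, 1, 4, 5, 6, 7, 8];
const canBreakBefore = Array.from(text, (_, i) => i === 7); // only before "ok"

const breakAt = chooseBreak(glyphAdvances, glyphToChar, canBreakBefore, 60);
console.log(breakAt === null ? [text] : [text.slice(0, breakAt), text.slice(breakAt)]);
```

Because the break decision is made on character offsets, the ligature glyph never needs to be split; it only matters for finding where the overflow happens.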
jmdesp:
> It's a bad world. The hyphens points for the
> french "effectivement" are "ef-fec-ti-ve-ment",
> but still 'LATIN SMALL LIGATURE FF' (U+FB00)
> applies to this word.
This isn't really a problem, since it's OK to break and not cluster. The situation we need to avoid is where we break between two characters that *must* cluster in order to get a correct rendering.
What you suggest sounds like a good direction to go, but we're not going there right away.
Glyph clustering for Arabic has progressed, but it cannot be mixed with formatting the way IE can, as shown here http://www.catch22.net/tuts/editor12.asp.
I played with this one for some tests that give really funny results, I hope to put them online soon :-)
https://bugzilla.mozilla.org/show_bug.cgi?id=331716
https://launchpad.net/distros/ubuntu/+source/firefox/+bug/37828
Do you suppose a quick fix could be released for the above problem, hopefully in time for the Ubuntu 6.06 release on 1 June, rather than having to wait for the grand rearchitecting? The DejaVu font project
http://dejavu.sourceforge.net/
has already taken the rash and regrettable step of removing the standard ligatures (i.e., making them “discretionary”), which is a pity seeing that it isn’t even their bug and that the majority of other fonts will actually also trigger it.
https://bugzilla.mozilla.org/show_bug.cgi?id=331716
But it’s not just a Pango problem either, is it? Earlier today, I was looking at
http://depts.washington.edu/ebmp/bibliography.php
in Firefox on Windows on one of the university’s public computers, and the Devanagari in the first entry was completely broken (no combining at all) when viewed with the default stylesheet (justified text), but rendered correctly when I switched off all stylesheets (left‐aligned text).
Anyway, I am very happy that cleaning up text rendering is now one of the top priorities of the Mozilla project, and wish you much success and speedy progress. Until then, I (and India etc.) will have to live with whatever functionality the bit of a hack from Red Hat provides. Switching off complex‐script rendering, however broken, is not really an option.
Thank you for your work on this!
https://launchpad.net/distros/ubuntu/+source/firefox/+bug/37828/