Eyes Above The Waves

Robert O'Callahan. Christian. Repatriate Kiwi. Hacker.

Wednesday 14 February 2007

Superlaser Targeting Cupertino

I think I found a nasty bug in ATSUI.

It seems to have a problem with bidi overrides and trailing whitespace. If I give it the string RLO space space PDF (namely, two spaces inside a "right to left override" Unicode control), ATSUI produces glyphs with the first space to the left of the second space. This is wrong.

The general problem seems to be that bidi overrides are ignored for any trailing whitespace in the layout. You can even see the problem with a left-to-right override, for example when the whitespace follows a Hebrew character.

So far I haven't found any instances of this bug affecting non-whitespace characters, but it's possible I just can't see such effects, since I can't read any RTL languages.

I'm able to work around this problem by detecting when glyphs aren't in the order I expect and processing them in the correct order, but it seems like a pretty major bug. Unfortunately there doesn't seem to be much information on the Web about using ATSUI with bidi overrides. Most of the hits for my searches are actually WebKit checkins. (WebKit seems to use the same trick I'm using with ATSUI and Pango to force all characters to follow a certain direction: insert a leading RLO/LRO character and a trailing PDF character into the text before handing it to the text engine.)
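For anyone who wants to see the trick and the workaround spelled out, here is a minimal sketch in Python. It is not the actual Gecko or WebKit code. The function names, and the idea of representing a glyph run as a left-to-right list of source character offsets, are assumptions made purely for illustration.

```python
RLO = "\u202e"  # RIGHT-TO-LEFT OVERRIDE
LRO = "\u202d"  # LEFT-TO-RIGHT OVERRIDE
PDF = "\u202c"  # POP DIRECTIONAL FORMATTING


def force_direction(text, rtl):
    """Wrap text in an override so the text engine lays it out in one direction."""
    return (RLO if rtl else LRO) + text + PDF


def in_expected_visual_order(glyph_char_offsets, rtl):
    """Check a glyph run against the order the override should have produced.

    glyph_char_offsets is a hypothetical representation: the source character
    offset of each glyph, listed from left to right as the engine returned them.
    """
    return list(glyph_char_offsets) == sorted(glyph_char_offsets, reverse=rtl)


def reorder_to_expected(glyph_char_offsets, rtl):
    """Workaround: process the glyphs in the order we wanted in the first place."""
    return sorted(glyph_char_offsets, reverse=rtl)
```

With the two-space test string, `force_direction("  ", rtl=True)` yields RLO space space PDF. A run that comes back as `[0, 1]` fails the check for RTL, and `reorder_to_expected` turns it into `[1, 0]`.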


Ludovic Hirlimann
Did you log that issue with Apple?
I'm not convinced this is a bug, since rule X9 of the Bidi Algorithm will remove the RLO and PDF characters from the stream and rule L1 will set the level of any trailing whitespace to the paragraph level. This would explain why trailing whitespace on a line of Hebrew characters ends up RTL, anyway.
I think Ned is right about trailing whitespace. Rule L1... so that the whitespace isn't trailing.
Robert O'Callahan
ATSUI seems correct --- it's the Unicode bidi algorithm that's making trouble for me! I've just submitted a fix in bug 375662 that works around the issue in a cleaner way.
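To make the two rules mentioned in the comments concrete, here is a rough Python sketch of how they interact. It is a toy model under stated assumptions, not a bidi implementation: the embedding levels are supplied by hand, only the "whitespace at the end of the line" part of rule L1 is modelled, and the function name is made up for this example. The only library call is the standard `unicodedata.bidirectional`.

```python
import unicodedata

# Bidi classes of the explicit formatting characters that rule X9 removes.
REMOVED_BY_X9 = {"RLE", "LRE", "RLO", "LRO", "PDF", "BN"}


def apply_x9_and_l1(chars, levels, paragraph_level):
    """Return (char, level) pairs after a simplified X9 and L1.

    chars and levels are parallel sequences; levels holds the embedding level
    each character was given by the explicit overrides (assumed as input).
    """
    # X9: drop the override and pop characters from the stream.
    kept = [
        (c, lvl)
        for c, lvl in zip(chars, levels)
        if unicodedata.bidirectional(c) not in REMOVED_BY_X9
    ]
    # L1 (end-of-line part only): trailing whitespace falls back to the
    # paragraph level, regardless of the override it was inside.
    i = len(kept) - 1
    while i >= 0 and unicodedata.bidirectional(kept[i][0]) == "WS":
        kept[i] = (kept[i][0], paragraph_level)
        i -= 1
    return kept
```

For RLO space space PDF in a left-to-right paragraph, both spaces start at level 1 and end up at level 0, so the first space is laid out to the left of the second, which matches what ATSUI produced.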