Thursday, 21 September 2006

Here Comes A New Challenger!

A friend pointed me at a very interesting article about the New York Times Reader, which is based on the Windows Presentation Foundation. You should also read The Times' FAQ. It fulfills my expectations that I blogged about before: a classy application that takes eyeballs away from the Web and back to a world fully under Microsoft's control.

Obviously we need to make it possible to do the same things in a Web page. As far as I can tell there are two sets of issues: offline access and layout. The rest of Reader's features could be implemented using existing Web technology, as an AJAX application (although XUL would help).

I believe the offline access issues are actually easier to solve; we're shipping part of a solution in Firefox 2 (DOM Storage). The other part of a solution is the ability to fetch and pin pages in the browser cache so sites can ensure they'll be available offline. I have a proposal for that. It needs some work but doesn't seem to be a huge task to specify and implement.

The other issues are layout and rendering. Clearly there's a big emphasis on columns; fortunately Firefox does columns. Unfortunately our columns don't make it easy to create the sort of page-based UI I see in the screenshots. I think a possible approach would be to use XUL box layout to create the header/body/footer vertical arrangement (via CSS, no XUL elements), and make the body overflow and scroll to the right. Then modify the Gecko event code so that the pageup, pagedown, space, and arrow keys do the right thing (namely, scroll left by a page, scroll right by a page, scroll right by a page, and scroll by columns). This would require just a few code tweaks and no new style properties.

The page navigation bar in the footer could be easily implemented using script, onresize, and onscroll handlers, but I don't know if it's really useful.

The hardest issue that I can see is in the Mississippi screenshot, where the image is on the right-hand side of the first page and spans columns. One way to achieve that here would be to allow column content to flow around floats outside the column flow. Then the article title and the image could be placed in left and right floats respectively and the column content flowing around them would produce the right effect. We could also add the ability for content in columns to flow around overflowing content from previous columns, which would let people put column-spanning objects in the column flow.

That still isn't enough to produce some common effects, such as figures at the top or bottom of a page. For those we will need CSS extensions. One possibility is to extend "float" with "column-top", "page-top", "column-bottom", and "page-bottom" values.

Another issue mentioned in the article is fonts and typography. I can't tell much from the screenshots but they don't appear to be using widow/orphan control, so it's probably about font rendering. We do desperately need a solution to font downloading. On the bright side, our current Gecko work will let us improve our story with kerning, ligatures and other font rendering issues.

Even if we can implement all these things, there's still the issue of how we get people to use them in their sites. Microsoft is very good at partnering with major companies to produce showcase applications like this NYT Reader. We need partnerships like that. You would think that certain major Web sites would be interested in pushing back...

Reading between the lines, though, there is a lot of hope for the Web. The application is a 20-minute download for XP users without WPF installed, and it apparently requires a trust decision from the user, so it's not for casual surfers. You will need to leave the Reader application to follow links to other sites or content that hasn't been reformatted for the Reader yet, and links to NYT stories from other sites probably won't open in Reader. The Web's hyperlinked nature gives it a lot of gravity and it's hard to see that changing in the medium term. The NYT had to develop their own application with Microsoft's help and has to deal with its content being served out in a different way. If we can produce an experience comparable to Reader in the browser --- and especially if we can make it available as a set of easy, downward-compatible changes to existing sites --- I think we'll have an overall much more compelling user experience.


Wednesday, 20 September 2006

Status

I've been working my way through a backlog of stuff. Most noticeably, to some anyway, I've caught up on my reviews. I currently know of no outstanding review requests where the ball is still in my court. If I've missed something, let me know.

Lately I've been working on the new textframe code in bug 333659, which reorganizes text layout around the new gfxTextRun abstraction. I've written all the new layout parts, almost completely replacing nsTextTransform.cpp and nsTextFrame.cpp. There's still a lot of work to do to implement gfxTextRun on all platforms. I'm starting first with the Pango-based implementation, but that's stalled while I wait for masayuki's rewrite of our existing Pango code to land. The new code feels nice to me. It handles combining characters and clusters everywhere and seems a lot simpler than the old code. Hopefully it will perform better too, now that we can retain glyph conversion and placement between layout and painting. Only testing can tell.

The dependencies surrounding this code are quite complex at the moment. The new textframe depends on changes to inline layout that I have filed as a separate patch. But the inline layout changes actually break the computation of minimum width for text containing words of more than one element (e.g., <b>H</b>ello). Fixing that really needs the reflow branch. Turns out that the reflow branch is also broken in the same way and fixing it could use these inline layout changes! On top of all that, the new textframe code is really needed to help improve performance with cairo, so the complete cairo switchover needs this code. Sigh...

Timeless prodded me to work on scrolling performance, which has regressed on trunk a bit, especially for certain sites such as Gmail. We found a few issues and I have patches to fix those that I should check in tomorrow. They're all to do with "scroll analysis", how we determine when and where it is safe to scroll a window using "bitblit" (just copying pixels from one place to another on the screen). A full explanation deserves a blog entry of its own.
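To make the "bitblit" idea concrete, here is a minimal sketch of the copy itself. It is not Gecko's code and it makes no attempt at the scroll analysis; the function name blit_scroll_up and the flat array of 32-bit pixels are assumptions made purely for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical illustration, not Gecko code: scroll the contents of a
 * width x height buffer of 32-bit pixels up by dy rows by copying the
 * pixels that remain visible. Returns the index of the first row that is
 * now stale and must be repainted (0 means "repaint everything"). */
size_t blit_scroll_up(uint32_t *pixels, size_t width, size_t height, size_t dy)
{
    if (dy >= height)
        return 0; /* nothing on screen can be reused */

    /* Source and destination overlap, so memmove rather than memcpy. */
    memmove(pixels, pixels + dy * width,
            (height - dy) * width * sizeof(uint32_t));

    /* Rows [height - dy, height) still hold old pixels. */
    return height - dy;
}
```

The copy is the cheap part. The hard part is deciding when a plain copy like this gives the same pixels that a full repaint would, and that is what the analysis has to determine.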

Next I think I'll shoot down a few random bugs. Maybe fix the file input control so that the textfield is operable again, but typing into it brings up the filechooser dialog. Soon I plan to dive into the ubrowser patches and try to get it running on Linux and ready to land in Mozilla CVS somewhere.


Friday, 15 September 2006

Static Analysis And Scary Headlines

A few days ago Slashdot trumpeted the headline "611 Defects, 71 Vulnerabilities Found In Firefox", based on a post by Adam Harrison who had applied his company's static code analysis tool to the Firefox code. That's not an unfair summary since Harrison's post says "The analysis resulted in 611 defects and 71 potential security vulnerabilities."

The problem is that Klocwork, like most other static analysis tools, reports false positives; i.e., it reports problems that are not actually bugs in the code. (More precisely, it may identify error behaviours that actually cannot occur in any run of the program.) That itself is not a problem, but when reporting the results of these tools you must make clear how many error reports the tool produced and how many of those have been verified by humans as corresponding to actual bugs which would affect some run of the program. In this case, it was not clear at all. We're told "611 defects", and then in comments Harrison claims "In this particular analysis we reviewed the entire results to verify the correctness of the defects." But Mozilla developers have been combing through the Klocwork reports and it turns out that most of them are not real bugs.

Here's one example:

902 compNameInFile = (char *) malloc(sizeof(char) * (stbuf.st_size + 1));
...
908 memset(compNameInFile, 0 , (stbuf.st_size + 1));
...
911 rv = fread((void *) compNameInFile, sizeof(char),
912 stbuf.st_size, dlMarkerFD);
...
923 if (strcmp(currComp->GetArchive(), compNameInFile) == 0)

The Klocwork analyzer claims that on line 923, there can be a string overflow reading from compNameInFile because it might not be null terminated. But in fact it's clear that there will always be at least one zero in the buffer.
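The reasoning can be checked with a small standalone sketch of the same pattern. This is not the installer's code: the helper name read_terminated is made up for illustration, and it takes a plain size instead of a stat buffer.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper mirroring the pattern above: allocate size + 1
 * bytes, zero them all, then read at most `size` bytes from fp.
 * Returns NULL if allocation fails; the caller frees the result. */
char *read_terminated(FILE *fp, size_t size)
{
    char *buf = malloc(size + 1);
    if (!buf)
        return NULL;

    memset(buf, 0, size + 1);

    /* fread stores at most `size` bytes, so it never writes buf[size].
     * A short read is fine here: the unread tail stays zeroed. */
    size_t got = fread(buf, sizeof(char), size, fp);
    (void)got;

    /* The final byte is still the zero written by memset, so string
     * functions such as strcmp cannot run off the end of the buffer. */
    assert(buf[size] == '\0');
    return buf;
}
```

Whatever fread does, the last byte of the buffer is never overwritten. Seeing that means relating the allocation size, the memset length and the fread count to one another, which is presumably the step the tool did not make.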

In fact, people have looked at a lot of the Klocwork reports and so far 2-3 of them have been judged genuine bugs (be sure to read the comments in those bugs...).

I'm sympathetic to static analysis tools; I did my PhD in the area :-). I really want Klocwork, Coverity, Oink and the rest to be successful, to become a standard part of the software development toolchain. But we've got to be honest and avoid the scare headlines.




It seems that even today's best general-purpose static analysis tools have a hard time finding high-value bugs in our code. We're having much more success with testing tools. I hypothesize that our interesting bugs are due to the violation of complex invariants that are not tracked by these general-purpose tools. These are invariants such as "at most one of mFoo and mBar are non-null", or "if frame F has a child C, then C's mParent points to F". Inferring these invariants automatically or proving things about them from the code is very hard, not least because they are often violated temporarily. But tools that don't understand these invariants are likely to always have a high level of false positives in our code.
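As a rough illustration of the second invariant, here is what a runtime check might look like. The Frame struct is a made-up, stripped-down stand-in, not Gecko's real frame class, and the function name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified frame: just enough structure to state the
 * invariant. Real frames carry far more state than this. */
typedef struct Frame {
    struct Frame *mParent;
    struct Frame *mFirstChild;
    struct Frame *mNextSibling;
} Frame;

/* Returns true if, for every frame F in the subtree rooted at f and
 * every child C of F, C->mParent points back to F. */
bool frame_tree_is_consistent(const Frame *f)
{
    for (const Frame *c = f->mFirstChild; c != NULL; c = c->mNextSibling) {
        if (c->mParent != f)
            return false;
        if (!frame_tree_is_consistent(c))
            return false;
    }
    return true;
}
```

A check like this is easy to run at a point where the tree is known to be stable. A general-purpose static tool would have to establish the same property at every point in the code, including the places where the invariant is deliberately broken for a few lines while a frame is being reparented.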

I think hope lies in a few different directions. Tools like Klocwork can be refined to home in on clear errors and throw away the chaff. There are more model-checking-like tools --- really, tools for doing intelligent, directed, systematic testing, incorporating knowledge of the code --- that can find complex error scenarios and guarantee no false positives. And I would like to see frameworks like Oink mature to make it easy to write custom analyses for specific problems in certain applications, probably aided by annotations. I have several ideas for custom analyses in Mozilla.


Monday, 11 September 2006

Dream Time II

Last night I dreamed about a bug in my new text code ... I'd forgotten to call FlushSpacingCache after adding justification space between reflowing and painting the text.

This morning I remembered, and fixed it.

I don't know if this is a good thing or a bad thing.


Monday, 4 September 2006

Crazy Noodle

There's a lot of good food in Auckland but I find myself repeatedly drawn back to Crazy Noodle in Newmarket. It's relatively cheap, the ambience is pleasant, uncrowded and child-friendly, the food is wonderful, and there's great variety. The Hong Kong-style Western food isn't altogether to my taste, but it's fun to try. Their standard dishes such as beef chow fun and Singapore fried rice noodle are --- wonder of wonders --- not greasy. I believe their "vanilla sago milkshake" is the best milkshake I've ever had; it's difficult to have just one.

The place is also a brisk 30-minute walk from my house, which is nice on a pleasant evening when I have time to spare and no children.