Tuesday 27 December 2011
Jeff Walden is doing awesome work on the foundations of Gecko to unify the lowest-level infrastructure used by JS and the rest of Gecko and to rebase it on modern C/C++ standards. But there is a controversy about whether non-fatal assertions should be part of that infrastructure. This issue came up before, not long ago.
I strongly believe that non-fatal assertions are valuable when used to report the presence of a bug that is not as severe as a browser crash. An example I just pulled out of nsBlockFrame::Reflow:
ReflowBullet(state, metrics, lineTop); NS_ASSERTION(!BulletIsEmpty() || metrics.height == 0, "empty bullet took up space");
If this assertion fails, then we have detected a Gecko bug which should be reported and eventually fixed. If the assertion failure is a regression, and we detect it in time, we will try very hard to fix it before the patch ships in a browser release. (If the regression is triggered by our layout reftest test suite, we will almost certainly detect it on checkin since reftests go orange on new non-fatal assertion failures.) However, if this assertion failure is the only thing that goes wrong, there will almost certainly be no ill-effects beyond the page having a slightly incorrect layout --- maybe. Some pages might trigger the assertion but appear to render correctly. Even if the bug causes a detectable test failure, the assertion helps to narrow down the cause and understand the code.
Using a fatal assertion here would have the same benefits but additional costs. A test run hitting the assertion would abort the suite, meaning we lose the results for the rest of the tests in the suite. This makes fixing test failures slower and more painful than necessary since more runs of the suite will often be required. (It's similar to a compiler always aborting after the first syntax error in a compilation unit.) Another cost is that if you are using a debug build for some reason and you hit this bug while trying to work on something else, your work will be unnecessarily blocked.
At this point some will say "Ah! But assertion failures should always just be fixed. Since non-fatal assertions are more ignorable, they encourage you to leave bugs unfixed."
That statement ignores the reality of bug priorities. An assertion failure is just a bug and needs to be prioritized along with other bugs. If our assertion infrastructure and associated project rules forced us to always prioritize some rare list bullet spacing bug above all bugs that don't trigger assertions, then that infrastructure would be actively damaging our project. We would have to respond by simply removing a lot of our assertions and losing their benefits. It is crucial to be able to ignore unimportant bugs.
Having said all that, fatal assertions certainly have their place. I imagine that in the JS engine almost any bug will lead to a crash sooner or later, so it makes sense for fatal assertions to prevail there because the downside is minimal; you were going to crash anyway, and crashing later might just be confusing.
Addendum: an objection to non-fatal assertions is that libc "assert" and some other assertion mechanisms are fatal, so to call something non-fatal an "assertion" is confusing. That may be true for some people, but it's cultural. The culture in Gecko is that NS_ASSERTION is non-fatal. Maybe renaming our non-fatal assertion mechanism would end up being a net win by some point in the future, but I'm dubious. Although I wouldn't actually mind too much; I just want non-fatal assertions, whatever we call them.