Eyes Above The Waves

Robert O'Callahan. Christian. Repatriate Kiwi. Hacker.

Monday 4 January 2016

innerText: Cleaning A Dark Corner Of the Web

One of the things I did last year was implement innerText in Gecko and write a specification for it. This is more or less a textbook case of cleaning a dark corner of the Web.

innerText was implemented in IE 5.5 (or maybe earlier?), apparently with the goal of approximating the getting and setting of the "rendered text contents" of a DOM node. Microsoft being Microsoft at the time, there was no attempt to add it to any Web specification. Some time later, WebKit implemented its own version of it --- a very different and quite incompatible implementation. Naturally, Blink inherited the WebKit implementation. Over the years the implementations evolved independently.

In Gecko we didn't bother to implement it. It was seldom used, and you can quite easily polyfill a decent implementation, which has the advantage of working the same across all browsers. (Many users can just use textContent instead.) It's a feature the Web just doesn't need, even if it worked interoperably and had a spec.
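
To make the difference between textContent and a "rendered text" getter concrete, here is a minimal sketch. It is not the real DOM and not a real polyfill: `SketchNode`, `textContentOf` and `renderedTextOf` are hypothetical names invented for illustration, and the "rendered" rules (skip hidden subtrees, turn `br` into a newline) are only a tiny subset of what innerText actually does.

```typescript
// Hypothetical, simplified node model (NOT the real DOM) so the sketch runs anywhere.
type SketchNode =
  | { kind: "text"; data: string }
  | { kind: "element"; tag: string; hidden?: boolean; children: SketchNode[] };

// Analogue of textContent: concatenate every descendant text node,
// regardless of whether it would be rendered.
function textContentOf(node: SketchNode): string {
  if (node.kind === "text") return node.data;
  return node.children.map(textContentOf).join("");
}

// Rough analogue of a "rendered text" getter (assumption: only two rules):
// skip subtrees that are not rendered and emit a newline for <br>.
function renderedTextOf(node: SketchNode): string {
  if (node.kind === "text") return node.data;
  if (node.hidden) return "";
  if (node.tag === "br") return "\n";
  return node.children.map(renderedTextOf).join("");
}

// Sample tree: <p>Hello<br><span hidden>secret</span>world</p>
const sample: SketchNode = {
  kind: "element",
  tag: "p",
  children: [
    { kind: "text", data: "Hello" },
    { kind: "element", tag: "br", children: [] },
    {
      kind: "element",
      tag: "span",
      hidden: true,
      children: [{ kind: "text", data: "secret" }],
    },
    { kind: "text", data: "world" },
  ],
};

const viaTextContent = textContentOf(sample);   // "Hellosecretworld"
const viaRenderedText = renderedTextOf(sample); // "Hello\nworld"
```

When a page has no hidden content or line breaks in the subtree it reads, the two results coincide, which is why textContent is often a good enough substitute.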

Sadly, last year it became clear we had to implement it. People have been using it in mobile sites (WebKit monoculture) and it even started showing up on the odd desktop site where people didn't test with Firefox. I ran into this on xkcd.com and then it was clear I had to fix it!

I could have done what WebKit/Blink (and I suspect IE) did and hooked it up to our existing plaintext serializer (which implements clipboard plaintext copy), adding a third totally incompatible implementation ... but that's not Mozilla. So I wrote a bunch of testcases (partly inspired by kangax's) and studied some other resources, and created a spec that felt to me like a good tradeoff between simplicity and Web compatibility --- and implemented it in Gecko. As kangax notes, it's highly Blink-compatible but avoids some obvious Blink bugs.

My key insight while writing the getter's spec was that we should reuse the CSS spec by reference as much as possible instead of specifying rules that would duplicate CSS logic. For example, my spec (and implementation) doesn't mention CSS white-space property values explicitly, but instead incorporates CSS white-space processing by reference, which means all the values (and any new ones) just work.
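
The "by reference" idea can be sketched as follows. This is an illustration under assumptions, not the Gecko code or the spec text: `cssWhiteSpaceProcess` is a hypothetical, greatly simplified stand-in for the layout engine's CSS white-space processing, and `renderedTextForRun` is a hypothetical getter step. The point is that only the CSS side knows which property values exist; the getter just passes the computed value through.

```typescript
// Hypothetical stand-in for the engine's CSS white-space processing
// (greatly simplified). Only this function enumerates property values.
function cssWhiteSpaceProcess(text: string, whiteSpace: string): string {
  switch (whiteSpace) {
    case "pre":
    case "pre-wrap":
      // Spaces and newlines are preserved.
      return text;
    case "pre-line":
      // Runs of spaces/tabs collapse; newlines are preserved and
      // the spaces around them are dropped.
      return text.replace(/[ \t]+/g, " ").replace(/ ?\n ?/g, "\n");
    default:
      // "normal", "nowrap" (and, in this sketch only, anything unknown):
      // all white space collapses to single spaces.
      return text.replace(/[ \t\n]+/g, " ");
  }
}

// Hypothetical getter step: it never inspects the white-space value itself,
// so a value added on the CSS side needs no change here.
function renderedTextForRun(text: string, computedWhiteSpace: string): string {
  return cssWhiteSpaceProcess(text, computedWhiteSpace);
}

const source = "a  b\n c";
const asNormal = renderedTextForRun(source, "normal");    // "a b c"
const asPre = renderedTextForRun(source, "pre");          // "a  b\n c"
const asPreLine = renderedTextForRun(source, "pre-line"); // "a b\nc"
```

The alternative design, a getter with its own list of white-space values, would silently fall out of date whenever CSS gained a new value.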

So far so good --- our implementation is riding the trains in Firefox 45 --- but this is not the end of the story. The next step is for other browsers to change their behavior to converge on this spec --- or if they can't, to explain why so we can fix the spec so they can. To be honest, I'm not very optimistic about that since I've received minimal feedback from Apple/Google/Microsoft so far (despite begging). But we've done what we can at Mozilla to fix this wart on the Web, and we've probably done enough to fix innerText-specific Web compat problems in Firefox for the foreseeable future.

Comments

Peter Kasting
Are there Chromium bugs on compliance here? If so I can try to help push those.
Robert
https://code.google.com/p/chromium/issues/detail?id=573309
Anonymous
I have never used innerText directly. Not sure whether any of the frameworks I've used, like jQuery, Angular or React, rely on it.