Monday 11 May 2020
Why Forking HTML Into A Static Language Doesn't Make Sense
Quite often someone proposes something like this:
Rather than reply directly, I decided to write this blog post so I can refer to it again next time this comes up.HTML should never have grown into the mutated application runtime it is today. The presentational concerns for documents are different from application rendering. The javascript stack should have been something entirely separate.
I strongly feel we should create a lightweight HTML fork that is again document-centric and doesn't allow for all of this javascript nonsense. Something that doesn't allow for stupid custom UI or behavioural tracking. Just text, images, videos, and links. The painting algorithm would be dead simple, documents would load lightning fast, and we'd be confident there would be no ad malware.
There are incorrect factual assumptions here. "The presentational concerns for documents are different from application rendering" is incorrect; there's a lot of overlap, and there is a continuum between static documents and interactive applications, with use-cases all the way along that continuum. "Something that doesn't allow for stupid custom UI or behavioural tracking. Just text, images, videos, and links" is contradictory; embedded images, videos and links allow for lots of tracking and custom UI.
A bigger issue is that defining a static fork of HTML is easy, but persuading Web developers to use it is where the real problem lies, and that is immensely difficult and no-one has any good ideas for how to do it. Of course, if you do find a way to persuade Web developers to avoid features you don't like, there isn't much value in defining a specific "static subset" of HTML.
By the way, I just partially lied. Defining a static fork of HTML is easy, but I'm sure everyone who thinks the modern Web has gone off the rails has a slightly different list of features that should be allowed, so getting your entire constituency to agree on a specific list of allowed features is going to be very difficult. (E.g., are forms in that subset? contenteditable? Video? Animated GIFs? CSS :hover? CSS media queries? Different people will give different answers to these questions.)
Some people believe that defining a static subset of HTML would benefit Web developers and users because we could build a much more efficient browser (or browser mode) for that subset. As a former Mozilla distinguished engineer, I'm skeptical. Even something as fundamental as window size changes requires the ability to do incremental relayout. Any kind of editing or forms requires support for DOM changes, as does rendering of incrementally loaded documents. You could take a few shortcuts but you won't easily beat the performance of existing browsers on the same content, especially keeping in mind the incredibly impressive optimizations those browsers have already implemented.
The only possible benefit I can see of a defined static HTML subset is that for documents opting into that subset, browsers might be able to put all those documents into a single sandboxed child process instead of splitting them into a different sandboxed process per site, on the assumption that these documents are a negligible security risk to each other. I don't think this is at the forefront of the minds of advocates for a static HTML subset.