Eyes Above The Waves

Robert O'Callahan. Christian. Repatriate Kiwi. Hacker.

Wednesday 2 February 2011

Distinguishing "Embeddable" Versus "Readable" Web Resources Considered Harmful

Sorry for the "considered harmful" title, but I haven't had time to think of a better one!

The Web platform distinguishes between "embeddable" and "readable" resources. Traditionally, cross-origin subresource loads, such as image loads, produce a resource that is "embeddable" (e.g. rendered) but not "readable" (i.e. exposing the contents of the resource to the containing page).

However, the distinction between "embedding" and "reading" is arbitrary, because "embedding" almost always leaks some information about the resource. For example, cross-origin image loads leak the size of the image and whether it exists. When we design new Web APIs we have to decide whether they should be allowed for embeddable-but-not-readable resources, and if they are allowed, whether any information leak is tolerable --- or else we design some mechanism to control the spread of the information leak. For example, the HTML5 canvas spec allows cross-origin images to be drawn to a canvas, but prevents the ultimate leak of pixel data by "tainting" the canvas so that getImageData no longer works.
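To make the tainting mechanism concrete, here is a minimal sketch in TypeScript. It is a toy model, not the real DOM API: `ModelCanvas`, `ImageSource`, the `pixels` array and the origin strings are all hypothetical names invented for illustration, and the real canvas tracks an internal "origin-clean" flag in a more detailed way than this.

```typescript
// Toy model of canvas tainting. All names here are hypothetical;
// this is not the browser's canvas implementation.
type ImageSource = { origin: string; pixels: number[] };

class ModelCanvas {
  private pageOrigin: string;
  private originClean = true;
  private pixels: number[] = [];

  constructor(pageOrigin: string) {
    this.pageOrigin = pageOrigin;
  }

  // Drawing always succeeds: the image is "embeddable".
  drawImage(image: ImageSource): void {
    this.pixels = image.pixels.slice();
    // A cross-origin source taints the canvas, and the taint never clears.
    if (image.origin !== this.pageOrigin) {
      this.originClean = false;
    }
  }

  // Reading is refused once the canvas is tainted: the image is not "readable".
  getImageData(): number[] {
    if (!this.originClean) {
      throw new Error("SecurityError: canvas is tainted by cross-origin data");
    }
    return this.pixels.slice();
  }
}
```

In this model, a page at one origin can call `drawImage` with an image from any origin, but after the first cross-origin draw every later `getImageData` call throws, however far it is from the draw that caused the taint.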

Unfortunately, this design process has problems. Sometimes combinations of features extend the spread of information into a full leak; for example, by combining CSS pointer-events, SVG filters, IFRAMEs, and DOM APIs you might be able to extract image pixel data cross-origin.

Every time we declare that some leak of information is "tolerable", we implicitly create strange requirements for Web authors, for example "secret information must not be correlated with the sizes of any images at guessable URLs in your intranet". As another example, if we decide that fonts are innocuous and create an API that gives Web authors full access to the contents of a cross-origin-loaded font, we implicitly declare that for all time, fonts will not and must not contain origin-private data. Such decisions seem dangerous to me; they're bets against serendipity and the ingenuity of attackers.

Needless to say, every time we create restrictions or workarounds for embeddable-but-not-readable resources, we add complexity for Web authors, who, for example, may be surprised when canvas's getImageData fails due to an invisible taint bit, quite removed from the drawImage that set the taint bit.

Worst of all, it seems to me that this distinction between "embedding" and "reading" has very limited usefulness beyond compatibility with legacy requirements. What are use-cases where an author wants their resources to be embeddable cross-origin, but doesn't want other sites to be able to read the contents of the resources? The only case I can think of is "IFRAME widgets" that contain user data private to a different origin to the enclosing page. In those cases the IFRAME needs to be embeddable but retain its integrity against intrusion by the enclosing page. This use-case is actually not well-served by the IFRAME API, which suffers from serious deficiencies such as clickjacking. I don't know of any use-cases for an embedding vs reading distinction with other APIs.

I don't think the distinction between "embedding" and "reading" was a conscious design decision. It seems that when <img>, <script> and <style> first appeared, cross-origin loads were allowed because no-one had thought of the problems that would arise. Then when APIs emerged that allowed reading the full resource data (e.g. XHR and getImageData), we realized we couldn't allow that cross-origin, but we couldn't break compatibility with cross-origin embedding, so we were forced to distinguish between embedding and reading.

We're saddled with this distinction for existing APIs. However, I think we shouldn't perpetuate it in new APIs (or any API where the compatibility burden of discarding it is low). I think in new Web APIs we should refuse to distinguish between embeddable and readable resources unless there are significant use-cases that require it. In practice, since readability usually needs to default to same-origin only, we will usually need to default to denying cross-origin embedding, and let sites opt in to cross-origin reading and embedding using CORS or some other mechanism.
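As a rough sketch of what that policy could look like, assuming a heavily simplified opt-in check: the function below grants either full access (embedding and reading together) or nothing. `accessFor`, `ResourceResponse` and the `Access` type are hypothetical names; `allowOrigin` stands in for the value of the CORS `Access-Control-Allow-Origin` response header, and real CORS has further rules (credentials, preflight requests) that are ignored here.

```typescript
// Simplified model of "no embed/read distinction" for a new API.
// Hypothetical names; not the full CORS algorithm.
type ResourceResponse = {
  resourceOrigin: string;
  // Value of the Access-Control-Allow-Origin header, if the server sent one.
  allowOrigin?: string;
};

type Access = "embed-and-read" | "denied";

function accessFor(pageOrigin: string, res: ResourceResponse): Access {
  // Same-origin resources are always fully usable.
  if (res.resourceOrigin === pageOrigin) {
    return "embed-and-read";
  }
  // Cross-origin resources are usable only if the server opted in.
  if (res.allowOrigin === "*" || res.allowOrigin === pageOrigin) {
    return "embed-and-read";
  }
  // No opt-in: neither embedding nor reading is allowed.
  return "denied";
}
```

There is deliberately no "embed-only" result in this sketch, so an API built this way would need no taint bit and no analysis of which partial leaks are tolerable.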

People I respect disagree, on the grounds that allowing cross-origin embedding for new resource types is more consistent with existing practice and author expectations. Their point is valid but I think that consistency is outweighed by the problems above.


Danny Moules
I agree broadly with the points you make, especially this:
"Such decisions seem dangerous to me; they're bets against serendipity and the ingenuity of attackers."
The assumption that because these are edge cases they can be ignored is incorrect and potentially dangerous, if not terribly confusing. This is true of all API development.
When moving into the future we need to make the best choices we can. If we opted to bow to IE6's interpretation of specifications, in light of the hardship of rejecting it, where would we be now...
Erik Dahlström
I think you make some very good points here, these things need to be taken into consideration when designing/extending web standards.
For people not on the www-svg mailing list I'd like to point out that SVG filters are meant to have no effect on pointer-events processing whatsoever.
Anne van Kesteren
I posted a reply of sorts: http://annevankesteren.nl/2011/02/web-platform-consistency
Ian Hickson
I think the consistency argument has to be given a lot of weight, because otherwise each generation of Web standards people will bring with it a whole new set of API styles, and we'll end up with a platform that is nigh on impossible to intuitively understand. We're already there in many ways, in fact.
I don't know which argument is strongest in this particular case. Luckily, it's not my problem for once. :-)
Robert O'Callahan
With the current approach we're adding complexity to the platform anyway, as we discover and work around the information leaks from embedded-but-not-readable resources. :-(
A good example of "embedding" and "reading" confusion:
Treponemiatic Yaqui
Have you considered making your text a single, easily readable column which your users can control the width of? It's insane only having three words per line.
There is a reason why no other site on the internet has adopted your layout.
Could someone please tell me -- when I read an image from an IP camera and put that .jpg into a canvas...is there a way to read the pixels using getImageData()? Is the reason it fails because of this cross-origin problem? I barely understand this thread -- but I do have the job of interpreting the images coming from this camera. Can someone PLEEEEZE suggest a way to do it? Thx.