Monday, 7 February 2011

Cape Brett

Over the weekend I went on a tramping trip with a group of fourteen friends and friends-of-friends at Cape Brett. We drove up from Auckland to Russell on Friday night, then on Saturday morning caught a water taxi to the end of the Cape where the DOC hut is, just below the Cape Brett Lighthouse.

After stowing our gear at the hut, most of us walked to Deep Water Cove for a swim. That proved to be a bit more work than it sounds --- it was over two hours walking each way, over arduous up-and-down terrain, in the midday sun! The swim was very much appreciated :-). When we finally got back to the hut, some of us went for another swim off the rocks. Swimming there is tricky because there's no real landing and waves surge in through a channel. You have to plunge in and swim away from the rocks, and stay away. To get out you have to wait for a calm patch and time your approach carefully. But the swim was amazing --- the water is fantastically clear and the setting is magnificent.

We had wonderful food and a great evening, but I found sleeping at the hut rather difficult. I was on a top bunk and it was incredibly hot and stuffy. Around 2:30am I just gave up and went outside to sleep on the grass. The night sky was amazing and surprisingly I wasn't troubled by any mosquitoes, but I had to go back inside when it suddenly got cold around 4am --- if I'd brought a sleeping-bag, I would have happily stayed out there!

On Sunday most of us walked out to Rawhiti while the rest took a water taxi. The first part of the walk is the track to Deep Water Cove, which, as I mentioned, is quite tough. The rest is easier, but still has a number of relatively steep climbs. We took seven and a half hours (including plenty of stops), and it's definitely one of the hardest walks I've done --- harder than the Tongariro Crossing, I'd say. If I'd been carrying a full pack I would have found it really challenging. I'd definitely like to do that someday, though! So I heartily recommend the trip, but don't underestimate the difficulty.

Despite the fact that things didn't always go to plan, I think everyone had a great time. It was a great way to meet new people and get to know other people better.


Two weeks ago I did the Tongariro Crossing walk with some family members. The weather on the day was excellent and we completed the walk in about seven and a half hours at a leisurely pace. This is the third time I've done the walk and I still enjoy it very much.

Thursday, 3 February 2011

Distinguishing "Embeddable" Versus "Readable" Web Resources Considered Harmful

Sorry for the "considered harmful" title, but I haven't had time to think of a better one!

The Web platform distinguishes between "embeddable" and "readable" resources. Traditionally, cross-origin subresource loads, such as image loads, produce a resource that is "embeddable" (e.g. rendered) but not "readable" (i.e. exposing the contents of the resource to the containing page).

However, the distinction between "embedding" and "reading" is arbitrary, because "embedding" almost always leaks some information about the resource. For example, cross-origin image loads leak the size of the image and whether it exists. When we design new Web APIs we have to decide whether they should be allowed for embeddable-but-not-readable resources, and if they are allowed, whether any information leak is tolerable --- or else we design some mechanism to control the spread of the information leak. For example, the HTML5 canvas spec allows cross-origin images to be drawn to a canvas, but prevents the ultimate leak of pixel data by "tainting" the canvas so that getImageData no longer works.
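To make the canvas example concrete, here is a minimal sketch of the taint mechanism as a toy model. It is not the real canvas API: the class `ModelCanvas`, the `ImageResource` shape and the example origins are all made up for illustration, and the only rule modelled is "drawing a cross-origin image sets a taint bit that blocks reading".

```typescript
// Toy model of canvas tainting. All names here are hypothetical;
// a real browser tracks this state internally.
interface ImageResource {
  origin: string;   // origin the image was loaded from
  pixels: number[]; // stand-in for the decoded pixel data
}

class ModelCanvas {
  private readonly pageOrigin: string;
  private originClean = true; // the "taint bit", inverted
  private pixels: number[] = [];

  constructor(pageOrigin: string) {
    this.pageOrigin = pageOrigin;
  }

  // "Embedding": drawing always succeeds, but a cross-origin
  // image silently taints the canvas.
  drawImage(img: ImageResource): void {
    this.pixels = img.pixels.slice();
    if (img.origin !== this.pageOrigin) {
      this.originClean = false;
    }
  }

  // "Reading": refused once the canvas has been tainted.
  getImageData(): number[] {
    if (!this.originClean) {
      throw new Error("SecurityError: canvas is tainted by cross-origin data");
    }
    return this.pixels.slice();
  }
}

const demoCanvas = new ModelCanvas("https://a.example");
demoCanvas.drawImage({ origin: "https://a.example", pixels: [1, 2, 3] });
const readable = demoCanvas.getImageData(); // allowed: same-origin only so far
demoCanvas.drawImage({ origin: "https://b.example", pixels: [9, 9, 9] });
// Any later demoCanvas.getImageData() call now throws.
```

Note that in this model the failure shows up at the read, not at the draw that caused it, and the taint never clears even if same-origin images are drawn afterwards.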

Unfortunately, this design process has problems. Sometimes combinations of features extend the spread of information into a full leak; for example, by combining CSS pointer-events, SVG filters, IFRAMEs, and DOM APIs you might be able to extract image pixel data cross-origin.

Every time we declare that some leak of information is "tolerable", we implicitly create strange requirements for Web authors, for example "secret information must not be correlated with the sizes of any images at guessable URLs in your intranet". As another example, if we decide that fonts are innocuous and create an API that gives Web authors full access to the contents of a cross-origin-loaded font, we implicitly declare that for all time, fonts will not and must not contain origin-private data. Such decisions seem dangerous to me; they're bets against serendipity and the ingenuity of attackers.

Needless to say, every time we create restrictions or workarounds for embeddable-but-not-readable resources, we add complexity for Web authors, who, for example, may be surprised when canvas's getImageData fails due to an invisible taint bit, quite removed from the drawImage that set the taint bit.

Worst of all, it seems to me that this distinction between "embedding" and "reading" has very limited usefulness beyond compatibility with legacy requirements. What are use-cases where an author wants their resources to be embeddable cross-origin, but doesn't want other sites to be able to read the contents of the resources? The only case I can think of is "IFRAME widgets" that contain user data private to a different origin to the enclosing page. In those cases the IFRAME needs to be embeddable but retain its integrity against intrusion by the enclosing page. This use-case is actually not well-served by the IFRAME API, which suffers from serious deficiencies such as clickjacking. I don't know of any use-cases for an embedding vs reading distinction with other APIs.

I don't think the distinction between "embedding" and "reading" was a conscious design decision. It seems that when <img>, <script> and <style> first appeared, cross-origin loads were allowed because no-one had thought of the problems that would arise. Then when APIs emerged that allowed reading the full resource data (e.g. XHR and getImageData), we realized we couldn't allow that cross-origin, but we couldn't break compatibility with cross-origin embedding, so we were forced to distinguish between embedding and reading.

We're saddled with this distinction for existing APIs. However, I think we shouldn't perpetuate it in new APIs (or any API where the compatibility burden of discarding it is low). I think in new Web APIs we should refuse to distinguish between embeddable and readable resources unless there are significant use-cases that require it. In practice, since readability usually needs to default to same-origin only, we will usually need to default to denying cross-origin embedding, and let sites opt in to cross-origin reading and embedding using CORS or some other mechanism.
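As a rough sketch of what that policy could look like for a new resource type: one check governs both embedding and reading, it denies cross-origin use by default, and the serving site opts in with a CORS-style header. The function `mayEmbedAndRead` and the `ResourceResponse` shape are hypothetical names for illustration, and the check is deliberately simplified (it ignores credentials and preflight requests).

```typescript
// Hypothetical unified policy: a resource is either usable
// (embeddable AND readable) by the page, or not usable at all.
interface ResourceResponse {
  origin: string;       // origin the resource was served from
  allowOrigin?: string; // value of Access-Control-Allow-Origin, if sent
}

function mayEmbedAndRead(pageOrigin: string, res: ResourceResponse): boolean {
  // Same-origin resources are always usable.
  if (res.origin === pageOrigin) return true;
  // Cross-origin: denied unless the server opted in.
  if (res.allowOrigin === "*") return true;
  return res.allowOrigin === pageOrigin;
}

const page = "https://app.example";
const sameOrigin = mayEmbedAndRead(page, { origin: page });
const noOptIn = mayEmbedAndRead(page, { origin: "https://cdn.example" });
const optedIn = mayEmbedAndRead(page, {
  origin: "https://cdn.example",
  allowOrigin: "*",
});
```

Because there is only one decision, there is no embeddable-but-not-readable state and so no partial leak to reason about or taint to track.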

People I respect disagree, on the grounds that allowing cross-origin embedding for new resource types is more consistent with existing practice and author expectations. Their point is valid but I think that consistency is outweighed by the problems above.