Friday 5 January 2018
Meltdown/Spectre Needs Better Disclosure
There's far too much confusion around these vulnerabilities. One problem is naming: "Meltdown" is OK, but "Spectre" is being used to refer to a whole family of information leaks triggered by speculative execution. The Spectre paper highlights two specific examples, which happen to be similar to two examples shown by Project Zero, but we don't have good names for those examples, so people are calling them all kinds of things, e.g. "variant 1" and "variant 2", which isn't very helpful. Also, the Spectre paper describes user-space-only attacks, but Project Zero introduced attacks against the kernel and hypervisor, so when someone says "we've blocked Spectre" it's not at all clear what they've done. It would have been better to have specific names for the different issues that are going to be addressed separately, and a distinct name for the whole family.
Intel says it is releasing microcode updates that fix everything, and Google and Amazon say they've fixed everything in their clouds, but there are still efforts under way to fix various Spectre issues in the Linux kernel. So obviously we're still some way from the complete stack being fully protected, especially against Spectre "variant 1", both in the kernel and in user-space. Google says that recompiling everything with retpolines blocks "variant 2", but people on LKML who seem to be in the loop with Intel say that retpolines aren't a reliable mitigation on Skylake.
Some confusion is understandable given the accelerated disclosure schedule, but I hope this gets sorted out soon. It's important for the CPU vendors and the cloud vendors to say exactly what mitigations they have deployed, what attacks they are not mitigating, and what parts of the problem they expect their downstream customers to take responsibility for.
Another thing that has to happen: brains trusts inside the disclosure zone need to take a step back from desperate attempts to mitigate the known exploits, and figure out a long-term plan for dealing with side-channel attacks that leak privileged data through supposedly-hidden hardware state. The Spectre paper portrays its attacks as the tip of an iceberg, and I suspect the authors are right. Blocking specific attacks one by one with expensive mitigations may not be sustainable, and is definitely not sustainable for products that can't be on a rapid update cycle. It just got harder to write secure code, and on our current course it is going to keep getting harder. The RISC-V statement about this is self-serving, but it expresses the right sentiment.