Thursday, 19 October 2017

Microsoft's Chrome Exploitation And The Limitations Of Control Flow Integrity

Microsoft published an interesting blog post about exploiting a V8 bug to achieve arbitrary code execution in a Chrome content sandbox. They rightly point out that even if you don't escape the sandbox, you can still break important Web security properties (e.g., assuming the process is allowed to host content from more than one origin, you can break same-origin restrictions). However, the message we're supposed to take away from this article is that Microsoft's CFI would prevent similar bugs in Edge from having the same impact. I think that message is basically wrong.

The problem is, once you've achieved arbitrary memory read/write from JavaScript, it's very likely you can break those Web security properties without running arbitrary machine code, without using ROP, and without violating CFI at all. For example, if you want to violate same-origin restrictions, your JS code could find the location in memory where the origin of the current document is stored and rewrite it to be a different origin. In practice it would be quite a lot more complicated than that, but the basic idea should work, and once you've implemented the technique it could be used to exploit any arbitrary read/write bug. It might even be easier to write some exploits this way than using traditional arbitrary code execution; JS is a more convenient programming language than ROP gadgets.
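To make the idea concrete, here is a toy model in Rust. It is only a sketch under loud assumptions: the `Process` struct, its flat 64-byte "memory", the fixed offset where the origin is stored, and the `arbitrary_read`/`arbitrary_write` helpers are all hypothetical stand-ins, and nothing here resembles how a real browser lays out its data. The point is just that the same-origin check runs exactly the code it was always going to run; only its input changes.

```rust
/// Toy "process": a flat byte array with the document origin stored in it.
/// The layout is invented purely for illustration.
struct Process {
    memory: Vec<u8>,
    origin_addr: usize,
    origin_len: usize,
}

impl Process {
    fn new(origin: &str) -> Process {
        assert!(origin.len() <= 48, "toy memory is only 64 bytes");
        let mut memory = vec![0u8; 64];
        let origin_addr = 16;
        memory[origin_addr..origin_addr + origin.len()].copy_from_slice(origin.as_bytes());
        Process { memory, origin_addr, origin_len: origin.len() }
    }

    /// The legitimate same-origin check. Its control flow is never hijacked.
    fn may_access(&self, target_origin: &str) -> bool {
        let current = &self.memory[self.origin_addr..self.origin_addr + self.origin_len];
        current == target_origin.as_bytes()
    }

    /// Stand-in for the read primitive an exploit would obtain.
    fn arbitrary_read(&self, addr: usize, len: usize) -> &[u8] {
        &self.memory[addr..addr + len]
    }

    /// Stand-in for the write primitive an exploit would obtain.
    fn arbitrary_write(&mut self, addr: usize, data: &[u8]) {
        self.memory[addr..addr + data.len()].copy_from_slice(data);
    }
}

/// Attacker logic: scan memory for the known current origin, overwrite it in place.
/// Returns true if the origin was found and replaced.
fn spoof_origin(p: &mut Process, current: &str, wanted: &str) -> bool {
    let n = current.len();
    // Keep the toy simple: same-length origins only, and the pattern must fit.
    if wanted.len() != n || n > p.memory.len() {
        return false;
    }
    for addr in 0..=p.memory.len() - n {
        if p.arbitrary_read(addr, n) == current.as_bytes() {
            p.arbitrary_write(addr, wanted.as_bytes());
            return true;
        }
    }
    false
}
```

In this model, a process created with `Process::new("https://evil.example")` is refused access to `"https://bank.example"` by `may_access`, and is allowed after `spoof_origin` succeeds. No indirect branch target was ever corrupted, so a CFI check would have nothing to object to.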

The underlying technical problem is that once you've achieved arbitrary read/write you can almost completely violate data-flow integrity within the process. As I recently wrote, DFI is extremely important and (unlike CFI) it's probably impossible to dynamically enforce with low overhead in the presence of arbitrary read/write, with any reasonable granularity.

I think there's also an underlying cultural problem here, which is that traditionally "Remote Code Execution" (of unconstrained machine code) has been the gold standard for a working exploit, which is why techniques to prevent that, like CFI, have attracted so much attention. But JavaScript (or some other interpreter, or even some Turing-complete interpreter-like behavior) armed with an arbitrary memory read/write primitive is just as bad in a lot of cases.

Sunday, 15 October 2017

"Slow To Become Angry"

James 1:19-20

My dear brothers and sisters, take note of this: Everyone should be quick to listen, slow to speak and slow to become angry, because human anger does not produce the righteousness that God desires.

Online media exploit the intoxicating effects of righteous anger to generate "engagement". But even — or especially — when the targets of one's anger richly deserve it, becoming a person whose life is characterized by anger is contrary to God's will.

Something to remember when I'm tempted to click on yet another link about some villain getting what he deserves.

Monday, 9 October 2017

Type Safety And Data Flow Integrity

We talk a lot about memory safety assuming everyone knows what it is, but I think that can be confusing and sell short the benefits of safety in modern programming languages. It's probably better to talk about "type safety". This can be formalized in various ways, but intuitively a language's type system proposes constraints on what is allowed to happen at run-time — constraints that programmers assume when reasoning about their programs; type-safe code actually obeys those constraints. This includes classic memory safety features such as avoidance of buffer overflows: writing past the end of an array has effects on the data after the array that the type system does not allow for. But type safety also means, for example, that (in most languages) a field of an object cannot be read or written except through pointers/references created by explicit access to that field. With this loose definition, type safety of a piece of code can be achieved in different ways. The compiler might enforce it, or you might prove the required properties mechanically or by hand, or you might just test it until you've fixed all the bugs.

One implication of this is that type-safe code provides data-flow integrity. A type system provides intuitive constraints on how data can flow from one part of the program to another. For example, if your code has private fields that the language only lets you access through a limited set of methods, then at run time it's true that all accesses to those fields are by those methods (or due to unsafe code).
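The private-field case can be sketched in Rust, where the compiler enforces the constraint. The `account` module, the `Account` type and the `try_overdraw` helper are made up for illustration; the only thing that matters is that `balance` cannot be named outside the module, so every flow of data into or out of it goes through the listed methods (absent unsafe code).

```rust
mod account {
    /// `balance` is private: code outside this module cannot read or write it directly.
    pub struct Account {
        balance: u64,
    }

    impl Account {
        pub fn new(opening: u64) -> Account {
            Account { balance: opening }
        }

        /// The only way to reduce the balance; the invariant is checked here.
        pub fn withdraw(&mut self, amount: u64) -> Result<(), String> {
            if amount > self.balance {
                return Err(format!("insufficient funds: {} > {}", amount, self.balance));
            }
            self.balance -= amount;
            Ok(())
        }

        /// Read-only access to the balance.
        pub fn balance(&self) -> u64 {
            self.balance
        }
    }
}

/// Code outside the module can only go through the public methods.
/// Returns (did the checks behave as expected, final balance).
fn try_overdraw() -> (bool, u64) {
    let mut acct = account::Account::new(100);
    // acct.balance = 1_000_000; // would not compile: field `balance` is private
    let small_ok = acct.withdraw(30).is_ok();
    let large_rejected = acct.withdraw(500).is_err();
    (small_ok && large_rejected, acct.balance())
}
```

When reasoning about `withdraw`, you only need to read the few methods inside `account`; no other safe code in the program can have put a different value in `balance`.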

Type-safe code also provides control-flow integrity, because any reasonable type system also suggests fine-grained constraints on control flow.

Data-flow integrity is very important. Most information-disclosure bugs (e.g. Heartbleed) violate data-flow integrity, but usually don't violate control-flow integrity. "Wild write" bugs are a very powerful primitive for attackers because they allow massive violation of data-flow integrity; most security-relevant decisions can be compromised if you can corrupt their inputs.
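As a rough illustration of the information-disclosure case, here is a heartbeat-style echo in Rust. It is a sketch, not a model of any real TLS implementation: `heartbeat_reply` and its parameters are hypothetical, and the only assumption borrowed from Heartbleed is that the peer supplies both a payload and a claimed length. In a type-safe language the over-long request cannot read whatever happens to sit after the payload in memory.

```rust
/// Echo back `claimed_len` bytes of `payload`, as a heartbeat responder might.
/// `slice::get` with a range returns `None` rather than reading past the end,
/// so bytes belonging to other data can never flow into the reply.
fn heartbeat_reply(payload: &[u8], claimed_len: usize) -> Option<Vec<u8>> {
    payload.get(..claimed_len).map(|bytes| bytes.to_vec())
}
```

An honest request such as `heartbeat_reply(b"bird", 4)` gets its four bytes back, while a request claiming 64000 bytes for the same payload gets `None`. Plain indexing (`&payload[..claimed_len]`) would panic instead, which is still a denial of service but not a leak.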

A lot of work has been done to enforce CFI for C/C++ using dynamic checks with reasonably low overhead. That's good and important work. But attackers will move to attacking DFI, and that's going to be a lot harder to solve for C/C++. For example the checking performed by ASAN is only a subset of what would be required to enforce the C++ type system, and ASAN's overhead is already too high. You would never choose C/C++ for performance reasons if you had to run under ASAN. (I guess you could reduce ASAN's overhead if you dropped all the support for debugging, but it would still be too high.)

Note 1: people often say "even type safe programs still have correctness bugs, so you're just solving one class of bugs which is not a big deal" (or, "... so you should just use C and prove everything correct"). This underestimates the power of type safety with a reasonably rich type system. Having fine-grained CFI and DFI, and generally being able to trust the assumptions the type system suggests to you, are essential for sound reasoning about programs. Then you can leverage the type system to build abstractions that let you check more properties; e.g. you can enforce separation between trusted and untrusted data by giving untrusted user input different types and access methods to trusted data. The more of your code is type-safe, the stronger your confidence in those properties.
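The trusted/untrusted separation might look something like this in Rust. The names (`input`, `Untrusted`, `Trusted`, `escape_html`, `render`) are invented for this sketch and the escaping is deliberately minimal; the point is the shape of the API, not a complete HTML sanitizer.

```rust
mod input {
    /// Raw user input. The inner string is private, so it cannot be used as-is.
    pub struct Untrusted(String);

    /// Text that has been escaped. Only `escape_html` can construct one.
    pub struct Trusted(String);

    impl Untrusted {
        pub fn new(raw: &str) -> Untrusted {
            Untrusted(raw.to_string())
        }

        /// The single path from untrusted to trusted data.
        pub fn escape_html(self) -> Trusted {
            let mut out = String::with_capacity(self.0.len());
            for c in self.0.chars() {
                match c {
                    '&' => out.push_str("&amp;"),
                    '<' => out.push_str("&lt;"),
                    '>' => out.push_str("&gt;"),
                    '"' => out.push_str("&quot;"),
                    _ => out.push(c),
                }
            }
            Trusted(out)
        }
    }

    impl Trusted {
        pub fn as_str(&self) -> &str {
            &self.0
        }
    }
}

/// Rendering accepts only `Trusted`; passing an `Untrusted` is a compile error.
fn render(body: &input::Trusted) -> String {
    format!("<p>{}</p>", body.as_str())
}
```

With this in place, rendering the input `<b>hi</b>` produces `<p>&lt;b&gt;hi&lt;/b&gt;</p>`, and forgetting the escaping step is something the compiler rejects rather than something a reviewer has to notice.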

Note 2: C/C++ could be considered "type safe" just because the specification says any program executing undefined behavior gets no behavioral constraints whatsoever. However, in practice, programmers reasoning about C/C++ code must (and do) assume the constraint "no undefined behavior occurs"; type-safe C/C++ code must ensure this.

Note 3: the presence of unsafe code within a hardware-enforced protection domain can undermine the properties of type-safe code within the same domain, but minimizing the amount of such unsafe code is still worthwhile, because it reduces your attack surface.

Legacy Code Strikes Again

This blog post describes using binary diffing to find security-relevant bugs in Windows 7 that were fixed in Windows 10. It's an interesting example of the problems you can get into if you don't fix bugs across all your product versions at about the same time.

CVE-2017-8685 is particularly sad. The system call NtGdiEngCreatePalette can leak uninitialized values from kernel memory to user-space; these leaks are quite serious security issues. This is basically the old GDI CreatePalette function. It's a system call because in Windows NT 4.0 (1996) Microsoft moved the GDI subsystem into the kernel for performance reasons. That may have made sense then, but GDI was already a mess of complicated legacy code, so this was a large increase in attack surface that's been hurting security and stability for Microsoft users ever since.

What's especially sad about this function is that CreatePalette is only useful for palette-based displays, which became obsolete in the 1990s, around the time NT 4.0 came out. It's been about 20 years since this API was useful for anything other than compatibility with even older software ... or kernel memory disclosure!

Sunday, 8 October 2017

Thoughts On Microsoft's Time-Travel Debugger

I'm excited that Microsoft's TTD is finally available to the public. Congratulations to the team! The video is well worth watching. I haven't used TTD myself yet since I don't have a Windows system at hand, but I've talked to Mozilla developers who've tried it on Firefox.

The most important and obvious difference between TTD and rr is that TTD is for Windows and rr is for Linux (though a few crazy people have had success debugging Windows applications in Wine under rr).

TTD supports recording of multiple threads in parallel, while rr is limited to a single core. On the other hand, per-thread recording overhead seems to be much higher in TTD than in rr. It's hard to make a direct comparison, but a simple "start Firefox, display, shut down" test run on similar hardware takes about 250 seconds under TTD and 26 seconds under rr. This is not surprising given TTD relies on pervasive binary instrumentation and rr was designed not to. This means recording extremely parallel workloads might be faster under TTD, but for many workloads rr recording will be faster. Starting up a large application really stresses binary translation frameworks, so it's a bit of a worst-case scenario for TTD — though a common one for developers. TTD's multicore recording might be better at reproducing certain kinds of concurrency bugs, though rr's chaos mode helps mitigate that problem — and lower recording overhead means you can churn through test iterations faster.

Therefore, for Firefox-like workloads on Linux, I still think rr's recording approach is superior. Note that when the technology behind TTD was first developed, the hardware and OS features needed to support an rr-like approach did not exist.

TTD's ability to attach to arbitrary processes and start recording sounds great and would mitigate some of the slow-recording problem. This would be nice to have with rr, but hard to implement. (Currently we require reserving a couple of pages at specific addresses that might not be available when attaching to an arbitrary process.)

Some of the performance overhead of TTD comes from it copying all loaded libraries into the trace file, to ensure traces are portable across machines. rr doesn't do that by default; instead you have to run rr pack to make traces self-contained. I still like our approach, especially in scenarios where you repeatedly re-record a testcase until it fails.

The video mentions that TTD supports shared memory and async I/O and suggests rr doesn't. It can be confusing, but to clarify: rr supports shared memory as long as you record all the processes that are using the shared memory; for example Firefox and Chromium communicate with subprocesses using shared memory and work fine under rr. Async I/O is pretty rare in Linux; where it has come up so far (V4L2) we have been able to handle it.

Supporting unlimited data breakpoints is a nice touch. I assume that's done using their binary instrumentation.

TTD's replay looks fast in the demo videos but they mention that it can be slower than live debugging. They have an offline index build step, though it's not clear to me yet what exactly those indexes contain. It would be interesting to compare TTD and rr replay speed, especially for reverse execution.

The TTD trace querying tools look cool. A lot more can be done in this area.

rr+gdb supports running application functions at debug time (e.g. to dump data structures), while TTD does not. This feature is very important to some rr users, so it might be worthwhile for the TTD people to look at.

Friday, 6 October 2017

Building On Rock, Not Sand

This quote is telling:

Billions of devices run dnsmasq, and it had been through multiple security audits before now. Simon had done the best job possible, I think. He got beat. No human and no amount of budget would have found these problems before now, and now we face the worldwide costs, yet again, of something ubiquitous now, vulnerable.

Some of this is quite accurate. Human beings can't write safe C code. Bug-finding tools and security audits catch some problems but miss a lot of others. But on the other hand, this message and its followup betray mistaken assumptions. There are languages running on commodity hardware that provide much better security properties than C. In particular, all three remote code execution vulnerabilities would have been prevented by Rust, Go or even Java. Those languages would have also made the other bugs much more unlikely. Contrary to the quote, given a finite "amount of budget", dnsmasq could have been Rewritten In Rust and these problems avoided.

I understand that for legacy code like dnsmasq, even that amount of budget might not be available. My sincere hope is that people will at least stop choosing C for new projects. At this point, doing so is professional negligence.

What about C++? In my circle I seldom see enthusiasm for C, yet there is still great enthusiasm for C++, which inherits C's core security weaknesses. Are the C++ projects of today going to be the vulnerability-ridden legacy codebases of tomorrow? (In some cases, e.g. browsers, they already are...) C++ proponents seem to believe that C++ libraries and analysis tools, including efforts such as C++ Core Guidelines: Lifetimes, plus mitigations such as control-flow integrity, will be "good enough". Personally, I'm pessimistic. C++ is a fantastically complex language and that complexity is growing steadily. Much more effort is going into increasing its complexity than addressing safety issues. It's now nearly two years since the Lifetimes document had any sort of update, and at CppCon 2017 just one of 99 talks focused on improving C++ safety.

Those of us building code to last owe it to the world to build on rock, not sand. C is sand. C++ is better, but it's far from a solid foundation.

Microsoft Using Chromium On Android Is Bad For The Web

Microsoft is releasing "Edge for Android" and it uses Chromium. That is bad for the Web.

It's bad because engine diversity is really essential for the open Web. Having some users, even a relatively small number, using the Edge engine on Android would have been a good step. Going with Chromium increases Web developer expectations that all browsers on Android are — or even should be — Chromium. The less thoughtful sort of developer (i.e., pretty much everyone) will say "Microsoft takes this path, so why doesn't Mozilla too, so we can have the instant gratification of compatibility thanks to a single engine?" The slow accumulation of unfixable bugs due to de facto standardization will not register until the platform has thoroughly rotted; the only escape being alternative single-vendor platforms where developers are even more beholden to the vendor.

Sure, it would have been quite a lot of work to port Edge to Android, but Microsoft has the resources, and porting a browser engine isn't a research problem. If Microsoft would rather save resources than promote their own browser engine, perhaps they'll be switching to Chromium on Windows next. Of course that would be even worse for the Web, but it's not hard to believe Microsoft has stopped caring about that, to the extent they ever did.

(Of course Edge uses WebKit on iOS, and that's also bad, but it's Apple's ongoing decision to force browsers to use the least secure engine, so nothing new there.)