Monday, 13 June 2016

"Safe C++ Subset" Is Vapourware

In almost every discussion of Rust vs C++, someone makes a comment like:

the subset of C++14 that most people will want to use and the guidelines for its safe use are already well on their way to being defined ... By following the guidelines, which can be verified statically at compile time, the same kind of safeties provided by Rust can be had from C++, and with less annotation effort.
This promise is vapourware. In fact, it's classic vapourware in the sense of "wildly optimistic claim about a future product made to counter a competitor". (Herb Sutter says in comments that this wasn't designed with the goal of "countering a competitor" so I'll take him at his word (though it's used that way by others). Sorry Herb!)

(FWIW the claim quoted above is actually an overstatement of the goals of the C++ Core Guidelines to which it refers, which say "our design is a simpler feature focused on eliminating leaks and dangling only"; Rust provides important additional safety properties such as data-race freedom. But even just the memory safety claim is vapourware.)

To satisfy this claim, we need to see a complete set of statically checkable rules and a plausible argument that a program adhering to these rules cannot exhibit memory safety bugs. Notably, languages that offer memory safety are not just claiming you can write safe programs in the language, nor that there is a static checker that finds most memory safety bugs; they are claiming that code written in that language (or the safe subset thereof) cannot exhibit memory safety bugs.

AFAIK the closest to this C++ gets is the Core Guidelines Lifetimes I and II document, last updated December 2015. It contains only an "informal overview and rationale"; it refers to "Section III, analysis rules (forthcoming this winter)", which apparently has not yet come forth. (I'm pretty sure they didn't mean the New Zealand winter.) The informal overview shows a heavy dependence on alias analysis, which does not inspire confidence because alias analysis is always fragile. The overview leaves open critical questions about even trivial examples. Consider:

unique_ptr<int> p;
void foo(const int& v) {
  p = nullptr;
  cout << v;
}
void bar() {
  p = make_unique<int>(7);
  foo(*p);
}
Obviously this program is unsafe and must be forbidden, but what rule would reject it? The document says
  • In the function body, by default a Pointer parameter param is assumed to be valid for the duration of the function call and not depend on any other parameter, so at the start of the function lset(param) = param (its own lifetime) only.
  • At a call site, by default passing a Pointer to a function requires that the argument’s lset not include anything that could be invalidated by the function.
Clearly the body of foo is OK by those rules. For the call to foo from bar, it depends on what is meant by "anything that could be invalidated by the function". Does that include anything reachable via global variables? Because if it does, then you can't pass anything reachable from a global variable to any function by reference, which is crippling. But if it doesn't, then what rejects this code?
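
To make the hazard concrete without triggering undefined behaviour, here is a minimal sketch with the same shape as the example above. `Tracked`, `live_count` and `referent_destroyed_during_call` are hypothetical names introduced purely for illustration. Instead of reading through the dangling reference, `foo` only records that its referent has already been destroyed at the point where the read would happen.

```cpp
#include <memory>

// Hypothetical instrumented type: counts live instances so destruction can be
// observed without touching a dangling reference.
static int live_count = 0;

struct Tracked {
    int value;
    explicit Tracked(int v) : value(v) { ++live_count; }
    Tracked(const Tracked&) = delete;
    Tracked& operator=(const Tracked&) = delete;
    ~Tracked() { --live_count; }
};

static std::unique_ptr<Tracked> p;  // mutable global owner, as in the post
static bool referent_destroyed_during_call = false;

void foo(const Tracked& v) {
    p = nullptr;  // destroys the object that v refers to
    // Reading v.value here would be a use-after-free, so this sketch only
    // records that the referent is gone while v is still in scope.
    referent_destroyed_during_call = (live_count == 0);
    (void)v;  // names the reference without reading through it
}

void bar() {
    p = std::make_unique<Tracked>(7);
    foo(*p);  // accepted by the compiler; the language itself does not object
}
```

After a call to `bar()`, the flag is set: the object behind `v` was destroyed while `foo` still held a reference to it, which is exactly the situation a sound set of rules has to reject.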

Update: Herb points out that example 7.1 covers a similar situation with raw pointers. That example indicates that anything reachable through a global variable cannot be passed to a function by raw pointer or reference. That still seems like a crippling limitation to me. You can't, for example, copy-construct anything (indirectly) reachable through a global variable:

unique_ptr<Foo> p;
void bar() {
  p = make_unique<Foo>(...);
  Foo xyz(*p); // Forbidden!
}
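
One workaround a programmer might reach for is to move ownership out of the global into a local for the duration of the copy, so that the reference handed to the copy constructor no longer points into anything a global owns. This is only a sketch: `Foo` below is a hypothetical stand-in type, `copy_via_local` is an invented name, and whether a lifetime checker would actually accept this pattern is an assumption, since the analysis rules have not been published.

```cpp
#include <memory>
#include <utility>

// Hypothetical stand-in for the post's Foo.
struct Foo {
    int value;
    explicit Foo(int v) : value(v) {}
    Foo(const Foo&) = default;
};

static std::unique_ptr<Foo> p;  // mutable global owner

// Assumes p is non-null on entry.
Foo copy_via_local() {
    // Temporarily take ownership so the pointee is owned by a local,
    // which no callee can reach through the global p.
    std::unique_ptr<Foo> local = std::move(p);
    Foo xyz(*local);       // the reference argument now targets a locally owned object
    p = std::move(local);  // hand ownership back to the global
    return xyz;
}
```

Note the cost: every copy needs this dance, and while the copy runs `p` is null, so any code that reads the global in the meantime sees an empty pointer.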

This is not one rogue example that is easily addressed. This example cuts to the heart of the problem, which is that understanding aliasing in the face of functions with potentially unbounded side effects is notoriously difficult. I myself wrote a PhD thesis on the subject, one among hundreds, if not thousands. Designing your language and its libraries from the ground up to deal with these issues has been shown to work, in Rust at least, but I'm deeply skeptical it can be bolted onto C++.

Q&A

Aren't clang and MSVC already shipping previews of this safe subset? They're implementing static checking rules that no doubt will catch many bugs, which is great. They're nowhere near demonstrating they can catch every memory safety bug.

Aren't you always vulnerable to bugs in the compiler, foreign code, or mistakes in the safety proofs, so you can never reach 100% safety anyway? Yes, but it is important to reduce the amount of trusted code to the minimum. There are ways to use machine-checked proofs to verify that compilation and proof steps do not introduce safety bugs.

Won't you look stupid when Section III is released? Occupational hazard, but that leads me to one more point: even if and when a statically checked, plausibly safe subset is produced, it will take significant experience working with that subset to determine whether it's viable. A subset that rejects core C++ features such as references, or otherwise excludes most existing C++ code, will not be very compelling (as acknowledged in the Lifetimes document: "Our goal is that the false positive rate should be kept at under 10% on average over a large body of code").

20 comments:

  1. This matches my experience. MSVC's checker doesn't handle shared_ptr and the work being done on the clang checker seems stalled. I've had an open issue on how shared_ptr will be handled since February and there hasn't been much indication that there's a workable solution.

  2. Interesting point about aliasing being hard. There was a discussion on swift-users about aliasing issues with "inout" parameters, and someone pointed to a very long design document, that also looks like a bunch of voodoo, basically promising "less unsafe" behavior. https://lists.swift.org/pipermail/swift-users/Week-of-Mon-20160606/002181.html

  4. Note that the code example doesn't compile as-is with Clang 3.8 and libc++, as the make_unique call fails to automatically deduce the template parameter. make_unique makes it work. Good example!

  5. Oh, i see, the < and > were eaten away by HTML parsing. I meant make_unique<int> , I suppose that's what happened to your snippet.

  6. Author here (of the Lifetimes paper you're mentioning):

    It is a work in progress, and I expected to get more done over the winter, but C++17 is happening: Things got very busy over the past 6-9 months as ISO C++ is now at the "feature freeze" phase of its cycle which happens next week, when we're closing the feature set of C++17 and sending it out for its primary international comment ballot over the summer. This phase always causes extra work in the run-up months approaching feature freeze as everyone collectively works on what can make the cut for this release. Because I lead ISO C++ organizationally (as well as getting some of my own proposals into C++17), and of course Bjarne is heavily involved in its evolution technically, C++17 has had to take priority so as to get the standard out on time, and it meant we've been distracted from other important things, including this. I plan to pick Lifetimes up again this summer.

    The last comment in the post is absolutely correct that experience is needed to establish how well our approach works. That also takes time, but it's essential. So part of my work over the next year will be applying it to more and more existing real-world code; to paraphrase an NBA expression, "real code don't lie."

    BTW, I like Rust (though I don't use it); there are lots of good languages. I'm not aware that Bjarne or I ever said anything about Rust, and this work wasn't designed with Rust in mind, or to compete with it; we're just working on making C++ better, and this is part of starting to surface results from work that I've been spending time on over the past decade. I hope that improving one language isn't viewed as a threat to another; in any case that's not a motivation here. FWIW, several of the Rust designers approached us about our Lifetimes design after the talks and again this month, and I think we're going to get together over the summer to chat (after this next standards meeting is over and we all get to recover for a few weeks).

    Replies
    1. Thanks for being gracious. I've updated my post to avoid impugning your motives.

  7. P.S.: Oh, and that code example is covered by the rule you cite:

    "At a call site, by default passing a Pointer to a function requires that the argument’s lset not include anything that could be invalidated by the function."

    The call to foo(*p) is passing a reference (which is a non-owning Pointer) whose lset is {p'}, and p' can be invalidated by writing to p. Because p is global and writeable it could be modified by foo(), and so the call foo(*p) would be diagnosed as an error because foo() could invalidate the reference parameter.

    (My paper doesn't explicitly cover raw references and for simplicity mentions only raw pointers, leaving the references to be treated analogously, but you can see this covered by what is in the paper if you just change the parameter to a * instead of &, and change the call to foo(p.get()) instead of foo(*p). -- Same example, just passing a raw pointer instead of using a reference, and now it's explicitly covered by the rules in the paper.)

    Replies
    1. Thanks for that clarification. I've updated my post.

      So you're taking the approach that anything reachable from a global variable cannot be passed by reference to a function. This seems pretty severe to me since by "reachable" we must include "indirectly reachable". Talking about this in terms of references also helps show how severe this is, since it means you can't for example copy data reachable through a global variable using copy-constructors.

    2. Almost right: Anything *owned* by a *mutable* global variable could have the bug you just showed. If it's not an owner, or not mutable, the problem doesn't arise. And of course mutable global variables are problematic for many reasons, so we should all be discouraging those (e.g., if there must be a widely shared object, encapsulate it)...

      I agree the rule will likely be noisy, especially in existing code. My overall goal is to annotate as little as possible, and for the annotations you do write for legacy code to be mostly of the "trust me / suppress this warning" variety. Then you can make statements of the form "type- and memory-safe except for where rules are explicitly suppressed" which is the guarantee I'm aiming for, so that if a safety error does occur we've greatly reduced the surface area that needs to be checked or debugged by a human, and made all unsafe code greppable because it's annotated.

    3. Actually I think this means you can't call nonstatic member functions on objects owned via mutable global variables. Which means you might as well ban ownership via mutable global variables completely.

      I think I see some other severe problems too but I guess I should just wait for Lifetimes III to come out.

    4. It'll be great when the Core Lifetimes Checker gets to the point that it can assert the program is "memory-safe except for where rules are explicitly suppressed". It's hard to estimate, but presumably there will be a bunch of cases, even in new code, that'll call for some rule suppression in practice. For example, when you're holding an iterator to a linked list element and then delete a different element in the list. You may know that it's safe, but the lifetimes checker probably won't, right?

      So I propose the option of automatically injecting run-time checks for those situations where the checker can't verify safety. It would add some (perhaps unnecessary) run-time cost, but it would allow users to know that their entire program is memory safe, not just the "unsuppressed" part. Optionally.

      I mean, the number of injected run-time checks (and thus the extra run-time cost) should be modest, because the checker is supposed to be able to recognize most safe situations, right? I think there would be huge demand for such an option. Take for example the Tor Browser project. They are so desperate for a memory safe version of the Firefox browser that they have resorted to building with the llvm/gcc sanitizers enabled [1], even though those sanitizers exact a large performance penalty and are otherwise not meant to be used in production builds. There are lots of applications that prioritize memory safety over performance. Even in performance critical applications, often only a minority of the code is actually performance critical. So the rest of the code can probably handle a few extra run-time checks in exchange for increased memory safety.

      Like I mentioned in the other comment, SaferCPlusPlus[2] is a library that allows you to achieve memory safety by providing compatible safe replacements for C++'s potentially dangerous elements. It has examples of pointers and iterators made safe with run-time checks (as well as others that use compile-time restrictions).

      One nice thing about automatically injecting run-time checks to achieve memory safety is that as the checker/analyzer gets smarter, the code generated will get faster. That is, fewer run-time checks will be injected as the checker/analyzer is able to recognize more code situations that are actually safe.

      [1] https://blog.torproject.org/blog/tor-browser-55a4-hardened-released
      [2] https://github.com/duneroadrunner/SaferCPlusPlus

    5. And this is going a bit further, but a possibly more desirable feature might be the ability to actually remove existing run-time checks in the code that are determined to be unnecessary. Basically using the lifetimes analysis to act as a specialized code optimizer as well. This would allow the programmer to write intrinsically safe code, using run-time checks (or a library that uses run-time checks), while being confident that most of the unnecessary checks will be stripped out so as to not hurt performance. While ostensibly, this produces the same result as the "run-time check injection" approach, it actually has a couple of advantages. First, you can apply any number of different "run-time check discarding" optimizers to your code, knowing that each optimizer can only improve the performance (or do nothing).

      But more importantly, this approach scales up beyond memory safety to more general notions of code correctness. That is, memory safety is just the enforcement of certain invariants. But generally, programs have other (application level) invariants that could also use some enforcement. Since those other invariants may not have specialized static checkers like the memory invariants do, they may need to be enforced with run-time checks. You could imagine more generalized (but possibly user guided/assisted) "run-time check discarding" optimizers attempting to discard those "application invariant" run-time checks. And if they are only able to discard a portion of those run-time checks, well fine. Something's better than nothing.

      And of course all these optimizers would report (to the programmer) which run-time checks they were and weren't able to discard.

      So basically I'm contrasting the two approaches to code safety - "static checking with false positives" versus "not completely optimized-out run-time checks". I'm suggesting the latter may be a more practical approach for code safety in general. But someone probably needs to establish a standard framework for applying it.

  8. I observed essentially the same problem with shared_ptr and wrote about it a few months ago: https://news.ycombinator.com/item?id=10912127

    It's even more serious with shared_ptr than the global variables problem, and unless I'm missing something huge I think it kind of cripples the whole project with no way I can see to repair it.

  9. This post is old, but I just ran across it:

    So first, in the literal sense, a practical "Safe C++ Subset" does exist - the SaferCPlusPlus[1] library provides (memory) safe, compatible substitutes for C++'s unsafe elements (pointers/references, arrays, vectors). One caveat is that in order to achieve complete memory safety you need to replace all uses of unsafe elements, including implicit uses of the "this" pointer. Which might seem a little unnatural or tedious at first. And there are also restrictions on how objects are shared among threads.

    I think the main difference between Rust's and SaferCPlusPlus' approach is that in order to prevent a dynamic object from being deallocated while other references to the object still exist, Rust imposes the (compile-time) restriction that a mutable reference is only allowed to exist when no other references to the same object exist. Instead of imposing restrictions, SaferCPlusPlus, when necessary, checks dereferences to dynamic objects at run-time. These run-time checks are (not rare, but) actually rarer than you might think. In my experience, the vast majority occur in dereferences to vector<> elements.

    Now, it sounds like the goal of the C++ Core (Lifetimes) Checker is to identify all potential lifetime bugs (including some false positives). (I.e. that the Lifetimes Checker function as a Lifetimes "Verifier".) If they achieve this, then the SaferCPlusPlus library can simply be used as a convenient way to eliminate any potential concerns the Lifetimes Verifier might have with your code.

    Also, I don't know Rust very well at all, and someone tell me if I'm wrong - but with the Rust code I've encountered so far, it seems to me that it would all translate fairly directly to C++ code (library differences notwithstanding). So in that sense, doesn't Rust itself technically define a safe subset of C++? Or is there a simple example of Rust code that doesn't translate directly to C++?

    > "then you can't pass anything reachable from a global variable to any function by reference, which is crippling."

    You're defending global variables as not just desirable, but indispensable? I have to agree with Herb on this one.

    [1] https://github.com/duneroadrunner/SaferCPlusPlus

  10. > Or is there a simple example of Rust code that doesn't translate directly to C++?

    Sure:

    struct Foo {
        b: Bar,
    }
    impl Foo {
        fn b(&self) -> &Bar { &self.b }
    }

    'b' is a simple getter method that returns a reference to the inner field. This *appears* to translate directly to C++, but the Rust version gets compile-time checking that the reference to b does not outlive the 'Foo', while the C++ version does not, so the C++ version is actually very different.

    It looks like you could use SaferCPlusPlus to express this using registered pointers, but that would impose run-time overhead and increase the size of 'Foo'. Also, it appears the developer would have to decide up-front whether to allow the reference to pass across threads (and be responsible for making sure that decision is followed); if it is allowed to pass across threads, then you have to use atomic ops for pointer (de)registration, which would be considerably more overhead. In Rust it is possible to pass the reference across threads safely with no overhead. So Rust is lower overhead (and safer because threading assumptions are checked rather than assumed).

    > You're defending global variables as not just desirable, but indispensable?

    I'm not a huge fan of global variables but sometimes you need them (sometimes thread-local). Besides, global variables are a longstanding feature used by lots of C++ code and it's disingenuous to say "we'll statically verify the memory safety of your C++ code ... [fine print] as long as you restrict your code to a subset that doesn't look like C++ as you know it".

  11. > This *appears* to translate directly to C++, but the Rust version gets compile-time checking that the reference to b does not outlive the 'Foo', while the C++ version does not, so the C++ version is actually very different.

    Yeah, but if you took a safe Rust program and directly translated into C++ (assuming there's a direct translation), then you'd know the C++ program was also safe. Right? Because the direct translation would essentially "inherit" the compile-time safety checking of the Rust program.

    This in itself is not a particularly useful observation. But if true, it would imply that the Rust compile-time checker itself could be translated to apply to C++ code. I mean if the Rust checker simply recognizes safe Rust code, and Rust code can be directly translated into corresponding C++ code, then it should be straight-forward to convert the Rust checker to recognize C++ code that happens to correspond to safe Rust code. Right?

    Of course the subset of C++ code that corresponds to safe Rust code would not be typical/idiomatic C++ code. But it technically would be a "Safe C++ Subset". And it would raise the question, what are the advantages (and disadvantages) of writing programs in the Rust language as opposed to writing those same programs in the corresponding C++ subset?

    > It looks like you could use SaferCPlusPlus to express this using registered pointers

    You could, but SaferCPlusPlus also has "zero run-time cost" "scope pointers" that basically correspond to Rust's references [1]. But unlike Rust, they can only target objects with scope lifetimes, as opposed to objects with dynamic lifetimes like vector elements. Because SaferCPlusPlus doesn't have the benefit of a static checker (yet), it does have to resort to run-time checks more often than Rust (but less often than you might think), and basically relies on the compiler optimizer to act as a poor man's static analyzer and discard the run-time checks it can determine to be unnecessary. Presumably, the more Rust-like your code is, the easier for the optimizer (because of the aliasing challenges, right?).

    > it's disingenuous to say "we'll statically verify the memory safety of your C++ code ... [fine print] as long as you restrict your code to a subset that doesn't look like C++ as you know it"

    I don't know, if you can't exclude (at least mutable) globals (which have been frowned upon forever) from the subset, then what can you exclude?

    Even if the "safe C++ subset" ends up having to exclude large chunks of traditional C++, I'm guessing it will still end up being significantly less restrictive than Rust. And it will still be easier to update existing C++ code to be safe than to rewrite it in Rust.

    So are you skeptical that the Core Lifetimes Checker will ever be able to guarantee memory safety? Or just that its restrictions will end up being as draconian as Rust's anyway?

    [1] http://duneroadrunner.github.io/SaferCPlusPlus/#safercplusplus-versus-rust

    Replies
    1. > if you took a safe Rust program and directly translated into C++ (assuming there's a direct translation), then you'd know the C++ program was also safe. Right?

      Can you compile Rust code to C++ as an intermediate language? Yes.

      > it should be straight-forward to convert the Rust checker to recognize C++ code that happens to correspond to safe Rust code. Right?

      No, because Rust includes features like lifetime parameters which have no analogue in C++ and are essential for compile-time Rust safety checks.

      > they can only target objects with scope lifetimes, as opposed to objects with dynamic lifetimes like vector elements

      Which therefore cannot be used for the Rust code I mentioned, which can be used with heap objects.

      > Presumably, the more Rust-like your code is, the easier for the optimizer (because of the aliasing challenges, right?).

      True, but that does not imply you can automatically optimize from SaferCPlusPlus to what Rust could generate. It's easy to fantasize about what an optimizer could do, but there are practical limits.

      > if you can't exclude (at least mutable) globals (which have been frowned upon forever) from the subset, then what can you exclude?

      I don't know, but that's a problem for people trying to verify safety of C++ code, not me.

      > I'm guessing it will still end up being significantly less restrictive than Rust.

      Why? It lacks lifetime parameters, which means in some situations it must be more restrictive. Why would you expect a language not designed for safety, with safety tacked on as an afterthought, to be more suitable for writing safe code than a language designed for safety from the start?

      > it will still be easier to update existing C++ code to be safe than to rewrite it in Rust.

      I guess that's possible, but it's also easy to imagine that the subset is so dramatically different from idiomatic C++ that you're effectively rewriting your code, in which case maybe you're better off rewriting in Rust.

      > So are you skeptical that the Core Lifetimes Checker will ever be able to guarantee memory safety? Or just that its restrictions will end up being as draconian as Rust's anyway?

      Both. The lack of lifetime parameters suggests that it will necessarily be more restrictive than Rust. (Note that the trivial Rust example I introduced requires implicit use of lifetime parameters and cannot fit into the Core Guidelines model.) Above pcwalton and I talked about why we're skeptical that memory safety will ever be achieved.

    2. FWIW it's now a full year since the last update to the lifetime-checking document. That project doesn't look very healthy.
