Thursday 4 March 2021
On-Premises Pernosco Now Available; Reflecting On Application Confinement
In November we announced Pernosco availability for individual developers via our debugging-as-a-service platform. That product requires developers to share binary code, debug symbols and test data with us (but not necessarily source code), and we recognize that many potential customers are not comfortable with that. Therefore we are making Pernosco available to run on-premises. Contact us for a free trial! On-prem pricing is negotiable, but we prefer to charge a fixed amount per month for unlimited usage by a given team. Keep in mind Pernosco's current limitations: it handles applications that work with rr (Linux, x86-64) and supports C/C++/Rust/Ada/V8.
An on-premises customer says:
One of the key takeaways for me in our evaluation is that users keep coming back to pernosco without me pushing for it, and really like it — I have rarely seen such a high adoption rate in a new tool.
To deploy Pernosco on-premises we package it into two containers: the database builder and the application server. You collect an rr trace, run the database builder, then run the application server to export a Web interface to the database. Both steps, especially database building, require a reasonably powerful machine; our hosted service uses c5d.9xlarge instances for database building and a smaller shared i3.4xlarge instance for the application servers. If you want to run your own private shared service, you are responsible for authentication, authorization, public routing to the Web interface, database storage management, and so on.
To help our customers feel comfortable using Pernosco on-premises, all our closed-source code is bundled into those two containers, and we provide an open-source Python wrapper which runs the containers with sandboxing to confine them. Our goal is that you should not have to trust anything other than that easily-audited wrapper script (and rr, of course, which is open source but not so easily audited, though many people have looked into it). Our database builder container simply reads one subtree of the filesystem and drops a database into it; the application server reads that subtree and exposes a Web server on a private subnet. Both containers should be incapable of leaking data to the outside world, even if their contents were malicious.
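As a rough illustration, here's a minimal sketch of how the two containers can be run so that neither has a route to the outside world. This is not our actual wrapper script, and the image names, paths and network name are hypothetical; the real wrapper handles more details (user mapping, options, cleanup, etc.):

    # Minimal sketch of running the two containers with no route to the
    # outside world. Image names, paths and the network name are hypothetical.
    import subprocess

    TRACE_DIR = "/traces/my-app-trace"  # rr trace lives here; the database is written here too

    # Step 1: build the database. The builder only needs the trace subtree,
    # so it gets no network at all (no interfaces, no embedded DNS).
    subprocess.run([
        "docker", "run", "--rm",
        "--network", "none",
        "--volume", f"{TRACE_DIR}:/trace",
        "pernosco/db-builder", "/trace",
    ], check=True)

    # Step 2: serve the database. Attach the application server to an
    # internal-only Docker network so the Web interface is reachable on the
    # private subnet but the container cannot reach the public Internet.
    subprocess.run(["docker", "network", "create", "--internal", "pernosco-net"],
                   check=False)  # don't fail if the network already exists
    subprocess.run([
        "docker", "run", "--rm", "--detach",
        "--network", "pernosco-net",
        "--volume", f"{TRACE_DIR}:/trace:ro",
        "pernosco/app-server", "/trace",
    ], check=True)

The database builder can go entirely without a network; the application server has to be reachable somehow, and that is where the issues below come in.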
This seems like a natural approach to deploying closed-source software: no-one wants to be the next SolarWinds. In fact, even if you receive the entire product's source code from your vendor, you still want confinement, because you're not really going to audit it effectively. Therefore I'm surprised to find that this use-case doesn't seem to be well-supported by infrastructure!
For example, Docker provides containers with an embedded DNS server that cannot be disabled. We don't need it, and in theory crafted DNS queries could leak information to the outside world, so I'd like to turn it off. But apparently our confinement goal is not considered a valid use-case by Docker. (I guess you can work around this by limiting the DNS service on the container host somehow, but that sucks more.)
Another interesting issue is preventing leaks through our Web application. For example, when a user loads our application, it could send messages to public Internet hosts that leak information. We try to prevent this by having our application send CSP headers that deny all non-same-origin access. The problem is, if our application were malicious it could simply change the CSP it sends. Preventing that would seem to require a proxy outside the container that checks or modifies the CSP headers we send.
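To make that concrete, here is a minimal sketch of such a proxy. This is not code we ship, and the backend address, port and policy string are assumptions; it just shows that the policy can be enforced outside the container's trust boundary by overwriting whatever CSP the container sends:

    # Minimal sketch of a host-side proxy that forces a restrictive
    # Content-Security-Policy on every response, so a malicious application
    # server inside the container cannot weaken it.
    from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
    from urllib.request import Request, urlopen

    BACKEND = "http://127.0.0.1:8080"   # application server on the private subnet (assumption)
    FORCED_CSP = "default-src 'self'"   # deny all non-same-origin access

    class CspEnforcingProxy(BaseHTTPRequestHandler):
        def do_GET(self):
            # Forward the request to the confined application server.
            with urlopen(Request(BACKEND + self.path)) as upstream:
                body = upstream.read()
                status = upstream.status
                headers = upstream.getheaders()
            self.send_response(status)
            for name, value in headers:
                # Drop the container's CSP (plus framing headers we recompute);
                # our own CSP and Content-Length are set below.
                if name.lower() not in ("content-security-policy",
                                        "transfer-encoding", "content-length"):
                    self.send_header(name, value)
            self.send_header("Content-Security-Policy", FORCED_CSP)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    if __name__ == "__main__":
        ThreadingHTTPServer(("0.0.0.0", 8000), CspEnforcingProxy).serve_forever()

A real deployment would also have to handle other request methods, error responses and streaming, but the point stands: the header the browser trusts is set outside the container.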
I'm surprised these problems haven't already been solved, or at least that the solutions aren't widely known and deployed.