Wednesday, 27 November 2019

Your Debugger Sucks

Author's note: Unfortunately, my tweets and blogs on old-hat themes like "C++ sucks, LOL" get lots of traffic, while my messages about Pernosco, which I think is much more interesting and important, have relatively little exposure. So, it's troll time.

TL;DR Debuggers suck, not using a debugger sucks, and you suck.

If you don't use an interactive debugger then you probably debug by adding logging code and rebuilding/rerunning the program. That gives you a view of what happens over time, but it's slow, can take many iterations, and you're limited to dumping some easily accessible state at certain program points. That sucks.

If you use a traditional interactive debugger, it sucks in different ways. You spend a lot of time trying to reproduce bugs locally so you can attach your debugger, even though in many cases those bugs have already been reproduced by other people or in CI test suites. You have to reproduce the problem many times as you iteratively narrow down the cause. Often the debugger interferes with the code under test so the problem doesn't show up, or not the way you expect. The debugger lets you inspect the current state of the program and stop at selected program points, but doesn't track data or control flow or remember much about what happened in the past. You're pretty much stuck debugging on your own; there's no real support for collaboration or recording what you've discovered.

If you use a cutting-edge record and replay debugger like rr, it sucks less. You only have to reproduce the bug once and the recording process is probably less invasive. You can reverse-execute for a much more natural debugging experience. However, it's still hard to collaborate, interactive response can be slow, and the feature set is mostly limited to the interface of a traditional debugger even though there's much more information available under the hood. Frankly, it still sucks.

Software developers and companies everywhere should be very sad about all this. If there's a better way to debug, then we're leaving lots of productivity — therefore money — on the table, not to mention making developers miserable, because (as I mentioned) debugging sucks.

If debugging is so important, why haven't people built better tools? I have a few theories, but I think the biggest reason is that developers suck. In particular, developer culture is that developers don't pay for tools, especially not debuggers. They have always been free, and therefore no-one wants to pay for them, even if they would credibly save far more money than they cost. I have lost count of the number of people who have told me "you'll never make money selling a debugger", and I'm not sure they're wrong. Therefore, no-one wants to invest in them, and indeed, historically, investment in debugging tools has been extremely low. As far as I know, the only way to fix this situation is by building tools so much better than the free tools that the absurdity of refusing to pay for them is overwhelming, and expectations shift.

Another important factor is that the stagnation of debugging technology has stunted the imagination of developers and tool builders. Most people have still never even heard of anything better than the traditional stop-and-inspect debugger, so of course they're not interested in new debugging technology when they expect it to be no better than that. Again, the only cure I see here is to push harder: promulgation of better tools can raise expectations.

That's a cathartic rant, but of course my ultimate point is that we are doing something positive! Pernosco tackles all those debugger pitfalls I mentioned; it is our attempt to build that absurdly better tool that changes culture and expectations. I want everyone to know about Pernosco, not just to attract the customers we need for sustainable debugging investment, but so that developers everywhere wake up to the awful state of debugging and rise up to demand an end to it.

Friday, 15 November 2019

Supercharging Gdb With Pernosco

At Pernosco we don't believe that gdb (or any other traditional debugger) has the ideal debugging interface, but users familiar with gdb can get started with Pernosco more easily if gdb is available. Also, gdb has many useful features and it will be a long time before Pernosco matches every single one of those features. Therefore we've integrated gdb into Pernosco. Moreover, leveraging Pernosco's infrastructure makes our gdb experience clearly the best gdb experience ever — in my opinion!

The best feature of Pernosco-gdb is speed. We built a gdbserver that extracts information from the Pernosco omniscient database, so forward or reverse execution in gdb is simply a matter of updating an internal timestamp and getting results for that time whenever gdb asks for information. Our users routinely skip forward or backward through minutes of application execution with a second or two delay in the UI. Achieving this is harder than you might think; gdb uses internal breakpoints to observe, e.g., loading and unloading of shared libraries, so we need tricks to avoid unnecessary internal stops.
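
To make the "execution is just updating a timestamp" idea concrete, here is a minimal toy sketch. It is not Pernosco's actual gdbserver or database; `RecordedTrace` and `Session` are hypothetical names, and the model assumes every write was recorded with its time so that a read at any moment becomes a lookup.

```python
import bisect


class RecordedTrace:
    """Toy stand-in for an omniscient database: every write is kept with its time."""

    def __init__(self):
        self._writes = {}  # variable name -> (sorted write times, values written)

    def record_write(self, time, name, value):
        # Assumes writes are recorded in increasing time order.
        times, values = self._writes.setdefault(name, ([], []))
        times.append(time)
        values.append(value)

    def value_at(self, name, time):
        """Return the most recent value written at or before `time`, or None."""
        times, values = self._writes.get(name, ([], []))
        index = bisect.bisect_right(times, time)
        return values[index - 1] if index else None


class Session:
    """A debugger session whose only execution state is a timestamp."""

    def __init__(self, trace):
        self.trace = trace
        self.now = 0

    def seek(self, time):
        # Forward and reverse "execution" are the same operation.
        self.now = time

    def read(self, name):
        return self.trace.value_at(name, self.now)


trace = RecordedTrace()
trace.record_write(1, "x", 10)
trace.record_write(5, "x", 20)

session = Session(trace)
session.seek(7)
print(session.read("x"))  # 20
session.seek(3)           # "reverse-execute" by moving the timestamp back
print(session.read("x"))  # 10
```

In this model, jumping across minutes of execution costs the same as a single step, because nothing is reexecuted.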

I'm also proud of the synchronization we achieved between gdb and the rest of Pernosco. Our gdb sessions track changes to the current moment and stack frame made by other parts of Pernosco (including other gdb sessions!), and likewise gdb "execution" and changes to the current stack frame are reflected in the rest of Pernosco. (Pernosco stack walking is considerably more robust than gdb's, so synchronization can fail when gdb disagrees about what stack frames exist.) Starting a gdb session attaches it to the current process, so you can have multiple gdb sessions open, for the same process and/or different processes, all synchronized as much as possible.

Pernosco enables some nice improvements to watchpoints and breakpoints, as described here.

Deploying debugging as a cloud service enables some other improvements in the gdb experience. Gdb sessions run on our servers and users can run arbitrary code in them, so by necessity we put each session in a very tight sandbox. This means we don't have to worry about potentially malicious gdbinit scripts, DWARF debuginfo, etc; we can just configure gdb to trust everything.

Another benefit of the Pernosco model is that we take responsibility for configuring gdb optimally and keeping it up to date. For example we ensure .gdb_index files are always built.

We achieve all this with minimal changes to upstream gdb. Currently our only change is to disable gdb's JIT debugging support, because that creates a lot of usually-unnecessary internal breakpoint stops. We publish the sources for our build of gdb (though technically we don't have to, as gdb is not AGPL).

Tuesday, 12 November 2019

The Power Of Collaborative Debugging

An under-appreciated problem with existing debuggers is that they lack first-class support for collaboration. In large projects a debugging session will often cross module boundaries into code the developer doesn't understand, making collaboration extremely valuable, but developers can only collaborate using generic methods such as screen-sharing, physical co-location, copy-and-paste of terminal transcripts, etc. We recognized the importance of collaboration, and Pernosco's cloud architecture makes such features relatively easy to implement, so we built some in.

The most important feature is just that any user can start a debugging session for any recorded execution, given the correct URL (modulo authorization). We increase the power of URL sharing by encoding in the URL the current moment and stack frame, so you can copy your current URL, paste it into chat or email, and whoever clicks on it will jump directly to the moment you were looking at.
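
As a rough illustration of carrying debugger state in a URL, here is a small sketch. The host name and the `moment` and `frame` parameter names are made up for the example; the real Pernosco URL format is not described here and may look quite different.

```python
from urllib.parse import parse_qs, urlencode, urlsplit, urlunsplit


def make_share_url(session_url, moment, frame):
    """Attach the current moment and stack frame to a session URL."""
    scheme, netloc, path, _query, _fragment = urlsplit(session_url)
    query = urlencode({"moment": moment, "frame": frame})
    return urlunsplit((scheme, netloc, path, query, ""))


def parse_share_url(url):
    """Recover (moment, frame) so the client can jump straight there."""
    fields = parse_qs(urlsplit(url).query)
    return int(fields["moment"][0]), int(fields["frame"][0])


# Hypothetical session URL, purely for illustration.
url = make_share_url("https://debugger.example/recording/abc", 123456, 2)
print(url)                   # https://debugger.example/recording/abc?moment=123456&frame=2
print(parse_share_url(url))  # (123456, 2)
```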

The Pernosco notebook takes collaboration to another level. Whenever you take a navigation action in the Pernosco UI, we tentatively record the destination moment in the notebook with a snippet describing how you got there, which you can persist just by clicking on it. You can annotate these snippets with arbitrary text, and clicking on a snippet will return to that moment. Many developers already record their progress by taking notes during debugging sessions (I remember using Borland Sidekick for this when I was a kid!); the Pernosco notebook makes this much more convenient. Our users find that the notebook is great for mitigating the "help, I'm lost in a vast information space" problem that affects Pernosco users as well as users of traditional debuggers (sometimes more so in Pernosco, because it enables higher velocity through that space). Of course the notebook persists indefinitely and is shared between all users and sessions for the same recording, so you have a permanent record of what you discovered that your colleagues can also explore and add to.

Our users are discovering that these features unlock new workflows. A developer can explore a bug, recording what they've learned in the code they understand, then upon reaching unknown code forward the debugging session to a more knowledgeable developer for further investigation — or perhaps just to quickly confirm a hypothesis. We find that, perhaps unexpectedly, Pernosco can be most effective at saving the time of your most senior developers because it's so much easier to leverage the debugging work already done by other developers.

Collaboration via Pernosco is beneficial not just to developers but to anyone who can reproduce bugs and upload them to Pernosco. Our users are discovering that if you want a developer to look into a bug you care about, submitting it to Pernosco yourself and sending them a link makes it much more likely they will oblige — if it only takes a minute or two to start poking around, why not?

Extending this idea, Pernosco makes it convenient to separate reproducing a bug from debugging a bug. It's no problem to have QA staff reproduce bugs and submit them to Pernosco, then hand Pernosco URLs to developers for diagnosis. Developers can stop wasting their time trying to replicate the "steps to reproduce" (and often failing!) and staff can focus on what they're good at. I think this could be transformative for many organizations.

Thursday, 7 November 2019

Omniscient Printf Debugging In Pernosco

Pernosco supports querying for the execution of specific functions and the execution of specific source lines. These resemble setting breakpoints on functions or source lines in a traditional debugger. Traditional debuggers usually let you filter breakpoints using condition expressions, and it's natural and useful to extend that to Pernosco's execution views, so we did. In traditional debuggers you can get the debugger to print the values of specified expressions when a breakpoint is hit, and that would also be useful in Pernosco, so we added that too.

These features strongly benefit from Pernosco's omniscient database, because we can evaluate expressions at different points in time — potentially in parallel — by consulting the database instead of having to reexecute the program.

These features are relatively new and we don't have much user experience with them yet, but I'm excited about them because while they're simple and easily understood, they open the door to "query-based debugging" strategies and endless possibilities for enhancing the debugger with richer query features.

Another reason I'm excited is that together they let you apply "printf-debugging" strategies in Pernosco: click on a source line, and add some print-expressions and optionally a condition-expression to the "line executions" view. I believe that in most cases where people are used to using printf-debugging, Pernosco enables much more direct approaches and ultimately people should let go of those old habits. However, in some situations some quick logging may still be the fastest way to figure out what's going on, and people will take time to learn new strategies, so Pernosco is great for printf-debugging: no rebuilding, and not even any reexecution, just instant(ish) results.
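
Here is a toy sketch of what printf-debugging against recorded data could look like. It assumes, purely for illustration, that each execution of a source line was stored as a `(moment, line, variables)` tuple; `line_executions` is a hypothetical helper, not a Pernosco API, and Python's `eval` stands in for a real expression evaluator.

```python
def line_executions(executions, line, prints, condition=None):
    """Return (moment, printed values) for each recorded execution of `line`.

    `prints` and `condition` are expression strings evaluated against the
    variables recorded at that moment, so nothing is rebuilt or rerun.
    """
    no_builtins = {"__builtins__": {}}  # keep the toy evaluator minimal
    rows = []
    for moment, hit_line, variables in executions:
        if hit_line != line:
            continue
        if condition is not None and not eval(condition, no_builtins, variables):
            continue
        values = [eval(expr, no_builtins, variables) for expr in prints]
        rows.append((moment, values))
    return rows


# Hypothetical recording: line 42 ran three times, line 50 once.
recorded = [
    (10, 42, {"i": 0, "total": 0}),
    (20, 42, {"i": 1, "total": 5}),
    (25, 50, {"i": 1, "total": 5}),
    (30, 42, {"i": 2, "total": 12}),
]

print(line_executions(recorded, 42, ["i", "total"], condition="total > 0"))
# [(20, [1, 5]), (30, [2, 12])]
```

Changing the condition or the printed expressions is just another query over the same data, which is where the instant(ish) turnaround comes from. Each row is also independent of the others, so the evaluations could run in parallel.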

Monday, 4 November 2019

The BBC's "War Of The Worlds"

Very light spoilers ahead.

I had hopes for this show. I liked the book (OK, when I read it >30 years ago). I love sci-fi. I like historical fiction. I was hoping for Downton Abbey meets Independence Day. Unfortunately I think this show was, as we say in NZ, "a bit average".

I really liked the characters reacting semi-realistically to terror and horror. It always bothers me that in fiction normal people plunge into traumatic circumstances, scream a bit, then get over it in time for the next scene. This War Of The Worlds takes time to show characters freaking out, resting and consoling one another, but not quite getting it all back together. Overall I thought the acting was well done.

I think the pacing and editing were poor. Some parts were slow, but other parts (especially in the first half) lurched from scene to scene so quickly it felt like important scenes were cut. It was hard to work out what was going on geographically.

Some aspects seemed pointlessly complicated or confusing, e.g. the spinning ball weapon.

Call me old-fashioned, but when a man abandons his wife I am not, by default, sympathetic to him, so I spent most of the show thinking our male protagonist is kind of a bad guy, when I'm clearly supposed to be siding with him against closed-minded society. I even felt a bit vindicated when towards the end his lover Amy wonders if they did the right thing. At least for a change the Christian-esque character was only a fool, not a psychopath, so thank God for small mercies.

I guess I'm still waiting for the perfect period War Of The Worlds adaptation.

Saturday, 2 November 2019

Explaining Dataflow In Pernosco

Tracing dataflow backwards in time is an rr superpower. rr users find it incredibly useful to set hardware data watchpoints on memory locations of interest and reverse-continue to find where those values were changed. Pernosco takes this superpower up a level with its dataflow pane (see the demo there).

From the user's point of view, it's pretty simple: you click on a value and Pernosco shows you where it came from. However, there is more going on here than meets the eye. Often you find that the last modification to memory is not what you're looking for; that the value was computed somewhere and then copied, perhaps many times, until it reached the memory location you're inspecting. This is especially true in move-heavy Rust and C++ code. Pernosco detects copying through memory and registers and follows dataflow backwards through them, producing an explanation comprising multiple steps, any of which the user can inspect just by clicking on them. Thanks to omniscience, this is all very fast. (Jeff Muizelaar implemented something similar with scripting gdb and rr, which partially inspired us. Our infrastructure is a lot more powerful than what he had to work with.)

Pernosco explanations terminate when you reach a point where a value was derived from something other than a CPU copy: e.g. an immediate value, I/O, or arithmetic. There's no particular reason why we need to stop there! For example, there is obviously scope to extend these explanations through arithmetic, to explore more general dataflow DAGs, though intelligible visualization would become more difficult.
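
A minimal sketch of the backward-following idea, under heavy simplifying assumptions: every write is modelled as a `Write` record that either names the location it copied from or has no source (an immediate, I/O, or arithmetic). The `Write` and `explain` names and the location strings are hypothetical; the real implementation works on recorded machine state, not a list like this.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Write:
    time: int
    dest: str               # location written (register or memory, as a label)
    source: Optional[str]   # location copied from, or None if not a plain copy
    note: str


def explain(writes, location, time):
    """Follow copies backwards from `location` at `time`; return the steps found."""
    steps = []
    while location is not None:
        prior = [w for w in writes if w.dest == location and w.time <= time]
        if not prior:
            break  # no recorded write to this location before this moment
        last = max(prior, key=lambda w: w.time)
        steps.append(last)
        # Continue from the copy's source, just before the copy happened.
        location, time = last.source, last.time - 1
    return steps


# Hypothetical recording: a value is produced once, then copied three times.
writes = [
    Write(1, "rax", None, "rax = 7 (immediate)"),
    Write(2, "buf", "rax", "store rax to buf"),
    Write(3, "rbx", "buf", "load buf into rbx"),
    Write(4, "obj.field", "rbx", "store rbx to obj.field"),
]

for step in explain(writes, "obj.field", time=10):
    print(step.time, step.note)
# 4 store rbx to obj.field
# 3 load buf into rbx
# 2 store rax to buf
# 1 rax = 7 (immediate)
```

The walk stops at the first write that is not a copy, which mirrors where the explanations described above terminate.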

Pernosco's dataflow explanations differ from what you get with gdb and rr in an interesting way: gdb deliberately ignores idempotent writes, i.e. writes to memory that do not actually change the value. We thought hard about this and decided that Pernosco should not ignore them. Consider a trivial example:

x = 0;
y = 0;
x = y;

If you set a watchpoint on x at the end and reverse-continue, gdb+rr will break on x = 0. We think this is generally not what you want, so a Pernosco explanation for x at the end will show x = y and y = 0. I don't know why gdb behaves this way, but I suspect it's because gdb watchpoints are sometimes implemented by evaluating the watched expression over time and noting when the value changes; since that can't detect idempotent writes, perhaps hardware watchpoints were made to ignore idempotent writes for consistency.
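
The difference between the two behaviours can be shown with a toy model of the three statements. This is only a sketch of the two policies, not how gdb or Pernosco are implemented; `last_write`, `last_value_change`, and the `UNKNOWN` sentinel are hypothetical, and the model assumes x held some other value before the first statement ran.

```python
UNKNOWN = object()  # whatever the location held before the program wrote to it


def last_write(writes, name):
    """Report the most recent write to `name`, even if the value did not change."""
    for statement, target, _value in reversed(writes):
        if target == name:
            return statement
    return None


def last_value_change(writes, name):
    """Report the most recent write that changed the value of `name`."""
    current = UNKNOWN
    found = None
    for statement, target, value in writes:
        if target != name:
            continue
        if value != current:
            found = statement  # idempotent writes never reach this line
        current = value
    return found


# (statement, location written, value written)
writes = [
    ("x = 0", "x", 0),
    ("y = 0", "y", 0),
    ("x = y", "x", 0),
]

print(last_value_change(writes, "x"))  # x = 0
print(last_write(writes, "x"))         # x = y
```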

An interesting observation about our dataflow explanations is that although the semantics are actually quite subtle, even potentially confusing once you dig into them (there are additional subtleties I haven't gone into here!), users don't seem to complain about that. I'm optimistic that the abstraction we provide matches user intuitions closely enough that they skate over the complexity — which I think would be a pretty good result.

(One of my favourite moments with rr was when a Mozilla developer called a Firefox function during an rr replay and it printed what they expected it to print. They were about to move on, but then did a double-take, saying "Errrr ... what just happened?" Features that users take for granted but are actually mind-boggling are the best features.)

Thursday, 31 October 2019

Improving Debugging Workflow With Pernosco

One of the key challenges for debuggers is that the traditional interactive debugging workflow — running your program interactively and starting it under the debugger or connecting to it once it's running, and pausing it to inspect its state — doesn't work well for a lot of people anymore. That workflow isn't convenient when the application normally doesn't run locally — e.g. because testing more often happens in CI, or on a phone, or the code you care about runs as part of a big distributed system. It also falls down when pausing the debuggee breaks the system. As software has increasingly moved to the cloud and mobile platforms, this has become a bigger deal and it's no wonder use of interactive debugging has waned. "Remote debugging" helps a bit, but it tends to be painful and although it can bridge gaps between machines, it doesn't bridge gaps in time.

We've published a couple of documents on how Pernosco tackles this, in particular how Pernosco integrates with CI and how Pernosco supports uploads from developers and QA (manual and automatic). A big part of the solution is just record-and-replay (with rr in our case). Being able to record execution on one machine, without stopping the application, and replay execution on another machine at another time, enables a lot of new workflows that mitigate the above problems. However, Pernosco goes further in some important ways.

One issue is that just being able to replay execution isn't enough; we also want a good debugging experience during the replay. This means we need to capture compiled debuginfo, source code and other relevant information that aren't strictly necessary for the replay. In many cases that data isn't even available at the recording site, but it might be available somewhere (e.g. a symbol server or build artifact archive) for us to get later. So our debugging infrastructure has to support collecting information at the recording site, harvesting it from various sources later, and actually using it during the debugging session. This is not at all trivial, and Pernosco has a lot of code to handle this sort of thing, some of which needs to be customized for specific customers. For example, Pernosco identifies Firefox binaries built by Mozilla CI and knows how to locate the relevant symbols and sources from Mozilla's archives. For developer- and QA-submitted recordings, Pernosco examines the trace to locate relevant debuginfo and source code and uploads them. For source code hosted in well-known public repositories (e.g. mozilla-central or GitHub), we minimize overhead by uploading only local changes and having our debugger client fetch the public changes from the public repository at debugging time.

Note that rr on its own provides trace portability, but debugging ported traces is tricky. With rr pack and rr record --disable-cpuid-features, it is generally possible to create rr recordings that can be replayed on other machines. However, when you replay with gdb, locating symbols and source files is problematic when the replay machine's filesystem does not exactly match the recording machine's. For example, when gdb sees the shared-library loader load /home/roc/libfoo.so, that file might not be present at that location on the replay machine (or worse, it might be a different version), so gdb won't load the right symbols. You can try to work around this by populating a "sysroot" directory with the relevant files, copied and renamed from the trace, but figuring out which trace files need to go where is hard (because e.g. it depends on the symlinks present on the recording machine, which rr doesn't capture in the recording, and it's not even clear how you'd do that).

Another important feature for enabling new workflows is just having a cloud-based Web client. We want to minimize the barrier to getting into a debugging session, and it's hard to think of an easier way than publishing a link which the user clicks on to enter a specific debugging session — no installation, no configuration. Those links can be published wherever you already notify users about test failures.

One thing I'm really excited about is that Pernosco enables splitting failure reproduction from debugging. Traditionally, developers had to reproduce a bug locally when they wanted to use an interactive debugger to debug it. Pernosco lets you delegate the reproduction step to other people (or automation). For example, when QA staff find a bug, instead of writing down the steps to reproduce to send to a developer (and inevitably having a back-and-forth discussion about exactly what's required to reproduce the bug, etc), QA can upload a recording to Pernosco and pass the link to the developer. This saves time and money, especially when QA staff are cheaper and/or more scalable than your developer team.