Tuesday, 12 November 2019

The Power Of Collaborative Debugging

An under-appreciated problem with existing debuggers is that they lack first-class support for collaboration. In large projects a debugging session will often cross module boundaries into code the developer doesn't understand, making collaboration extremely valuable, but developers can only collaborate using generic methods such as screen-sharing, physical co-location, copy-and-paste of terminal transcripts, etc. We recognized the importance of collaboration, and Pernosco's cloud architecture makes such features relatively easy to implement, so we built some in.

The most important feature is just that any user can start a debugging session for any recorded execution, given the correct URL (modulo authorization). We increase the power of URL sharing by encoding in the URL the current moment and stack frame, so you can copy your current URL, paste it into chat or email, and whoever clicks on it will jump directly to the moment you were looking at.

The Pernosco notebook takes collaboration to another level. Whenever you take a navigation action in the Pernosco UI, we tentatively record the destination moment in the notebook with a snippet describing how you got there, which you can persist just by clicking on it. You can annotate these snippets with arbitrary text, and clicking on a snippet will return to that moment. Many developers already record their progress by taking notes during debugging sessions (I remember using Borland Sidekick for this when I was a kid!); the Pernosco notebook makes this much more convenient. Our users find that the notebook is great for mitigating the "help, I'm lost in a vast information space" problem that affects Pernosco users as well as users of traditional debuggers (sometimes more so in Pernosco, because it enables higher velocity through that space). Of course the notebook persists indefinitely and is shared between all users and sessions for the same recording, so you have a permanent record of what you discovered that your colleagues can also explore and add to.

Our users are discovering that these features unlock new workflows. A developer can explore a bug, recording what they've learned in the code they understand, then upon reaching unknown code forward the debugging session to a more knowledgeable developer for further investigation — or perhaps just to quickly confirm a hypothesis. We find that, perhaps unexpectedly, Pernosco can be most effective at saving the time of your most senior developers because it's so much easier to leverage the debugging work already done by other developers.

Collaboration via Pernosco is beneficial not just to developers but to anyone who can reproduce bugs and upload them to Pernosco. Our users are discovering that if you want a developer to look into a bug you care about, submitting it to Pernosco yourself and sending them a link makes it much more likely they will oblige — if it only takes a minute or two to start poking around, why not?

Extending this idea, Pernosco makes it convenient to separate reproducing a bug from debugging a bug. It's no problem to have QA staff reproduce bugs and submit them to Pernosco, then hand Pernosco URLs to developers for diagnosis. Developers can stop wasting their time trying to replicate the "steps to reproduce" (and often failing!) and staff can focus on what they're good at. I think this could be transformative for many organizations.

Thursday, 7 November 2019

Omniscient Printf Debugging In Pernosco

Pernosco supports querying for the execution of specific functions and the execution of specific source lines. These resemble setting breakpoints on functions or source lines in a traditional debugger. Traditional debuggers usually let you filter breakpoints using condition expressions, and it's natural and useful to extend that to Pernosco's execution views, so we did. In traditional debuggers you can get the debugger to print the values of specified expressions when a breakpoint is hit, and that would also be useful in Pernosco, so we added that too.

These features strongly benefit from Pernosco's omniscient database, because we can evaluate expressions at different points in time — potentially in parallel — by consulting the database instead of having to reexecute the program.

These features are relatively new and we don't have much user experience with them yet, but I'm excited about them because while they're simple and easily understood, they open the door to "query-based debugging" strategies and endless possibilities for enhancing the debugger with richer query features.

Another reason I'm excited is that together they let you apply "printf-debugging" strategies in Pernosco: click on a source line, and add some print-expressions and optionally a condition-expression to the "line executions" view. I believe that in most cases where people are used to using printf-debugging, Pernosco enables much more direct approaches and ultimately people should let go of those old habits. However, in some situations some quick logging may still be the fastest way to figure out what's going on, and people will take time to learn new strategies, so Pernosco is great for printf-debugging: no rebuilding, and not even any reexecution, just instant(ish) results.
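To make the idea concrete, here is a minimal sketch of printf-style querying over a recorded execution. It is not Pernosco's implementation or API: the `LineExecution` record, the `line_executions` function and the toy `recording` are all hypothetical names invented for illustration, and Python's `eval` stands in for a real expression evaluator. The point is only that the condition and the print-expressions are evaluated against stored state, so nothing is rebuilt or reexecuted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LineExecution:
    """One recorded execution of a source line (hypothetical record shape)."""
    moment: int   # position in the recorded timeline
    line: int     # source line that executed
    state: dict   # variable values visible at that moment


def line_executions(recording, line, prints, condition=None):
    """Return (moment, printed values) for each recorded execution of `line`.

    `prints` and `condition` are expressions evaluated against the recorded
    state; the program itself is never run again.
    """
    results = []
    for execution in recording:
        if execution.line != line:
            continue
        # Skip executions that fail the optional condition-expression.
        if condition is not None and not eval(condition, {}, execution.state):
            continue
        values = [eval(expr, {}, execution.state) for expr in prints]
        results.append((execution.moment, values))
    return results


# A toy "recording": three executions of line 42 and one of line 50.
recording = [
    LineExecution(10, 42, {"i": 0, "total": 0}),
    LineExecution(20, 42, {"i": 1, "total": 5}),
    LineExecution(30, 50, {"i": 1, "total": 5}),
    LineExecution(40, 42, {"i": 2, "total": 12}),
]

hits = line_executions(recording, 42, prints=["i", "total * 2"], condition="total > 0")
for moment, values in hits:
    print(moment, values)
```

Because each execution is evaluated independently of the others, a real system could evaluate them in parallel, which is the advantage the omniscient database gives over stopping at breakpoints one at a time.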

Monday, 4 November 2019

The BBC's "War Of The Worlds"

Very light spoilers ahead.

I had hopes for this show. I liked the book (OK, when I read it >30 years ago). I love sci-fi. I like historical fiction. I was hoping for Downton Abbey meets Independence Day. Unfortunately I think this show was, as we say in NZ, "a bit average".

I really liked the characters reacting semi-realistically to terror and horror. It always bothers me that in fiction normal people plunge into traumatic circumstances, scream a bit, then get over it in time for the next scene. This War Of The Worlds takes time to show characters freaking out, resting and consoling one another, but not quite getting it all back together. Overall I thought the acting was well done.

I think the pacing and editing were poor. Some parts were slow, but other parts (especially in the first half) lurched from scene to scene so quickly it felt like important scenes were cut. It was hard to work out what was going on geographically.

Some aspects seemed pointlessly complicated or confusing, e.g. the spinning ball weapon.

Call me old-fashioned, but when a man abandons his wife I am not, by default, sympathetic to him, so I spent most of the show thinking our male protagonist is kind of a bad guy, when I'm clearly supposed to be siding with him against closed-minded society. I even felt a bit vindicated when towards the end his lover Amy wonders if they did the right thing. At least for a change the Christian-esque character was only a fool, not a psychopath, so thank God for small mercies.

I guess I'm still waiting for the perfect period War Of The Worlds adaptation.

Saturday, 2 November 2019

Explaining Dataflow In Pernosco

Tracing dataflow backwards in time is an rr superpower. rr users find it incredibly useful to set hardware data watchpoints on memory locations of interest and reverse-continue to find where those values were changed. Pernosco takes this superpower up a level with its dataflow pane (see the demo there).

From the user's point of view, it's pretty simple: you click on a value and Pernosco shows you where it came from. However, there is more going on here than meets the eye. Often you find that the last modification to memory is not what you're looking for; that the value was computed somewhere and then copied, perhaps many times, until it reached the memory location you're inspecting. This is especially true in move-heavy Rust and C++ code. Pernosco detects copying through memory and registers and follows dataflow backwards through them, producing an explanation comprising multiple steps, any of which the user can inspect just by clicking on them. Thanks to omniscience, this is all very fast. (Jeff Muizelaar implemented something similar by scripting gdb and rr, which partially inspired us. Our infrastructure is a lot more powerful than what he had to work with.)

Pernosco explanations terminate when you reach a point where a value was derived from something other than a CPU copy: e.g. an immediate value, I/O, or arithmetic. There's no particular reason why we need to stop there! For example, there is obviously scope to extend these explanations through arithmetic, to explore more general dataflow DAGs, though intelligible visualization would become more difficult.
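The backwards walk described above can be sketched roughly as follows. This is an illustration under stated assumptions, not Pernosco's actual data model: the `Write` record, the `last_write` and `explain` helpers, and the location names in the toy log are all hypothetical. Each recorded write either copies from another location or originates a value (immediate, I/O, arithmetic), and the explanation follows copies back in time until it reaches an origin.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Write:
    """One recorded write to a register or memory location (hypothetical)."""
    moment: int
    dest: str
    source: Optional[str]  # location copied from, or None if not a plain copy
    note: str              # human-readable description of the step


def last_write(log, location, before):
    """Most recent write to `location` strictly before moment `before`."""
    candidates = [w for w in log if w.dest == location and w.moment < before]
    return max(candidates, key=lambda w: w.moment, default=None)


def explain(log, location, at):
    """Follow copies backwards from `location` at moment `at`."""
    steps = []
    moment = at
    while True:
        write = last_write(log, location, moment)
        if write is None:
            break  # no recorded write: nothing more to explain
        steps.append(write)
        if write.source is None:
            break  # value originated here, so the explanation terminates
        # Continue from the copy's source, as it was when the copy happened.
        location, moment = write.source, write.moment
    return steps


log = [
    Write(1, "rax", None, "rax = a + b (arithmetic)"),
    Write(2, "stack[8]", "rax", "spill rax to the stack"),
    Write(3, "rdi", "stack[8]", "reload into rdi"),
    Write(4, "obj.field", "rdi", "store rdi into obj.field"),
    Write(5, "rax", None, "rax = 7 (immediate, unrelated)"),
]

steps = explain(log, "obj.field", at=6)
for step in steps:
    print(step.moment, step.note)
```

Note that the search for each source is bounded by the moment of the copy, so the later, unrelated write to `rax` at moment 5 is correctly ignored when explaining `obj.field`.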

Pernosco's dataflow explanations differ from what you get with gdb and rr in an interesting way: gdb deliberately ignores idempotent writes, i.e. writes to memory that do not actually change the value. We thought hard about this and decided that Pernosco should not ignore them. Consider a trivial example:

x = 0;
y = 0;
x = y;

If you set a watchpoint on x at the end and reverse-continue, gdb+rr will break on x = 0. We think this is generally not what you want, so a Pernosco explanation for x at the end will show x = y and y = 0. I don't know why gdb behaves this way, but I suspect it's because gdb watchpoints are sometimes implemented by evaluating the watched expression over time and noting when the value changes; since that can't detect idempotent writes, perhaps hardware watchpoints were made to ignore idempotent writes for consistency.
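The difference between the two behaviours can be modelled with a small sketch. It only mirrors the behaviours described in the text; it is not how gdb, rr or Pernosco are implemented, and `writes`, `value_at`, `last_value_change` and `last_write` are hypothetical names. One strategy reports the most recent moment at which the observed value changed, the other reports the most recent write regardless of whether the value changed.

```python
# Recorded writes for the example: (moment, location, value, source text).
writes = [
    (1, "x", 0, "x = 0"),
    (2, "y", 0, "y = 0"),
    (3, "x", 0, "x = y"),
]


def value_at(location, moment):
    """Value of `location` just after `moment`, or None if not yet written."""
    value = None
    for when, loc, val, _ in writes:
        if loc == location and when <= moment:
            value = val
    return value


def last_value_change(location, end):
    """Value-comparison strategy: skips writes that leave the value unchanged."""
    for when, loc, val, text in reversed(writes):
        if loc == location and when <= end and value_at(location, when - 1) != val:
            return text
    return None


def last_write(location, end):
    """Write-tracking strategy: reports the latest write, idempotent or not."""
    for when, loc, _, text in reversed(writes):
        if loc == location and when <= end:
            return text
    return None


print(last_value_change("x", 3))  # the behaviour the text attributes to gdb+rr
print(last_write("x", 3))         # the behaviour the text attributes to Pernosco
```

In this model the value-comparison strategy lands on x = 0 and never sees that x was last assigned from y, which is why following the actual write gives the more useful explanation.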

An interesting observation about our dataflow explanations is that although the semantics are actually quite subtle, even potentially confusing once you dig into them (there are additional subtleties I haven't gone into here!), users don't seem to complain about that. I'm optimistic that the abstraction we provide matches user intuitions closely enough that they skate over the complexity — which I think would be a pretty good result.

(One of my favourite moments with rr was when a Mozilla developer called a Firefox function during an rr replay and it printed what they expected it to print. They were about to move on, but then did a double-take, saying "Errrr ... what just happened?" Features that users take for granted but are actually mind-boggling are the best features.)