Eyes Above The Waves

Robert O'Callahan. Christian. Repatriate Kiwi. Hacker.

Friday 1 September 2017

rr Trace Portability

We want to be able to record an rr trace on one machine but copy it to another machine for replay. For example, you might record a failing test on one machine and copy the trace to a developer's machine for debugging. Or, you might record a failure locally and upload the trace to some cloud service for analysis. In short: on rr master, this works!

It turned out there were only two big issues to solve. We needed a way to make traces fully self-contained, because for efficiency we don't always copy all needed files into the trace during recording. rr pack addressed that. rr pack also compacts the trace by eliminating duplicate copies of the same file. Switching to brotli also reduced trace size, as did using Cap'n Proto for trace data.

The other big issue was handling CPUID instructions. We needed a way to ensure that during replay CPUID instructions returned the same results as they did during recording — they generally won't if you switch machines. Modern Intel hardware supports "CPUID faulting", i.e. you can configure the CPU to trap every time a CPUID instruction occurs. Linux didn't expose this capability to user-space, so last year Kyle Huey did the hard work of adding a Linux system-call API to expose it: the ARCH_GET/SET_CPUID subfeature of arch_prctl. It works very much like the existing PR_GET/SET_TSC, which give control over the faulting of RDTSC/RDTSCP instructions. Getting the feature into the upstream kernel was a bit of an ordeal, but that's a story for another day. It finally shipped in the 4.12 kernel.

When CPUID faulting is available, rr recording stores the results of all CPUID instructions in the trace, and rr replay intercepts all CPUID instructions and takes their results from the trace. With this in place, we're able to move traces from one machine/distro/kernel to another and replay them successfully.

We also support situations where CPUID faulting is not available on the recording machine but is on the replay machine. At the start of recording we save all available CPUID data (there are only a relatively small number of possible CPUID "leaves"), and then rr replay traps CPUID instructions and emulates them using the stored data.

Caveat: the user is responsible for ensuring the destination machine supports all instructions and other CPU features used by the recorded program. At some point we could add an rr feature to mask the CPUID values reported during recording so you can limit the CPU features a recorded program uses. (We actually already do this internally so that applications running under rr believe that RTM transactional memory and RDRAND, which rr can't handle, are not available.)

CPUID faulting is supported on most modern Intel CPUs, at least on Ivy Bridge and its successor Core architectures. Kyle also added support to upstream Xen and KVM to virtualize it, and even emulate it regardless of whether the underlying hardware supports it. However, VM guests running on older Xen or KVM hypervisors, or on other hypervisors, probably can't use it. And as mentioned, you will need a Linux 4.12 kernel or later.

Comments

Unknown
What about the instructions with the encoding that are not illegal on an old processor but mean something specific or even different on a new processor? Like the CET Endbranch instructions that reuse existing NOPs? And the TZCNT instruction that contains a BSF as part of it? Cross-generation replay is always going to be a problem unless you make sure to emulate everything that could be a problem in a semantically safe way.
Robert
I think ENDBRANCH isn't a problem because it only acts as a non-noop if you enable CET, and in the recording it won't be enabled. Likewise TZCNT isn't likely to be a problem because it only differs from BSF when the input operand is zero, in which case the BSF result is undefined so programs shouldn't be using the result, though I suppose they could. Instructions with undefined results can even, in principle, break replay on the same machine, but that isn't causing problems in practice, so I'm not too worried about it. Generally Intel is working pretty hard to ensure cross-generation compatibility, so I don't expect to see breaking semantic changes across processors that aren't gated on mode switches.