Wednesday, 22 March 2017

Blogging Vs Academic Publishing

Adrienne Felt asked on Twitter:

academic publishing is too onerous & slow. i'm thinking about starting a blog to share chrome research instead. thoughts?

It depends on one's goals. I think if one's primary goal is to disseminate quality research to a broad audience, then write papers, publish them to arXiv, and update them in response to comments.

I've thought about this question a fair bit myself, as someone who's interacted with the publication system over two decades as a writer and reviewer but who won't perish if I don't publish. (Obviously if you're an academic you must publish and you can stop reading now...) Academic publishing has many problems which I won't try to enumerate here, but my main issues with it are the perverse incentives it can create and the erratic nature of the review process. It does have virtues...

Peer review is the most commonly cited virtue. In computer science at least, I think it's overrated. A typical paper might be reviewed by one or two experts and a few other reviewers who are capable but unfamiliar with the technical details. Errors are easy to miss, fraud even easier. Discussions about a paper after it has been published are generally much more interesting than the reviews, because a wider audience collectively brings more expertise and alternative viewpoints. Online publishing and discussion could be a good substitute if it's not too fragmented (I dislike the way Hacker News, Reddit, etc. fragment commentary) and if substantive online comments are used to improve the publication. Personally I try to update my blog posts when I get particularly important comments; that has the drawback that fewer people read the updated versions, but at least they're there for later reference. It would be good if we had a way to archive comments as we do for papers.

Academic publishing could help identify important work. I don't think it's a particularly good tool for that. We have many other ways now to spread the word about important work, and I don't think cracking open the latest PLDI proceedings for a read-through has ever been an efficient use of time. Important work is best recognized months or years after publication.

The publishing system creates an incentive for people to take the time to systematically explain and evaluate their work, and it creates community standards for doing so. That's good, and it might be that if academic publishing weren't a gatekeeper, those benefits would erode over time. Or perhaps the evaluation procedures within universities would maintain them.

Academic publishing used to be important for archival but that is no longer an issue.

Personally, the main incentives for me to publish papers now are to get the attention of academic communities I would otherwise struggle to reach, and to have fun hanging out with them at conferences. The academic community remains important to me because it's full of brilliant people (students and faculty) many of whom are, or will be, influential in areas I care about.

Thoughts On "Java and Scala’s Type Systems are Unsound" And Fuzz Testing

I just belatedly discovered "Java and Scala’s Type Systems are Unsound" from OOPSLA last year. It's a lovely paper. I would summarize it as "some type system soundness arguments depend on 'nonsense types' (e.g. generic types with contradictory bounds on type parameters) having no instances; if 'null' values can inhabit those types, those arguments are invalid". Note that the unsoundness does not lead to memory safety issues in the JVM.
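
To give the flavor of the exploit, here is a condensed version of the paper's Scala example, reconstructed from memory, so treat the details as approximate:

    trait LowerBound[T] { type M >: T }
    trait UpperBound[U] { type M <: U }

    object Unsound {
      def coerce[T, U](t: T): U = {
        // lb.M is a path-dependent type bounded below by T...
        def upcast(lb: LowerBound[T], x: T): lb.M = x
        // ...and this null provides a "witness" whose M is also bounded
        // above by U, even though no real value could satisfy both
        // bounds for arbitrary T and U.
        val bounded: LowerBound[T] with UpperBound[U] = null
        upcast(bounded, t) // typechecks: T <: bounded.M <: U
      }
      def main(args: Array[String]): Unit = {
        // Compiles, then throws ClassCastException at runtime.
        val zero: String = coerce[Integer, String](0)
      }
    }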

The paper is a good illustration of how unexpected feature interactions can create problems for type systems even when a feature doesn't seem all that important at the type level.

The paper also suggests (implicitly) that Java's type system has fallen into a deep hole. Even without null, the interactions of subtyping, generics and wildcards are immensely complicated. Rust's rejection of subtyping (other than for lifetimes, which are tightly restricted) causes friction for developers coming from languages where subtyping is ubiquitous, but seems very wise for the long run.

I think the general issue shown in this paper could arise in other contexts which don't have 'null'. For example, in a lazy language you can create a "value" of any type by calling a function that diverges. In a language with an explicit Option type, if T is a nonsense type then Option<T> is presumably also a nonsense type, but the value None inhabits it.
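
A quick Scala sketch of both escape hatches (the names here are mine, purely for illustration):

    object Inhabitants {
      // Divergence "inhabits" every type: loop[T] typechecks at any T
      // but never returns. In a lazy language, a thunk of this would
      // pass for a value of type T.
      def loop[T]: T = loop[T]

      // No instances; a stand-in for a nonsense type.
      sealed trait Nonsense

      // Option[Nonsense] is inhabited anyway, by None.
      val escape: Option[Nonsense] = None
    }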

The paper discusses some methodological improvements that might detect this sort of mistake earlier. One approach it doesn't mention is fuzz testing. It seems to me that the examples in the paper are small enough to be found by fuzz-testing techniques searching for programs which typecheck but contain obviously unsound constructs (e.g. a terminating function which can cast its parameter value to any type); a sketch of the idea follows. Checking soundness via fuzz testing has been done to a small extent with Rust (see paper), but I think more work in that direction would be fruitful.
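
Here is a minimal sketch of that oracle in Scala, assuming scalac is on the PATH. A real fuzzer would generate and mutate the candidate bodies rather than hard-coding them; everything below is hypothetical scaffolding:

    import java.nio.file.Files
    import scala.sys.process._

    object SoundnessFuzz {
      // Candidate bodies for a universal coercion function.
      val candidates = Seq(
        // Uses an explicit cast, so acceptance is expected and boring.
        "t.asInstanceOf[U]",
        // The paper's null-based exploit: no casts, yet it typechecks.
        """{ def upcast(lb: LowerBound[T], x: T): lb.M = x
             val bounded: LowerBound[T] with UpperBound[U] = null
             upcast(bounded, t) }"""
      )

      // The oracle: a terminating coerce[T, U](t: T): U should never
      // typecheck without casts in a sound system, so compiler
      // acceptance flags a candidate as suspicious.
      def typechecks(body: String): Boolean = {
        val src =
          s"""trait LowerBound[T] { type M >: T }
             |trait UpperBound[U] { type M <: U }
             |object Candidate { def coerce[T, U](t: T): U = $body }
             |""".stripMargin
        val file = Files.createTempFile("fuzz", ".scala")
        Files.write(file, src.getBytes("UTF-8"))
        val outDir = Files.createTempDirectory("fuzz-out")
        Seq("scalac", "-d", outDir.toString, file.toString).! == 0
      }

      def main(args: Array[String]): Unit =
        for (c <- candidates if typechecks(c))
          println(s"suspicious: typechecks without obvious casts:\n$c")
    }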

Tuesday, 21 March 2017

Deterministic Hardware Performance Counters And Information Leaks

Summary: Deterministic hardware performance counters cannot leak information between tasks or, more importantly, between virtualized guests.

rr relies on hardware performance counters to measure application progress, so it can determine when to inject asynchronous events such as signal deliveries and context switches. rr can only use counters that are deterministic, i.e., counters for which executing a particular sequence of application instructions always increases the counter value by the same amount. For example, rr uses the "retired conditional branches" (RCB) counter, which returns exactly the number of conditional branches actually retired.

rr currently doesn't work in environments such as Amazon's cloud, where hardware performance counters are not available to virtualized guests. Virtualizing hardware counters is technically possible (e.g. rr works well in Digital Ocean's KVM guests), but for some counters there is a risk of leaking information about other guests, and that's probably one reason other providers haven't enabled them.

However, if a counter's value can be influenced by the behavior of other guests, then by definition it is not deterministic in the sense above, and therefore it is useless to rr! In particular, because the RCB counter is deterministic ("proven" by a lot of testing), we know it does not leak information between guests.

I wish Intel would identify a set of counters that are deterministic, or at least free of cross-guest information leaks, and that Amazon and other cloud providers would enable virtualization of those counters.

Thursday, 16 March 2017

Using rr To Debug Go Programs

rr seems to work well with Go programs, modulo some limitations of using gdb with Go.

The DWARF debuginfo produced by the standard Go compiler is unsatisfactory, because it does not include local variables held in registers. Instead, build with gccgo.

Some Go configurations generate statically linked executables. These work under rr, but they're slow to record and replay because our syscall-interception library is not loaded. When using rr with Go, be sure to generate dynamically linked executables.

Running Go tests using go test by default builds your project and then runs the tests. Recording the build step is wasteful, especially when the compiler is statically linked (see above), so build the test executable using go test -compiler gccgo -c and then run that executable directly under rr, as in the sketch below.
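
For example, something like the following should work (mypkg.test is just illustrative; go test -c names the binary after the package being tested):

    go test -compiler gccgo -c        # build a dynamically linked test binary; nothing is recorded yet
    rr record ./mypkg.test -test.run TestFoo
    rr replay                         # debug the recording with gdb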