Friday, 30 June 2017

Patch On Linux Kernel Stable Branches Breaks rr

A change in 4.12rc5 breaks rr. We're trying to get it fixed before 4.12 is released, and I think that will be OK. Unfortunately that change has already been backported to 3.18.57, 4.4.72, 4.9.32 and 4.11.5 :-( (all released on June 14, and I guess arriving in distros a bit later). Obviously we'll try to get the 4.12 fix also backported to those branches, but that will take a little while.

The symptoms are that long, complex replays fail with "overshot target ticks=... by " where N is generally a pretty large number (> 1000). If you look in the trace file, the value N will usually be similar to the difference between the target ticks and the previous ticks value for that task --- i.e. we tried to stop after N ticks but we actually stopped after about N*2 ticks. Unfortunately, rr tests don't seem to be affected.

I'm not sure if there's a reasonable workaround we can use in rr, or if there is one, whether it's worth effort to deploy. That may depend on how the conversation with upstream goes.

No comments:

Post a comment