Tech Report CS-94-11

Optimal Tracing and Incremental Reexecution for Debugging Long-Running Programs

Robert H.B. Netzer and Mark H. Weaver

March 1994

Abstract:

Debugging requires execution replay. Locations of bugs are rarely known in advance, so an execution must be repeated over and over to track down bugs. A problem arises with repeated reexecution for long-running programs and programs that have complex interactions with their environment. Replaying long-running programs from the start incurs too much delay. Replaying programs that interact with their environment requires the difficult (and sometimes impossible) task of exactly reproducing this environment (such as the connections over a one-day period to an X server). We solve these problems by incremental

checkpointing and replay. By periodically checkpointing parts of the execution's state, it can be restarted from intermediate points, bounding the delay to replay any part of the execution and allowing parts of the execution to be skipped. We present adaptive tracing strategies that provide bounded-time incremental replay and that are nearly optimal. Our techniques track reads and writes to memory using space-efficient two-level bitvectors. Our implementation on a Sparc 10 traces less than 15 kilobytes/sec for CPU-intensive programs and for interactive programs the slowdown is low enough that tracing can be left on all the time.

(complete text in pdf or gzipped postscript)