Tech Report CS-93-15

Optimal Tracing and Replay for Debugging Shared-Memory Parallel Programs

Robert H.B. Netzer

April 1993

Abstract:

Execution replay is a crucial part of debugging. Because explicitly parallel shared-memory programs can be nondeterministic, a tool is required that traces executions so they can be replayed for debugging. We present an adaptive tracing strategy that is optimal and records the minimal number of shared-memory references required to exactly replay executions. Our algorithm makes run-time tracing decisions by detecting and tracing a certain type of race condition on-the-fly. Unlike past schemes, we make no assumptions about the execution's correctness (it need not be race free). Experiments show that only 0.01 - 2% of the shared-memory references are usually traced, a 2 - 4 order of magnitude reduction over past techniques which trace every access.

(complete text in pdf or gzipped postscript)