Tech Report CS-94-29

Efficient Parallelism vs Reliable Distribution: a Trade-off for Concurrent Computations

Paris C. Kanellakis, Dimitris Michailidis, and Alex A. Shvartsman

June 1994

Abstract:

Concurrent computations should combine efficiency with reliability, where efficiency is usually associated with parallel and reliability with distributed computing. Such a desirable combination is not always possible, because of an intuitive trade-off: efficiency requires removing redundancy from computations whereas reliability requires some redundancy. We survey a spectrum of algorithmic models (from fail-stop, synchronous to asynchronous and from approximate to exact computations) in which reliability is guaranteed with small trade-offs in efficiency. We illustrate a number of cases where optimal trade-offs are achievable. A basic property of all these models, which is of some interest in the study of concurrency, is that ``true'' read/write concurrency is necessary for fault tolerance. In particular, we outline how algorithms can be designed so that, in each execution, the total ``true'' concurrency used can be closely related to the faults that can be tolerated.

(complete text in pdf or gzipped postscript)