Tech Report CS-89-22

General Methods for the Analysis of the Maximum Size of Data Structures

Claire M. Kenyon-Mathieu and Jeffrey Scott Vitter

April 1989

Abstract:

We develop two new probabilistic methods that allow us, for several different dynamic processes, to analyze the maximum data-structure size encountered during a sequence of insertions and deletions, as arises in data structures such as priority queues, dictionaries, linear lists, and symbol tables, and in sweepline structures for geometry and VLSI applications. The notion of the ``maximum'' is basic to issues of resource preallocation. We apply our methods to several models of insertion and deletion, including combinatorial models of file histories (which encompass the data structures mentioned above) and probabilistic models. Included in our study is a non-Markovian process (algorithm) for processing sweepline information efficiently, called ``hashing with lazy deletion'' (HwLD).
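For readers unfamiliar with hashing with lazy deletion, the following is a minimal sketch of the idea in Python, assuming a simple bucket array and items tagged with expiration times; the class name, bucket count, and insert/lookup interface are illustrative choices for this sketch, not the paper's formulation.

```python
class HwLD:
    """Illustrative hash table whose items carry expiration times.

    Expired items are not removed at the moment they expire; they are
    purged lazily, bucket by bucket, whenever that bucket is next touched.
    """

    def __init__(self, num_buckets=64):
        # Bucket count is an arbitrary choice for the example.
        self.buckets = [[] for _ in range(num_buckets)]

    def _purge(self, bucket, now):
        # Lazy deletion: drop items whose lifetime has already ended.
        bucket[:] = [(key, expires) for (key, expires) in bucket if expires > now]

    def insert(self, key, expires, now):
        bucket = self.buckets[hash(key) % len(self.buckets)]
        self._purge(bucket, now)          # clean only the bucket being touched
        bucket.append((key, expires))

    def lookup(self, key, now):
        bucket = self.buckets[hash(key) % len(self.buckets)]
        self._purge(bucket, now)
        return [item for item in bucket if item[0] == key]

    def size(self):
        # Counts live items plus expired items not yet purged; the paper's
        # analysis concerns how large this total can become.
        return sum(len(b) for b in self.buckets)
```

Because deletions are deferred until a bucket is next used, no separate priority queue of pending deletions is needed, at the cost of temporarily retaining some expired items; the question of how much extra space this costs is part of what the report analyzes.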

We derive expressions for the expected maximum data-structure size that are asymptotically exact, that is, correct up to lower-order terms; in several cases of interest the expected value of the maximum size is asymptotically equal to the maximum expected size. This solves several open problems, including longstanding fundamental questions in queueing theory. Both of our approaches are robust and rely on novel applications of techniques from the analysis of algorithms. At a high level, our first method isolates the primary contribution to the maximum and bounds the lesser effects. In our second technique we relate the continuous-time probabilistic model to its discrete analogue --- the maximum slot occupancy in hashing.
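As a rough illustration of the two quantities being compared (not the paper's derivation), one can simulate a simple infinite-server arrival/departure process and compare the average observed maximum occupancy with the steady-state mean occupancy; the M/M/infinity model, rates, horizon, and trial count below are assumptions made for the example.

```python
# Monte Carlo sketch: empirical E[maximum size] versus the maximum expected
# size (the steady-state mean lambda/mu) for a simple infinite-server model.
import heapq
import random

def run(lam=50.0, mu=1.0, horizon=10.0, seed=None):
    rng = random.Random(seed)
    t, size, max_size = 0.0, 0, 0
    departures = []                        # min-heap of pending departure times
    while t < horizon:
        t += rng.expovariate(lam)          # next arrival time
        while departures and departures[0] <= t:
            heapq.heappop(departures)      # departures that occurred first
            size -= 1
        size += 1                          # the arriving item enters
        heapq.heappush(departures, t + rng.expovariate(mu))
        max_size = max(max_size, size)
    return max_size

trials = [run(seed=s) for s in range(200)]
print("empirical E[max size]:", sum(trials) / len(trials))
print("maximum expected size:", 50.0 / 1.0)   # lambda / mu
```

In this toy setting the two printed values are close, which is consistent with the abstract's statement that in several cases of interest the expected maximum size is asymptotically equal to the maximum expected size.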
