Tech Report CS-95-13

Fragments: A Mechanism for Low Cost Data Integration

Steven P. Reiss

May 1995

Abstract:

Control integration has become widely used as the primary means for combining a variety of programming tools in an environment. Data integration, on the other hand, has not been nearly as successful. The primary reason for this is the high cost of data integration when it is viewed as a program database. Both the complexity of the needed database system and the high cost of modifying all the programming tools to use the database have made data integration expensive. In this paper we propose a new means for data integration based on a database of fragments. Fragments are references to portions of a file. They have a source, a type, and associated attributes. The database and fragments are accessed through an object-oriented query language. This mechanism promises to provide most of the capabilities typically associated with data integration without the large costs. It allows the use of existing tools and multiple languages and can offer full support for the software engineering process.
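
To make the idea of a fragment concrete, the following is a minimal sketch in Python, not the report's actual design. It assumes a fragment locates its portion of a file by character offsets, and it stands in for the object-oriented query language with a simple `select` method. The names `Fragment`, `FragmentDatabase`, `resolve`, `select`, and the sample type and attribute values are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Fragment:
    """A reference to a portion of a file; it does not copy the text."""
    source: str   # path of the file the fragment points into
    start: int    # assumption: character offset where the portion begins
    end: int      # assumption: character offset just past the portion
    type: str     # illustrative values: "function", "comment"
    attributes: Dict[str, str] = field(default_factory=dict)

    def resolve(self, text: str) -> str:
        """Return the referenced portion, given the source file's contents."""
        return text[self.start:self.end]


class FragmentDatabase:
    """Hypothetical in-memory store; a stand-in for the fragment database."""

    def __init__(self) -> None:
        self._fragments: List[Fragment] = []

    def add(self, fragment: Fragment) -> Fragment:
        self._fragments.append(fragment)
        return fragment

    def select(self, type: Optional[str] = None, source: Optional[str] = None,
               **attributes: str) -> List[Fragment]:
        """Return fragments matching every criterion that was supplied."""
        return [
            f for f in self._fragments
            if (type is None or f.type == type)
            and (source is None or f.source == source)
            and all(f.attributes.get(k) == v for k, v in attributes.items())
        ]


# Usage: index two portions of one file, then query by type and attribute.
text = "int main() { return 0; }\n/* TODO: parse args */\n"
db = FragmentDatabase()
db.add(Fragment("main.c", 0, 24, "function", {"name": "main"}))
db.add(Fragment("main.c", 25, 47, "comment", {"status": "todo"}))
todo = db.select(type="comment", status="todo")
print(todo[0].resolve(text))
```

Because a fragment in this sketch holds only a reference, the file itself stays untouched and existing tools can keep reading and writing it directly, which is the low-cost property the abstract emphasizes.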

(complete text in pdf or gzipped postscript)