Abstract
The problem of file organization that we consider involves altering the placement of records on the pages of a secondary storage device. In addition, we want this reorganization to be done in place, i.e., using the file's original storage space for the newly reorganized file. The motivation for such a physical change is to improve the database system's performance. For example, by placing frequently and jointly accessed records on the same page or pages, we can try to minimize the number of page accesses made in answering a set of queries. Determining the optimal assignment (or reassignment) of records to clusters is exactly what record clustering algorithms attempt to do. However, record clustering algorithms usually do not solve the entire problem, i.e., they do not specify how to efficiently reorganize the file to reflect the clustering assignment that they determine. Our algorithm is a companion to general record clustering algorithms since it actually transforms the file. The problem of optimal file reorganization is NP-hard. Consequently, our reorganization algorithm is based on heuristics. The algorithm's time and space requirements are reasonable and its solution is near optimal. In addition, the reorganization problem that we consider in this paper is similar to the problem of join processing when indexes are used.
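To make the page-access objective concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a hypothetical cost model in which answering a query costs one access per distinct page holding any of its records. The function name `page_accesses`, the record labels, the page numbers, and the page capacity of two records are all illustrative assumptions.

```python
from typing import Dict, Iterable, Set


def page_accesses(placement: Dict[str, int], queries: Iterable[Set[str]]) -> int:
    """Sum, over all queries, the number of distinct pages each query touches.

    `placement` maps a record identifier to the page it is stored on.
    Assumed cost model: one access per distinct page per query.
    """
    return sum(len({placement[record] for record in query}) for query in queries)


# Hypothetical workload: each query is the set of records it must read.
queries = [{"a", "b"}, {"a", "b", "c"}, {"d"}]

# Records "a" and "b" are accessed jointly but stored on different pages.
scattered = {"a": 0, "b": 1, "c": 2, "d": 0}

# Same records, with "a" and "b" co-located (two records per page assumed).
clustered = {"a": 0, "b": 0, "c": 1, "d": 1}

print(page_accesses(scattered, queries))
print(page_accesses(clustered, queries))
```

Under this assumed model, co-locating the jointly accessed records lowers the total number of page accesses for the same workload. The sketch only evaluates a given placement; it says nothing about how to move records in place to reach it, which is the problem the paper addresses.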
Additional information
The research of this author was partially supported by the National Science Foundation under grant IST-8696157.