A heuristic file reorganization algorithm based on record clustering

Scheuermann, Peter; Park, Young Chul; Omiecinski, Edward

doi:10.1007/BF02219229

Part I Computer Science
Published: September 1989

Volume 29, pages 428–447, (1989)
BIT Numerical Mathematics

Peter Scheuermann¹,
Young Chul Park¹ &
Edward Omiecinski²

Abstract

The problem of file reorganization which we consider involves altering the placement of records on pages of a secondary storage device. In addition, we want this reorganization to be done in-place, i.e., using the file's original storage space for the newly reorganized file. The motivation for such a physical change is to improve the database system's performance. For example, by placing frequently and jointly accessed records on the same page or pages, we can try to minimize the number of page accesses made in answering a set of queries. The optimal assignment (or reassignment) of records to clusters is exactly what record clustering algorithms attempt to do. However, record clustering algorithms usually do not solve the entire problem, i.e., they do not specify how to efficiently reorganize the file to reflect the clustering assignment which they determine. Our algorithm is a companion to general record clustering algorithms since it actually transforms the file. The problem of optimal file reorganization is NP-hard. Consequently, our reorganization algorithm is based on heuristics. The algorithm's time and space requirements are reasonable and its solution is near optimal. In addition, the reorganization problem which we consider in this paper is similar to the problem of join processing when indexes are used.
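
The objective mentioned in the abstract, minimizing the page accesses needed to answer a set of queries, can be illustrated with a small sketch. This is not the reorganization algorithm of the paper (the abstract gives no details of it); it only shows how one might count page accesses for a given placement. The function name `page_accesses`, the record ids, the page numbers, and the workload are all hypothetical, and each query is assumed to fetch every page holding at least one of its records exactly once.

```python
from typing import Dict, Iterable, List, Set


def page_accesses(placement: Dict[str, int], queries: Iterable[Set[str]]) -> int:
    """Sum, over all queries, of the number of distinct pages each query touches."""
    return sum(len({placement[record] for record in query}) for query in queries)


# Hypothetical workload: each query is the set of record ids it retrieves.
queries: List[Set[str]] = [{"a", "c"}, {"a", "c", "e"}, {"b", "d"}, {"b", "f"}]

# Two placements of six records on three pages that hold two records each.
# The second one puts the jointly accessed pairs (a, c) and (b, d) together.
original = {"a": 0, "b": 0, "c": 1, "d": 1, "e": 2, "f": 2}
clustered = {"a": 0, "c": 0, "b": 1, "d": 1, "e": 2, "f": 2}

cost_original = page_accesses(original, queries)    # 9 page accesses
cost_clustered = page_accesses(clustered, queries)  # 6 page accesses
```

In this toy case the clustered placement lowers the count from 9 to 6 page accesses. A record clustering algorithm would propose such a placement; the reorganization step the abstract describes is the separate task of moving the records into it within the file's original storage space.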

Author information

Authors and Affiliations

Department of Electrical Engineering and Computer Science, Northwestern University, 60201, Evanston, Illinois, USA
Peter Scheuermann & Young Chul Park
School of Information & Computer Science, Georgia Institute of Technology, 30332, Atlanta, Georgia, USA
Edward Omiecinski

Additional information

The research of this author was partially supported by the National Science Foundation under grant IST-8696157.

About this article

Cite this article

Scheuermann, P., Park, Y.C. & Omiecinski, E. A heuristic file reorganization algorithm based on record clustering. BIT 29, 428–447 (1989). https://doi.org/10.1007/BF02219229

Received: 15 May 1988
Revised: 15 April 1989
Issue Date: September 1989
DOI: https://doi.org/10.1007/BF02219229
