Abstract
Partitioning and allocation of relations are important components of distributed database design. Several approaches (and algorithms) have been proposed for clustering data for pattern classification and for partitioning relations in distributed databases. Most of the approaches used for classification use the square-error criterion. In contrast, most of the approaches proposed for partitioning relations are either ad hoc solutions or solutions for special cases (e.g., binary vertical partitioning).
In this paper, we first highlight the differences between the approaches taken for pattern classification and for distributed databases. Then an objective function for vertical partitioning of relations is derived using the square-error criterion commonly used in data clustering. The objective function derived generalizes and subsumes earlier work on vertical partitioning. Furthermore, the approach proposed in this paper is shown to be useful for comparing previously developed algorithms for vertical partitioning. The objective function has also been extended to include additional information, such as transaction types, different local and remote accessing costs and replication. Finally, we discuss the implementation of a distributed database design testbed.
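For readers unfamiliar with the square-error criterion mentioned above, the following is a minimal sketch of the generic criterion from data clustering, applied to attributes described by which transactions use them. It is not the objective function derived in the paper. The usage matrix, the attribute names, and the function name square_error are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Sequence


def square_error(usage: Dict[str, Sequence[float]],
                 partition: List[List[str]]) -> float:
    """Generic square-error criterion from data clustering.

    usage maps each attribute to a vector with one entry per transaction
    (1 if the transaction accesses the attribute, 0 otherwise).
    partition is a list of fragments, each a list of attribute names.
    The result is the sum, over all fragments, of the squared Euclidean
    distances between each attribute vector and its fragment's mean vector.
    """
    total = 0.0
    for fragment in partition:
        if not fragment:
            continue  # an empty fragment contributes no error
        dim = len(usage[fragment[0]])
        # Mean (centroid) vector of the attributes in this fragment.
        centroid = [sum(usage[a][t] for a in fragment) / len(fragment)
                    for t in range(dim)]
        for a in fragment:
            total += sum((usage[a][t] - centroid[t]) ** 2 for t in range(dim))
    return total


# Hypothetical usage matrix: four attributes, three transactions.
usage = {
    "A": (1, 1, 0),
    "B": (1, 1, 0),
    "C": (0, 0, 1),
    "D": (0, 1, 1),
}

good = [["A", "B"], ["C", "D"]]  # groups attributes with similar usage
poor = [["A", "C"], ["B", "D"]]  # mixes attributes with dissimilar usage

print(square_error(usage, good))  # 0.5
print(square_error(usage, poor))  # 2.5
```

A lower value means that the attributes placed in the same fragment are accessed by similar sets of transactions. Note that the plain criterion is zero when every attribute sits in its own fragment, so on its own it cannot decide how many fragments to create.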
Additional information
Recommended by: A. Sheth
About this article
Cite this article
Chakravarthy, S., Muthuraj, J., Varadarajan, R. et al. An objective function for vertically partitioning relations in distributed databases and its analysis. Distrib Parallel Databases 2, 183–207 (1994). https://doi.org/10.1007/BF01267326
DOI: https://doi.org/10.1007/BF01267326