Abstract
This paper addresses the allocation of multi-attribute records across several disks so as to achieve a high degree of concurrency in disk access when responding to partial match queries.
An algorithm for distributing a set of multi-attribute records onto different disks is presented. Since our allocation method uses principal component analysis, this concept is introduced first. We then use it to generate a set of real numbers, the projections of the records onto the first principal component direction, which can be viewed as hashing addresses.
We then propose an algorithm, based on these hashing addresses, to allocate multi-attribute records onto different disks. Experimental results show that our method can indeed be used to solve the multi-disk data allocation problem for concurrent accessing.
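
As a rough illustration of the idea summarized above, the following sketch projects records onto an estimate of the first principal component direction and treats the projections as hashing addresses. The abstract does not spell out the allocation rule, so the round-robin assignment over the sorted addresses is only an assumption made for this example, as are the function names `hashing_addresses` and `allocate` and the use of power iteration to estimate the direction.

```python
import math
from typing import List, Sequence


def hashing_addresses(records: Sequence[Sequence[float]],
                      iterations: int = 200) -> List[float]:
    """Project each record onto an estimate of the first principal component."""
    n, d = len(records), len(records[0])
    means = [sum(r[j] for r in records) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in records]
    # Covariance matrix of the centered records (d x d).
    cov = [[sum(c[a] * c[b] for c in centered) / n for b in range(d)]
           for a in range(d)]
    # Power iteration (an assumed estimation method). The start vector is
    # arbitrary; it must not be orthogonal to the dominant eigenvector.
    v = [1.0 / (j + 1) for j in range(d)]
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # all records identical: no direction to find
            break
        v = [x / norm for x in w]
    return [sum(c[j] * v[j] for j in range(d)) for c in centered]


def allocate(records: Sequence[Sequence[float]], num_disks: int) -> List[int]:
    """Return a disk number for each record.

    Assumed rule (not taken from the paper): sort records by hashing
    address and deal them out round-robin, so that records with
    neighbouring addresses land on different disks.
    """
    addresses = hashing_addresses(records)
    order = sorted(range(len(records)), key=lambda i: addresses[i])
    disks = [0] * len(records)
    for rank, i in enumerate(order):
        disks[i] = rank % num_disks
    return disks
```

Calling `allocate(records, m)` with a list of numeric attribute tuples returns one disk index in `range(m)` per record. Dealing out sorted addresses in turn keeps the disks balanced to within one record and separates similar records, which is the property a partial match query would benefit from under this assumed rule.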