Abstract
In this paper we study the problem of declustering two-dimensional datasets with replication over parallel devices to improve range query performance. The related problem of declustering without replication has been well studied, and it has been established that strictly optimal declustering schemes do not exist if data is not replicated. In addition to the usual problem of identifying a good allocation, the replicated version of the problem must also identify a good retrieval schedule for a given query. We address both problems in this paper. We develop an efficient algorithm for finding a lowest-cost retrieval schedule; this algorithm works for any query, not just range queries. We present two replicated placement schemes: one that results in a strictly optimal allocation, and another that guarantees a retrieval cost that is either optimal or one more than optimal for any range query.
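To make the retrieval-scheduling problem concrete, the following is a minimal sketch, not the algorithm developed in the paper (the abstract does not describe it). It assumes a query is given as a set of buckets, each with a list of disks holding a copy, and that the cost of a schedule is the largest number of buckets read from any single disk. The function names `min_cost_schedule` and `_can_schedule` and the input layout are hypothetical. The sketch tries increasing per-disk limits and checks each with a standard augmenting-path bipartite matching.

```python
from typing import Dict, Hashable, Optional, Sequence, Set, Tuple

Replicas = Dict[Hashable, Sequence[int]]  # bucket -> disks holding a copy (assumed layout)


def _can_schedule(replicas: Replicas, cap: int) -> Optional[Dict[Hashable, int]]:
    """Assign each bucket to one replica disk, reading at most `cap` buckets per disk.

    Each disk is modelled as `cap` slots; buckets are matched to slots with
    augmenting paths (Kuhn's algorithm). Returns the assignment, or None.
    """
    slot_owner: Dict[Tuple[int, int], Hashable] = {}
    assignment: Dict[Hashable, int] = {}

    def try_assign(bucket: Hashable, visited: Set[Tuple[int, int]]) -> bool:
        for disk in replicas[bucket]:
            for slot in range(cap):
                key = (disk, slot)
                if key in visited:
                    continue
                visited.add(key)
                owner = slot_owner.get(key)
                # Take a free slot, or move the current owner elsewhere.
                if owner is None or try_assign(owner, visited):
                    slot_owner[key] = bucket
                    assignment[bucket] = disk
                    return True
        return False

    for bucket in replicas:
        if not try_assign(bucket, set()):
            return None
    return assignment


def min_cost_schedule(replicas: Replicas, num_disks: int) -> Tuple[int, Dict[Hashable, int]]:
    """Return (cost, assignment) with the smallest possible maximum per-disk load."""
    n = len(replicas)
    if n == 0:
        return 0, {}
    lower_bound = -(-n // num_disks)  # ceil(n / num_disks): no schedule can beat this
    for cap in range(lower_bound, n + 1):
        assignment = _can_schedule(replicas, cap)
        if assignment is not None:
            return cap, assignment
    raise ValueError("some bucket has no replica to read from")


# Four buckets on four disks; every bucket has a copy on disk 0 and one other disk.
query = {"a": [0, 1], "b": [0, 1], "c": [0, 2], "d": [0, 3]}
cost, schedule = min_cost_schedule(query, num_disks=4)
```

Because the sketch never uses the shape of the query, it applies to arbitrary bucket sets as well as range queries. Feasibility is monotone in the per-disk limit, so the linear scan over `cap` could be replaced by a binary search for larger inputs.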
Portions of this work were supported by Grant EIA-9903545 from the National Science Foundation, Contract N00014-02-1-0364 from the Office of Naval Research, by sponsors of the Center for Education and Research in Information Assurance and Security, the GAANN fellowship, NSF CAREER grant IIS-9985019 and NSF Grant 9972883.
© 2002 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Frikken, K., Atallah, M., Prabhakar, S., Safavi-Naini, R. (2002). Optimal Parallel I/O for Range Queries through Replication. In: Hameurlain, A., Cicchetti, R., Traunmüller, R. (eds) Database and Expert Systems Applications. DEXA 2002. Lecture Notes in Computer Science, vol 2453. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-46146-9_66
DOI: https://doi.org/10.1007/3-540-46146-9_66
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-44126-7
Online ISBN: 978-3-540-46146-3
eBook Packages: Springer Book Archive