Optimal Parallel I/O for Range Queries through Replication

  • Conference paper
Database and Expert Systems Applications (DEXA 2002)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2453)

Abstract

In this paper we study the problem of declustering two-dimensional datasets with replication over parallel devices to improve range query performance. The related problem of declustering without replication has been well studied. It has been established that strictly optimal declustering schemes do not exist if data is not replicated. In addition to the usual problem of identifying a good allocation, the replicated version of the problem needs to address the issue of identifying a good retrieval schedule for a given query. We address both problems in this paper. An efficient algorithm for finding a lowest cost retrieval schedule is developed. This algorithm works for any query, not just range queries. Two replicated placement schemes are presented: one that results in a strictly optimal allocation, and another that guarantees a retrieval cost that is either optimal or 1 more than the optimal for any range query.
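To make the retrieval-schedule problem concrete, the following is a minimal sketch and not the algorithm developed in the paper. It assumes that the cost of a schedule is the largest number of blocks read from any single device, and that a query is given simply as a list of blocks, each with the devices holding a copy. The function name `retrieval_schedule` and its parameters are hypothetical. It finds a lowest cost schedule by trying increasing per-device capacities, starting from the lower bound ceil(n / M) for n blocks on M devices, and checking each with a standard augmenting-path bipartite matching.

```python
from math import ceil


def retrieval_schedule(replicas, num_devices):
    """Pick one replica per block so the busiest device reads as few blocks as possible.

    replicas[b] lists the devices that hold a copy of block b (assumed input format).
    Returns (cost, chosen), where chosen[b] is the device selected for block b and
    cost is the largest number of blocks read from any one device.
    """
    n = len(replicas)
    if n == 0:
        return 0, []
    if any(len(r) == 0 for r in replicas):
        raise ValueError("every block needs at least one replica")

    def try_capacity(cap):
        # Capacitated bipartite matching: each device may serve at most `cap` blocks.
        load = [[] for _ in range(num_devices)]  # blocks currently read from each device
        chosen = [None] * n

        def place(block, seen):
            for dev in replicas[block]:
                if dev in seen:
                    continue
                seen.add(dev)
                if len(load[dev]) < cap:
                    load[dev].append(block)
                    chosen[block] = dev
                    return True
                # Device is full: try to move one of its blocks to another replica.
                for i, other in enumerate(load[dev]):
                    if place(other, seen):
                        load[dev][i] = block
                        chosen[block] = dev
                        return True
            return False

        for block in range(n):
            if not place(block, set()):
                return None
        return chosen

    # ceil(n / num_devices) is the best any schedule can do; cap = n always succeeds.
    for cap in range(ceil(n / num_devices), n + 1):
        chosen = try_capacity(cap)
        if chosen is not None:
            return cap, chosen


# Four blocks, three devices; every block has a copy on device 0.
cost, chosen = retrieval_schedule([[0, 1], [0, 1], [0, 2], [0, 2]], 3)
print(cost, chosen)
```

Because the sketch takes an arbitrary list of blocks, it is not tied to range queries, which matches the abstract's remark that the schedule can be found for any query. A max-flow formulation would be an equivalent way to perform the feasibility check.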

Portions of this work were supported by Grant EIA-9903545 from the National Science Foundation, Contract N00014-02-1-0364 from the Office of Naval Research, by sponsors of the Center for Education and Research in Information Assurance and Security, the GAANN fellowship, NSF CAREER grant IIS-9985019 and NSF Grant 9972883.

Copyright information

© 2002 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Frikken, K., Atallah, M., Prabhakar, S., Safavi-Naini, R. (2002). Optimal Parallel I/O for Range Queries through Replication. In: Hameurlain, A., Cicchetti, R., Traunmüller, R. (eds) Database and Expert Systems Applications. DEXA 2002. Lecture Notes in Computer Science, vol 2453. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-46146-9_66

  • DOI: https://doi.org/10.1007/3-540-46146-9_66

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-44126-7

  • Online ISBN: 978-3-540-46146-3

  • eBook Packages: Springer Book Archive
