Part of the book series: The Kluwer International Series in Engineering and Computer Science (SECS, volume 362)

Abstract

Problems whose data are too large to fit into main memory are called out-of-core problems. Out-of-core parallel-I/O algorithms can handle much larger problems than in-memory variants and have much better performance than single-device variants. However, they are not commonly used—partly because the understanding of them is not widespread. Yet such algorithms ought to be growing in importance because they address the needs of users with ever-growing problem sizes and ever-increasing performance needs.

This paper addresses this lack of understanding by presenting an introduction to the data-transfer models on which most of the out-of-core parallel-I/O algorithms are based, with particular emphasis on the Parallel Disk Model. Sample algorithms are discussed to demonstrate the paradigms (algorithmic techniques) used with these models.

Our aim is to provide insight into both the paradigms and the particular algorithms described, thereby also providing a background for understanding a range of related solutions. We hope that this background will enable the appropriate selection of existing algorithms and the development of new ones for current and future out-of-core problems.
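
As a rough illustration of what the Parallel Disk Model measures, the sketch below counts parallel I/O operations using the model's usual parameters: N records, an internal memory of M records, a block size of B records, and D disks that can each transfer one block per I/O. The function names are hypothetical and not taken from the chapter, and the sorting expression is the commonly cited asymptotic bound with constant factors dropped, so the numbers indicate scale only.

```python
import math


def scan_ios(n, b, d):
    """Parallel I/Os needed to read n records once: ceil(n / (d * b))."""
    return math.ceil(n / (d * b))


def sort_ios_estimate(n, m, b, d):
    """Order-of-magnitude estimate of the sorting bound
    (n / (d * b)) * log(n / b) / log(m / b), constants dropped.
    Assumes m > b so the logarithm base is greater than 1."""
    passes = max(1.0, math.log2(n / b) / math.log2(m / b))
    return math.ceil(n / (d * b) * passes)


# Assumed example sizes: 2**30 records, memory of 2**20 records,
# blocks of 2**10 records, 8 disks.
N, M, B, D = 2**30, 2**20, 2**10, 8
print(scan_ios(N, B, D))
print(sort_ios_estimate(N, M, B, D))
```

With these assumed sizes the data are 1024 times larger than memory, which is what makes the problem out-of-core. Doubling D halves both counts, which is the benefit over a single-device variant that the abstract refers to.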

© 1996 Kluwer Academic Publishers

About this chapter

Cite this chapter

Shriver, E., Nodine, M. (1996). An Introduction to Parallel I/O Models and Algorithms. In: Jain, R., Werth, J., Browne, J.C. (eds) Input/Output in Parallel and Distributed Computer Systems. The Kluwer International Series in Engineering and Computer Science, vol 362. Springer, Boston, MA. https://doi.org/10.1007/978-1-4613-1401-1_2

  • DOI: https://doi.org/10.1007/978-1-4613-1401-1_2

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-1-4612-8607-3

  • Online ISBN: 978-1-4613-1401-1

  • eBook Packages: Springer Book Archive
