An Introduction to Parallel I/O Models and Algorithms

Shriver, Elizabeth; Nodine, Mark

doi:10.1007/978-1-4613-1401-1_2

Part of the book series: The Kluwer International Series in Engineering and Computer Science (SECS, volume 362)

Abstract

Problems whose data are too large to fit into main memory are called out-of-core problems. Out-of-core parallel-I/O algorithms can handle much larger problems than in-memory variants and have much better performance than single-device variants. However, they are not commonly used—partly because the understanding of them is not widespread. Yet such algorithms ought to be growing in importance because they address the needs of users with ever-growing problem sizes and ever-increasing performance needs.

This paper addresses this lack of understanding by presenting an introduction to the data-transfer models on which most of the out-of-core parallel-I/O algorithms are based, with particular emphasis on the Parallel Disk Model. Sample algorithms are discussed to demonstrate the paradigms (algorithmic techniques) used with these models.

Our aim is to provide insight into both the paradigms and the particular algorithms described, thereby also providing a background for understanding a range of related solutions. It is hoped that this background would enable the appropriate selection of existing algorithms and the development of new ones for current and future out-of-core problems.

Author information

Authors and Affiliations

Courant Institute of Mathematical Sciences, New York University, 251 Mercer Street, New York, New York, 10012, USA
Elizabeth Shriver
Motorola Cambridge Research Center, One Kendall Square, Building 200, Cambridge, MA, 02139, USA
Mark Nodine

Editor information

Editors and Affiliations

Bell Communications Research, Morristown, New Jersey, USA
Ravi Jain
University of Texas at Austin, Austin, Texas, USA
John Werth & James C. Browne

About this chapter

Cite this chapter

Shriver, E., Nodine, M. (1996). An Introduction to Parallel I/O Models and Algorithms. In: Jain, R., Werth, J., Browne, J.C. (eds) Input/Output in Parallel and Distributed Computer Systems. The Kluwer International Series in Engineering and Computer Science, vol 362. Springer, Boston, MA. https://doi.org/10.1007/978-1-4613-1401-1_2

DOI: https://doi.org/10.1007/978-1-4613-1401-1_2
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4612-8607-3
Online ISBN: 978-1-4613-1401-1
eBook Packages: Springer Book Archive
