Data Locality Aware Strategy for Two-Phase Collective I/O

Filgueira, Rosa; Singh, David E.; Pichel, Juan C.; Isaila, Florin; Carretero, Jesús

doi:10.1007/978-3-540-92859-1_14

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 5336)

Included in the following conference series:

International Conference on High Performance Computing for Computational Science

Abstract

This paper presents Locality-Aware Two-Phase (LATP) I/O, an optimization of the Two-Phase collective I/O technique from ROMIO, the most popular MPI-IO implementation. To increase the locality of file accesses, LATP formulates the distribution of data to processes as a Linear Assignment Problem (LAP) and solves it to find an optimal distribution, an aspect that the original technique does not consider. The assignment is based on the local data that each process stores. Its main purpose is to reduce the number of communication operations involved in the collective I/O operation and, therefore, to improve the overall execution time. Compared with Two-Phase I/O, LATP I/O obtains significant improvements in most of the considered scenarios.
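The assignment idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation. It assumes a hypothetical matrix `local_bytes[p][d]` giving how many bytes of file domain `d` process `p` already holds, and it measures cost in bytes exchanged rather than in number of communication operations. An exhaustive search over permutations stands in for a real LAP solver, so it is only practical for a handful of processes.

```python
from itertools import permutations

def locality_aware_assignment(local_bytes):
    """Return (owner_of_domain, kept_bytes) maximizing data that stays local.

    local_bytes[p][d] is a hypothetical input: bytes of file domain d
    that process p already stores. Exhaustive search is used here only
    for illustration; it is not the solver used in the paper.
    """
    n = len(local_bytes)
    best_perm, best_kept = None, -1
    for perm in permutations(range(n)):
        # perm[d] is the process that would own file domain d
        kept = sum(local_bytes[perm[d]][d] for d in range(n))
        if kept > best_kept:
            best_perm, best_kept = perm, kept
    return list(best_perm), best_kept

def bytes_exchanged(local_bytes, owner_of_domain):
    """Bytes that must move between processes under a given assignment."""
    total = sum(sum(row) for row in local_bytes)
    kept = sum(local_bytes[owner_of_domain[d]][d]
               for d in range(len(owner_of_domain)))
    return total - kept

# Illustrative data only: 3 processes, 3 file domains.
local_bytes = [
    [10, 70, 20],
    [60, 10, 30],
    [30, 20, 50],
]
default_owner = [0, 1, 2]  # assumed baseline: domain i goes to process i
latp_owner, kept = locality_aware_assignment(local_bytes)
```

With this made-up matrix, the locality-aware assignment keeps more data on the process that already holds it than the fixed baseline does, which is the effect the abstract attributes to LATP. A polynomial-time LAP algorithm would replace the permutation loop in any realistic setting.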

Author information

Authors and Affiliations

Department of Computer Science, Universidad Carlos III de Madrid, Spain
Rosa Filgueira, David E. Singh, Juan C. Pichel, Florin Isaila & Jesús Carretero

Editor information

Editors and Affiliations

Faculdade de Engenharia da Universidade do Porto, Rua Dr. Roberto Frias s/n, 4200-465, Porto, Portugal
José M. Laginha M. Palma
University of Toulouse, INP (ENSEEIHT), IRIT, 2 rue Charles-Camichel, 31071, Toulouse CEDEX 7, France
Patrick R. Amestoy
University of Toulouse, INP (ENSEEIHT); IRIT, rue Charles-Camichel, 31071, Toulouse CEDEX 7, France
Michel Daydé
Federal University of Rio de Janeiro, P.O. Box 68511, 21941-972, Rio de Janeiro, RJ, Brazil
Marta Mattoso
Faculty of Engineering, University of Porto, Rua Dr. Roberto Frias, s/n, 4200-465, Porto, Portugal
João Correia Lopes

About this paper

Cite this paper

Filgueira, R., Singh, D.E., Pichel, J.C., Isaila, F., Carretero, J. (2008). Data Locality Aware Strategy for Two-Phase Collective I/O. In: Palma, J.M.L.M., Amestoy, P.R., Daydé, M., Mattoso, M., Lopes, J.C. (eds) High Performance Computing for Computational Science - VECPAR 2008. VECPAR 2008. Lecture Notes in Computer Science, vol 5336. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-92859-1_14

DOI: https://doi.org/10.1007/978-3-540-92859-1_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-92858-4
Online ISBN: 978-3-540-92859-1
eBook Packages: Computer Science, Computer Science (R0)
