Abstract
Petascale machines with close to a million processors will soon be available. Although MPI is the dominant programming model today, some researchers and users wonder (and perhaps even doubt) whether MPI will scale to such large processor counts. In this paper, we examine the issue of how scalable MPI is. We first examine the MPI specification itself, discuss areas with scalability concerns, and show how they can be overcome. We then investigate issues that an MPI implementation must address to be scalable. We ran experiments to measure MPI memory consumption at scale on up to 131,072 processes, or 80% of the IBM Blue Gene/P system at Argonne National Laboratory. Based on the results, we tuned the MPI implementation to reduce its memory footprint. We also discuss issues in application algorithmic scalability to large process counts, as well as features of MPI that enable the use of other techniques to overcome scalability limitations in applications.
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Balaji, P. et al. (2009). MPI on a Million Processors. In: Ropo, M., Westerholm, J., Dongarra, J. (eds) Recent Advances in Parallel Virtual Machine and Message Passing Interface. EuroPVM/MPI 2009. Lecture Notes in Computer Science, vol 5759. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03770-2_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-03769-6
Online ISBN: 978-3-642-03770-2