Abstract
The cloud brings new possibilities for running traditional HPC applications, given its flexibility and reduced cost. However, running MPI applications in the cloud can appreciably degrade their performance, because the cloud hides its internal network topology and existing topology-aware techniques for optimizing MPI communications cannot be directly applied to virtualized infrastructures. This paper presents the MPI-Performance-Aware-Reallocation method (MPAR), a general approach to improving MPI communications. This new approach: (i) is not tied to any specific software or hardware infrastructure, (ii) is applicable to the cloud, (iii) abstracts the network topology by performing experimental tests, and (iv) is able to improve the performance of users' MPI applications by reallocating the involved MPI processes. MPAR has been demonstrated on cloud infrastructures through the implementation of the Latency-Aware-MPI-Cloud-Scheduler (LAMPICS) layer. LAMPICS improves the latency of MPI communications in clouds without the need to create ad hoc MPI implementations or to modify the source code of users' MPI applications. We have tested LAMPICS with the Sendrecv micro-benchmark provided by the Intel MPI Benchmarks, with performance improvements of up to 70%, and with two real-world applications from the Unified European Applications Benchmark Suite, obtaining performance improvements of up to 26.5%.
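The reallocation idea the abstract describes can be illustrated with a minimal sketch (not the authors' implementation): first probe pairwise latencies between VMs experimentally, then place communicating MPI processes on the VM pairs with the lowest measured latency. The VM names and latency values below are hypothetical, and the greedy pairing stands in for whatever placement policy a scheduler like LAMPICS actually applies.

```python
# Hypothetical all-pairs latency measurements (ms) between four VMs,
# as an experimental probe of the hidden cloud topology might report.
latency = {
    ("vm0", "vm1"): 0.9, ("vm0", "vm2"): 0.3, ("vm0", "vm3"): 1.4,
    ("vm1", "vm2"): 0.5, ("vm1", "vm3"): 0.2, ("vm2", "vm3"): 1.1,
}

def best_pairing(vms, latency):
    """Greedy sketch: walk VM pairs from lowest to highest measured
    latency, pairing each VM at most once, so that communicating MPI
    processes land on the best-connected pairs."""
    remaining = set(vms)
    pairs = []
    for (a, b), _ in sorted(latency.items(), key=lambda kv: kv[1]):
        if a in remaining and b in remaining:
            pairs.append((a, b))
            remaining -= {a, b}
    return pairs

print(best_pairing(["vm0", "vm1", "vm2", "vm3"], latency))
# → [('vm1', 'vm3'), ('vm0', 'vm2')]
```

A real scheduler would weight each VM pair by the measured communication volume between the MPI ranks placed there, rather than pairing VMs unconditionally as this sketch does.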
References
Al-Tawil K, Moritz CA (2001) Performance modeling and evaluation of MPI. J Parallel Distrib Comput 61(2):202–223. doi:10.1006/jpdc.2000.1677
Enkovaara J, Rostgaard C, Mortensen JJ, Chen J, Dułak M, Ferrighi L, Gavnholt J, Glinsvad C, Haikola V, Hansen HA, Kristoffersen HH, Kuisma M, Larsen AH, Lehtovaara L, Ljungberg M, Lopez-Acevedo O, Moses PG, Ojanen J, Olsen T, Petzold V, Romero NA, Stausholm-Møller J, Strange M, Tritsaris GA, Vanin M, Walter M, Hammer B, Häkkinen H, Madsen GKH, Nieminen RM, Nørskov JK, Puska M, Rantala TT, Schiøtz J, Thygesen KS, Jacobsen KW (2010) Electronic structure calculations with GPAW: a real-space implementation of the projector augmented-wave method. J Phys Condens Matter 22(25):253202. doi:10.1088/0953-8984/22/25/253202
Gong Y, He B, Zhong J (2015) Network performance aware MPI collective communication operations in the cloud. IEEE Trans Parallel Distrib Syst 26(11):3079–3089. doi:10.1109/TPDS.2013.96
Hurwitz JG, Feng WC (2005) Analyzing MPI performance over 10-gigabit ethernet. J Parallel Distrib Comput 65(10):1253–1260. doi:10.1016/j.jpdc.2005.04.011
Intel Corporation (2016) Intel MPI Benchmarks, User Guide and Methodology Description. http://software.intel.com/en-us/articles/intel-mpi-benchmarks/. Accessed 17 July 2017
Jackson KR, Ramakrishnan L, Muriki K, Canon S, Cholia S, Shalf J, Wasserman HJ, Wright NJ (2010) Performance analysis of high performance computing applications on the amazon web services cloud. In: 2010 IEEE second international conference on cloud computing technology and science, IEEE, pp 159–168. doi:10.1109/CloudCom.2010.69
Kandalla K, Subramoni H, Vishnu A, Panda DK (2010) Designing topology-aware collective communication algorithms for large scale InfiniBand clusters: case studies with scatter and gather. In: 2010 IEEE international symposium on parallel and distributed processing, workshops and Ph.d. forum (IPDPSW), IEEE, pp 1–8. doi:10.1109/IPDPSW.2010.5470853
Le TT, Rejeb J (2006) A detailed MPI communication model for distributed systems. Future Gener Comput Syst 22(3):269–278. doi:10.1016/j.future.2005.08.005
Liu J, Chandrasekaran B, Wu J, Jiang W, Kini S, Yu W, Buntinas D, Wyckoff P, Panda DK (2003) Performance comparison of MPI implementations over InfiniBand, Myrinet and Quadrics. In: Proceedings of the 2003 ACM/IEEE conference on supercomputing—SC ’03, ACM Press, New York, New York, USA, p 58. doi:10.1145/1048935.1050208
Martinez DR, Cabaleiro JC, Pena TF, Rivera FF, Blanco V (2009) Accurate analytical performance model of communications in MPI applications. In: 2009 IEEE international symposium on parallel and distributed processing, IEEE, pp 1–8. doi:10.1109/IPDPS.2009.5161175
Rak M, Turtur M, Villano U (2014) Early prediction of the cost of HPC application execution in the cloud. In: 2014 16th international symposium on symbolic and numeric algorithms for scientific computing, IEEE, pp 409–416. doi:10.1109/SYNASC.2014.61
Schulz M, Bhatele A, Bremer PT, Gamblin T, Isaacs K, Levine JA, Pascucci V (2012) Creating a tool set for optimizing topology-aware node mappings. In: Brunst H, Müller MS, Nagel WE, Resch MM (eds) Tools for high performance computing 2011, chap. 1, Springer, Berlin, pp 1–12. doi:10.1007/978-3-642-31476-6_1
Skinner D (2005) Performance monitoring of parallel scientific applications. Technical report, Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA. doi:10.2172/881368
Spiridon VL, Slusanschi EI (2013) N-body simulations with GADGET-2. In: 2013 15th international symposium on symbolic and numeric algorithms for scientific computing, IEEE, pp 526–533. doi:10.1109/SYNASC.2013.75
Springel V (2005) The cosmological simulation code GADGET-2. Mon Not R Astron Soc 364(4):1105–1134. doi:10.1111/j.1365-2966.2005.09655.x
Subramoni H, Kandalla K, Vienne J, Sur S, Barth B, Tomko K, Mclay R, Schulz K, Panda D (2011) Design and evaluation of network topology-/speed-aware broadcast algorithms for InfiniBand clusters. In: 2011 IEEE international conference on cluster computing, IEEE, pp 317–325. doi:10.1109/CLUSTER.2011.43
Bull M (2013) Unified European Applications Benchmark Suite. Seventh Framework Programme Research Infrastructures. European High Performance Computing (HPC) service PRACE. http://www.prace-ri.eu/ueabs/. Accessed 13 July 2017
Ye K, Jiang X, Ma R, Yan F (2012) VC-migration: live migration of virtual clusters in the cloud. In: 2012 ACM/IEEE 13th international conference on grid computing, IEEE, pp 209–218. doi:10.1109/Grid.2012.27
Zhai Y, Liu M, Zhai J, Ma X, Chen W (2011) Cloud versus in-house cluster. In: State of the practice reports on—SC ’11, ACM Press, New York, New York, USA, p 1. doi:10.1145/2063348.2063363
Additional information
This work has been supported by FEDER funds and by Spanish Government (MCYT) under projects TIN-2013-41129-P, TIN-2016-76373-P and TEC2014-59402-JIN, and by the Spanish Ministry of Education, Culture and Sports under FPU grants FPU12/05190 and FPU12/02916.
Cite this article
Gomez-Folgar, F., Indalecio, G., Seoane, N. et al. MPI-Performance-Aware-Reallocation: method to optimize the mapping of processes applied to a cloud infrastructure. Computing 100, 211–226 (2018). https://doi.org/10.1007/s00607-017-0573-6