Abstract
Current parallel architectures take advantage of new hardware developments, such as the use of multicore machines in clusters and grids. The availability of such resources may also be dynamic. Therefore, some kind of adaptation is required from both the applications and the resource manager to achieve good resource utilization. Malleable applications provide a certain flexibility, adapting themselves on the fly according to variations in the amount of available resources. However, enabling the execution of this kind of application requires support from the resource manager, which introduces significant complexities such as special allocation and scheduling policies. In this context, we investigate techniques to provide malleable behavior in MPI applications and the impact of this support on a resource manager. Our study deals with two approaches to obtain malleability: dynamic CPUSETs mapping and dynamic MPI, using the OAR resource manager. The validation experiments were conducted on the Grid5000 platform. The testbed combines the load of real workload traces with the execution of MPI benchmarks. Our results show that a dynamic approach using malleable jobs can lead to an improvement of almost 25% in resource utilization compared to a non-dynamic approach. Furthermore, the complexity that malleability support adds to the resource manager seems to be outweighed by the improvement achieved.
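One common way to realize the dynamic-MPI side mentioned in the abstract is MPI-2 dynamic process creation. The sketch below is a minimal, hedged illustration of that mechanism only, not the paper's actual implementation: a running job spawns additional worker processes when the resource manager grants more resources. The worker executable name ("worker") and the number of spawned processes are illustrative assumptions.

```c
/* Minimal sketch: growing a running MPI job with MPI-2 dynamic
 * process creation. The "worker" binary and the count of 4 extra
 * processes are illustrative assumptions, not taken from the paper. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Suppose the resource manager has signalled that 4 extra CPUs
     * became available: spawn 4 worker processes. */
    int extra = 4;
    MPI_Comm children;
    MPI_Comm_spawn("worker", MPI_ARGV_NULL, extra, MPI_INFO_NULL,
                   0, MPI_COMM_WORLD, &children, MPI_ERRCODES_IGNORE);

    /* Merge parents and children into one intracommunicator so the
     * existing ranks can redistribute work over the enlarged job. */
    MPI_Comm expanded;
    MPI_Intercomm_merge(children, 0, &expanded);

    int size;
    MPI_Comm_size(expanded, &size);
    if (rank == 0)
        printf("job expanded to %d processes\n", size);

    MPI_Comm_free(&expanded);
    MPI_Comm_free(&children);
    MPI_Finalize();
    return 0;
}
```

The CPUSETs mapping approach, by contrast, would keep the set of MPI processes fixed and let the resource manager remap them onto a larger or smaller set of cores, requiring no change to the application code.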
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Cera, M.C., Georgiou, Y., Richard, O., Maillard, N., Navaux, P.O.A. (2010). Supporting Malleability in Parallel Architectures with Dynamic CPUSETs Mapping and Dynamic MPI. In: Kant, K., Pemmaraju, S.V., Sivalingam, K.M., Wu, J. (eds.) Distributed Computing and Networking. ICDCN 2010. Lecture Notes in Computer Science, vol. 5935. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-11322-2_26
DOI: https://doi.org/10.1007/978-3-642-11322-2_26
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-11321-5
Online ISBN: 978-3-642-11322-2