Fault-Management in P2P-MPI

International Journal of Parallel Programming

Abstract

This paper presents a study of fault management in a grid middleware, our home-grown software P2P-MPI. The framework is MPJ compliant, lets users execute message-passing parallel programs, and aims to support environments built from commodity hardware. Program execution in such environments is failure prone, so particular attention must be paid to fault management. Fault management covers two issues: fault tolerance and fault detection. Fault tolerance deals with program execution: P2P-MPI provides a transparent fault-tolerance facility based on replication of computations. Fault detection concerns the monitoring of program execution by the system, which is done through a distributed set of modules called failure detectors. The contribution of this paper is twofold. The first contribution is an evaluation of the failure probability of an application as a function of the replication degree. The failure probability depends on the execution length, so we propose a model to evaluate the duration of a replicated parallel program. We then give an expression for the replication degree required to keep the failure probability of an execution under a given threshold. The second contribution is a study of the advantages and drawbacks of several fault detection systems found in the literature. Our evaluation criteria are the reliability of the failure detection service and the failure detection speed. We retain the binary round-robin protocol for its failure detection speed, and we propose a variant of this protocol that is more reliable than the application execution in any case. Experiments involving up to 256 processes, carried out on Grid’5000, show that the real detection times closely match the predictions.
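To make the link between replication degree and failure probability concrete, the following is a minimal sketch and not the model developed in the paper. It assumes that replica failures are independent and exponentially distributed with a common rate, that the application is lost as soon as every copy of at least one logical process has failed, and that the execution duration is fixed (the paper instead models how the duration itself changes with replication). The function names and parameters are hypothetical.

```python
import math


def failure_probability(n_procs, replication, rate, duration):
    """Probability that a replicated run fails, under the assumed model.

    Assumptions (not taken from the paper): independent exponential
    failures with the given rate, and a fixed execution duration.
    """
    # Probability that a single replica fails before the run ends.
    p_replica = 1.0 - math.exp(-rate * duration)
    # A logical process is lost only if all of its copies fail.
    p_process_lost = p_replica ** replication
    # The run fails if at least one logical process is lost.
    return 1.0 - (1.0 - p_process_lost) ** n_procs


def min_replication(n_procs, rate, duration, threshold, max_degree=64):
    """Smallest replication degree keeping the failure probability
    at or below the threshold, or None if max_degree is not enough."""
    for degree in range(1, max_degree + 1):
        if failure_probability(n_procs, degree, rate, duration) <= threshold:
            return degree
    return None


if __name__ == "__main__":
    # Hypothetical setting: 256 processes, one-hour run,
    # one failure per 1e5 seconds per host on average.
    degree = min_replication(256, rate=1e-5, duration=3600, threshold=0.01)
    print("replication degree needed:", degree)
```

Under these assumptions the failure probability decreases geometrically with the replication degree, so a small degree is usually sufficient even for a few hundred processes; a model in which replication lengthens the execution would shift the required degree upward.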

Author information

Corresponding author

Correspondence to Stéphane Genaud.

About this article

Cite this article

Genaud, S., Jeannot, E. & Rattanapoka, C. Fault-Management in P2P-MPI. Int J Parallel Prog 37, 433–461 (2009). https://doi.org/10.1007/s10766-009-0115-8

  • DOI: https://doi.org/10.1007/s10766-009-0115-8
