
Parallel Checkpointing on a Grid-Enabled Java Platform

  • Conference paper
Advances in Grid Computing - EGC 2005 (EGC 2005)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 3470)

Abstract

This article describes the implementation of checkpointing and recovery services in a Java-based distributed platform. Our case study is SUMA, a distributed execution platform implemented on top of Grid services. SUMA is designed to execute Java bytecode, with additional support for parallel processing. The SUMA middleware is built on commodity software and communication technologies, including Java, CORBA, and Globus services. The implementation of SUMA that runs on top of Globus services is called SUMA/G.

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Cardinale, Y., Hernández, E. (2005). Parallel Checkpointing on a Grid-Enabled Java Platform. In: Sloot, P.M.A., Hoekstra, A.G., Priol, T., Reinefeld, A., Bubak, M. (eds) Advances in Grid Computing - EGC 2005. EGC 2005. Lecture Notes in Computer Science, vol 3470. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11508380_75

  • DOI: https://doi.org/10.1007/11508380_75

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-26918-2

  • Online ISBN: 978-3-540-32036-4

  • eBook Packages: Computer Science, Computer Science (R0)
