
Towards Checkpointing Grid Architecture

  • Conference paper
Parallel Processing and Applied Mathematics (PPAM 2005)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 3911)

Abstract

Contemporary Grid environments are characterized by steadily increasing virtualization and distribution of resources. This trend places greater demands on load-balancing and fault-tolerance capabilities. The checkpoint-restart mechanism appears to be the most intuitive tool for meeting these requirements. One of the goals of the CoreGRID Network of Excellence is to define a high-level checkpoint-restart Grid Service and to position it among the other Grid Services. We aim to define both an abstract model of that service and a lower-layer interface that will allow the service to cooperate with diverse existing and future checkpoint-restart tools. This paper is the first step towards that goal. It sketches the overall architecture of the proposed service and its connection to the actual checkpoint-restart tools. It also describes the work on the low-level checkpoint-restart tools to be used in the “proof of concept” implementation and integration.
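The two-layer split described in the abstract (a high-level service on top, a common lower-layer interface implemented by each tool underneath) can be illustrated with a minimal sketch. Everything named here is hypothetical and not taken from the paper: `CheckpointTool`, `InMemoryTool`, `CheckpointService`, and the method names are assumptions chosen only to show how a service could dispatch to interchangeable tools.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field


class CheckpointTool(ABC):
    """Hypothetical lower-layer interface that each low-level tool would implement."""

    @abstractmethod
    def checkpoint(self, job_id: str) -> str:
        """Save the job's state and return an opaque image identifier."""

    @abstractmethod
    def restart(self, image_id: str) -> str:
        """Resume a job from an image and return the job identifier."""


class InMemoryTool(CheckpointTool):
    """Stand-in for a real tool; it only records which job each image belongs to."""

    def __init__(self) -> None:
        self._images: dict[str, str] = {}

    def checkpoint(self, job_id: str) -> str:
        image_id = f"img-{len(self._images)}"
        self._images[image_id] = job_id
        return image_id

    def restart(self, image_id: str) -> str:
        return self._images[image_id]


@dataclass
class CheckpointService:
    """Hypothetical high-level service that dispatches to registered tools."""

    tools: dict[str, CheckpointTool] = field(default_factory=dict)

    def register(self, name: str, tool: CheckpointTool) -> None:
        self.tools[name] = tool

    def checkpoint(self, tool_name: str, job_id: str) -> tuple[str, str]:
        # The handle records which tool produced the image, so restart
        # can be routed back to the same tool.
        return tool_name, self.tools[tool_name].checkpoint(job_id)

    def restart(self, handle: tuple[str, str]) -> str:
        tool_name, image_id = handle
        return self.tools[tool_name].restart(image_id)
```

In this sketch a caller would register a tool under a name, call `checkpoint` to obtain a handle, and later pass that handle to `restart`; supporting a new tool would mean adding one more `CheckpointTool` subclass without touching the service.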

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Jankowski, G., Kovacs, J., Meyer, N., Januszewski, R., Mikolajczak, R. (2006). Towards Checkpointing Grid Architecture. In: Wyrzykowski, R., Dongarra, J., Meyer, N., Waśniewski, J. (eds) Parallel Processing and Applied Mathematics. PPAM 2005. Lecture Notes in Computer Science, vol 3911. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11752578_79

  • DOI: https://doi.org/10.1007/11752578_79

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-34141-3

  • Online ISBN: 978-3-540-34142-0

  • eBook Packages: Computer Science, Computer Science (R0)
