Abstract
A metasystem allows seamless access to a collection of distributed computational resources. Checkpointing is an important service in high-throughput computing, especially for process migration and recovery after a system crash. This paper describes our experience incorporating checkpointing and recovery facilities into a Java-based metasystem. Our case study is suma, a metasystem for executing Java bytecode, both sequential and parallel. We also present preliminary results on checkpointing and recovery overhead for single-node applications.
This work was partially supported by grants from Conicit (project S1-2000000623) and from Universidad Simón Bolívar (direct support for research group GID-25).
Copyright information
© 2001 Springer-Verlag Berlin Heidelberg
Cite this paper
Cardinale, Y., Hernández, E. (2001). Checkpointing Facility on a Metasystem. In: Sakellariou, R., Gurd, J., Freeman, L., Keane, J. (eds) Euro-Par 2001 Parallel Processing. Euro-Par 2001. Lecture Notes in Computer Science, vol 2150. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44681-8_12
DOI: https://doi.org/10.1007/3-540-44681-8_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42495-6
Online ISBN: 978-3-540-44681-1
eBook Packages: Springer Book Archive