Abstract
In a fault-tolerant system that uses rollback-recovery protocols, performance degrades as the volume of saved fault-tolerance information grows. To avoid this degradation, we propose a novel multi-agent-based garbage-collection technique that deletes useless fault-tolerance information. We define and design a garbage-collection agent that garbage-collects fault-tolerance information, an information agent that manages fault-tolerance information, and a facilitator agent that handles communication between agents, and we propose a garbage-collection algorithm (GCA) that uses these agents. Our rollback-recovery method is based on an independent checkpointing protocol and a sender-based pessimistic message logging protocol. To verify the correctness of the garbage-collection algorithm, we inject failures during operation and compare the domain knowledge of the proposed system using GCA with the domain knowledge of another system without GCA.
This work was supported by grant No. R01-2001-000-00354-0 from the Korea Science & Engineering Foundation.
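As a rough illustration of what "useless fault-tolerance information" can mean under sender-based message logging, the following minimal sketch shows a sender discarding logged messages once the receiver's latest checkpoint covers them. It is not the paper's GCA and does not model the agents; the class names, the per-receiver sequence number, and the `collect` and `replay_for` methods are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class LoggedMessage:
    receiver: str   # hypothetical process identifier
    seq: int        # hypothetical per-receiver send sequence number
    payload: str


@dataclass
class SenderLog:
    """Sender-side message log (illustrative, not the paper's data structure)."""
    entries: list = field(default_factory=list)

    def record(self, receiver, seq, payload):
        # Log every outgoing message so it can be replayed if the receiver fails.
        self.entries.append(LoggedMessage(receiver, seq, payload))

    def collect(self, receiver, checkpointed_seq):
        """Drop messages already covered by the receiver's latest checkpoint.

        Assumes the receiver reports the highest sequence number whose effects
        are included in that checkpoint. Returns the number of entries removed.
        """
        before = len(self.entries)
        self.entries = [
            m for m in self.entries
            if not (m.receiver == receiver and m.seq <= checkpointed_seq)
        ]
        return before - len(self.entries)

    def replay_for(self, receiver):
        # Messages that would still have to be resent after a receiver rollback.
        return sorted(
            (m for m in self.entries if m.receiver == receiver),
            key=lambda m: m.seq,
        )
```

In this sketch, a receiver that rolls back to its latest checkpoint only needs the messages returned by `replay_for`, so everything `collect` removes is safe to delete under the stated assumption.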
Copyright information
© 2003 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Lee, D.W. et al. (2003). Managing Fault Tolerance Information in Multi-agents Based Distributed Systems. In: Liu, J., Cheung, Ym., Yin, H. (eds) Intelligent Data Engineering and Automated Learning. IDEAL 2003. Lecture Notes in Computer Science, vol 2690. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-45080-1_15
DOI: https://doi.org/10.1007/978-3-540-45080-1_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-40550-4
Online ISBN: 978-3-540-45080-1
eBook Packages: Springer Book Archive