Abstract
To build and deploy massively multiagent systems, we must address one of the fundamental problems of distributed systems: the possibility of partial failures. In this paper, we discuss the issues involved and propose an approach to fault tolerance for massively multiagent systems. The starting idea is to apply replication strategies to agents. Because the criticality of an agent may evolve during the course of computation and problem solving, and because resources are bounded, the number of replicas of each agent must be adapted dynamically and automatically in order to maximize reliability and availability. We describe our approach and the related mechanisms for evaluating the criticality of a given agent and for parameterizing its replication (e.g., the number of replicas). We also report on experiments conducted with our prototype architecture, named DarX.
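To make the idea of criticality-driven replication concrete, the following is a minimal sketch of how a bounded replica budget might be divided among agents. It is not taken from the paper or from DarX: the function name `allocate_replicas`, the proportional rule, the `min_replicas` floor, and the numeric criticality values are all assumptions chosen for illustration.

```python
def allocate_replicas(criticality, budget, min_replicas=1):
    """Split `budget` replicas across agents in proportion to criticality.

    Hypothetical illustration only: the proportional rule and every name
    here are assumptions, not the allocation policy of the paper or DarX.
    `criticality` maps an agent name to a non-negative number.
    """
    if not criticality:
        return {}
    if any(c < 0 for c in criticality.values()):
        raise ValueError("criticality values must be non-negative")

    base = min_replicas * len(criticality)
    if budget < base:
        raise ValueError("budget too small to give every agent min_replicas")

    spare = budget - base
    total = sum(criticality.values())
    if total == 0:
        # No agent is more critical than another: keep only the floor.
        return {agent: min_replicas for agent in criticality}

    # Ideal (fractional) share of the spare replicas for each agent.
    shares = {a: spare * c / total for a, c in criticality.items()}
    alloc = {a: min_replicas + int(s) for a, s in shares.items()}

    # Hand out the replicas lost to rounding, largest remainder first,
    # so that the whole budget is used.
    leftover = budget - sum(alloc.values())
    by_remainder = sorted(
        criticality, key=lambda a: shares[a] - int(shares[a]), reverse=True
    )
    for agent in by_remainder[:leftover]:
        alloc[agent] += 1
    return alloc


# Example: three hypothetical agents sharing a budget of 10 replicas.
plan = allocate_replicas({"broker": 6, "worker": 3, "logger": 1}, budget=10)
```

Calling the function again whenever the criticality estimates change yields a new plan, which is the dynamic adaptation the abstract refers to. How criticality is actually estimated, and how replicas are created or removed, is left to the paper itself.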
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Guessoum, Z., Briot, JP., Faci, N. (2005). Towards Fault-Tolerant Massively Multiagent Systems. In: Ishida, T., Gasser, L., Nakashima, H. (eds) Massively Multi-Agent Systems I. MMAS 2004. Lecture Notes in Computer Science, vol 3446. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11512073_5
DOI: https://doi.org/10.1007/11512073_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-26974-8
Online ISBN: 978-3-540-31889-7
eBook Packages: Computer Science, Computer Science (R0)