Adaptive replication of large-scale multi-agent systems: towards a fault-tolerant multi-agent platform

Authors:

Zahia Guessoum,

Jean-Pierre Briot

ACM SIGSOFT Software Engineering Notes, Volume 30, Issue 4

Pages 1 - 6

https://doi.org/10.1145/1082983.1082977

Published: 15 May 2005

Abstract

To construct and deploy large-scale multi-agent systems, we must address one of the fundamental issues of distributed systems: the possibility of partial failures. Fault tolerance is therefore an unavoidable concern for large-scale multi-agent systems. In this paper, we discuss the issues and propose an approach to fault tolerance for multi-agent systems. The starting idea is to apply replication strategies to agents, with the most critical agents being replicated to guard against failures. As the criticality of agents may evolve during the course of computation and problem solving, and as resources are bounded, we need to adapt the number of replicas of each agent dynamically and automatically in order to maximize reliability and availability. We describe our approach and the related mechanisms for evaluating the criticality of a given agent (based on application-level semantic information, e.g., interdependences, and on system-level statistical information, e.g., communication load) and for deciding which strategy to apply (e.g., active or passive replication) and how to parameterize it (e.g., number of replicas). We also report on experiments conducted with our prototype architecture, named DimaX.
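
As a rough illustration of the idea in the abstract, the sketch below blends one application-level signal and one system-level signal into a criticality score, then splits a bounded budget of copies across agents in proportion to that score. It is not the DimaX mechanism: the `AgentStats` fields, the linear weighting, and the largest-remainder allocation are all assumptions chosen only to make the trade-off concrete.

```python
from dataclasses import dataclass


@dataclass
class AgentStats:
    """Hypothetical per-agent inputs; field names are illustrative only."""
    name: str
    dependents: int  # application-level: agents that rely on this one
    messages: int    # system-level: messages handled in the last window


def criticality(agent, max_dependents, max_messages, weight=0.5):
    """Blend the two normalized signals into a score in [0, 1] (assumed formula)."""
    dep = agent.dependents / max_dependents if max_dependents else 0.0
    load = agent.messages / max_messages if max_messages else 0.0
    return weight * dep + (1.0 - weight) * load


def allocate_replicas(agents, budget, weight=0.5):
    """Return {agent name: number of copies}, never exceeding `budget` in total.

    Every agent keeps one copy (itself); the remaining budget is shared in
    proportion to criticality using the largest-remainder method.
    """
    if budget < len(agents):
        raise ValueError("budget must cover at least one copy per agent")
    max_dep = max((a.dependents for a in agents), default=0)
    max_msg = max((a.messages for a in agents), default=0)
    scores = [criticality(a, max_dep, max_msg, weight) for a in agents]
    plan = {a.name: 1 for a in agents}
    total = sum(scores)
    if total == 0:
        return plan  # nothing is critical, so no extra copies are created
    extra = budget - len(agents)
    shares = [s / total * extra for s in scores]
    for a, share in zip(agents, shares):
        plan[a.name] += int(share)  # whole part of each proportional share
    leftover = extra - sum(int(s) for s in shares)
    # Hand the remaining copies to the largest fractional parts first.
    by_fraction = sorted(
        range(len(agents)),
        key=lambda i: shares[i] - int(shares[i]),
        reverse=True,
    )
    for i in by_fraction[:leftover]:
        plan[agents[i].name] += 1
    return plan
```

Re-running `allocate_replicas` whenever the observed statistics change gives a crude form of the dynamic adaptation the abstract calls for; a real platform would also have to choose between active and passive replication for each agent, which this sketch leaves out.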

Index Terms

Adaptive replication of large-scale multi-agent systems: towards a fault-tolerant multi-agent platform
1. Computing methodologies
  1. Artificial intelligence
    1. Distributed artificial intelligence
      1. Multi-agent systems
2. Software and its engineering
  1. Software organization and properties
    1. Extra-functional properties
      1. Software fault tolerance

Information & Contributors

Information

Published In

ACM SIGSOFT Software Engineering Notes Volume 30, Issue 4

July 2005

1514 pages

ISSN:0163-5948

DOI:10.1145/1082983

SELMAS '05: Proceedings of the fourth international workshop on Software engineering for large-scale multi-agent systems
May 2005
92 pages
ISBN:1595931163
DOI:10.1145/1082960

Copyright © 2005 Copyright is held by the owner/author(s).

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Article

Bibliometrics & Citations

Bibliometrics

Article Metrics

Total Citations: 25
Total Downloads: 605
Downloads (Last 12 months): 13
Downloads (Last 6 weeks): 1

Reflects downloads up to 05 Mar 2025

Citations

Cited By

Bora Ş., Dikenelli O. (2016). A centralized self-adaptive fault tolerance approach based on feedback control for multiagent systems. Turkish Journal of Electrical Engineering & Computer Sciences, 24, 4707-4723. Online publication date: 2016.
https://doi.org/10.3906/elk-1405-58
Schatten M., Ševa J., Tomičić I. (2016). A roadmap for scalable agent organizations in the Internet of Everything. Journal of Systems and Software, 115:C, 31-41. Online publication date: 1-May-2016.
https://dl.acm.org/doi/10.1016/j.jss.2016.01.022
Vidaković M., Ivanović M., Mitrović D., Budimac Z. (2013). Extensible Java EE-Based Agent Framework – Past, Present, Future. Multiagent Systems and Applications, 55-88. Online publication date: 2013.
https://doi.org/10.1007/978-3-642-33323-1_3
Potiron K., El Fallah Seghrouchni A., Taillibert P. (2013). Conclusion. From Fault Classification to Fault Tolerance for Multi-Agent Systems, 75-80. Online publication date: 22-Mar-2013.
https://doi.org/10.1007/978-1-4471-5046-6_7
Potiron K., El Fallah Seghrouchni A., Taillibert P. (2013). Multi-Agent System Properties. From Fault Classification to Fault Tolerance for Multi-Agent Systems, 5-10. Online publication date: 22-Mar-2013.
https://doi.org/10.1007/978-1-4471-5046-6_2
Zhang Y., Manisterski E., Kraus S., Subrahmanian V., Peleg D. (2009). Computing the fault tolerance of multi-agent deployment. Artificial Intelligence, 173:3-4, 437-465. Online publication date: 1-Mar-2009.
https://dl.acm.org/doi/10.1016/j.artint.2008.11.007
de C. Gatti M., de Carvalho G., de Paes R., de Lucena C., Briot J. (2007). On Fault Tolerance in Law-Governed Multi-agent Systems. Software Engineering for Multi-Agent Systems V, 1-20. Online publication date: 2007.
https://doi.org/10.1007/978-3-540-73131-3_1
de C. Gatti M., de Lucena C., Briot J., Choren R., Garcia A., Giese H., Leung H., Lucena C., Romanovsky A. (2006). On fault tolerance in law-governed multi-agent systems. Proceedings of the 2006 international workshop on Software engineering for large-scale multi-agent systems, 21-28. Online publication date: 22-May-2006.
https://dl.acm.org/doi/10.1145/1138063.1138068
Briot J., Guessoum Z., Aknine S., Almeida A., Malenfant J., Marin O., Sens P., Faci N., Gatti M., Lucena C., Cheng B., de Lemos R., Fickas S., Garlan D., Magee J., Müller H., Taylor R. (2006). Experience and prospects for various control strategies for self-replicating multi-agent systems. Proceedings of the 2006 international workshop on Self-adaptation and self-managing systems, 37-43. Online publication date: 21-May-2006.
https://dl.acm.org/doi/10.1145/1137677.1137685
Ding W., Engel W., Goode A., Santostasi G. (2016). Declarative modeling cases of cyber physical systems. 2016 International Conference on Logistics, Informatics and Service Sciences (LISS), 1-6. Online publication date: Jul-2016.
https://doi.org/10.1109/LISS.2016.7854561
