Evaluating fault tolerance approaches in multi-agent systems

Stanković, Rade; Štula, Maja; Maras, Josip

doi:10.1007/s10458-015-9320-6

Published: 23 November 2015

Volume 31, pages 151–177, (2017)

Autonomous Agents and Multi-Agent Systems

Rade Stanković¹,
Maja Štula² &
Josip Maras²

Abstract

A multi-agent system (MAS) is a distributed system that consists of multiple agents working together to solve shared problems. Even though MASs are well suited for the development of complex distributed systems, the number of real-world deployments is still small. One of the main reasons for this is that MASs are very fragile. In a typical large-scale MAS, the rate of failure grows with the number of hosts, the number of deployed agents, and the duration of the agents' task execution. For this reason, numerous approaches have been introduced to deal with aspects of failure handling. However, the absence of centralized control and the large number of individual intelligent components make it difficult to detect and treat errors. The risk of uncontrollable fault propagation is high, and such propagation can seriously degrade system performance. Two important factors limit the usage of MASs: (1) existing fault tolerance (FT) approaches are not generic, as each focuses on and improves specific aspects of FT; and (2) despite the plethora of available FT approaches and theories, there is a remarkable lack of general metrics, tools, benchmarks, and experimental methods for formal validation and comparison of existing or newly developed FT approaches. As FT in MASs becomes a well-established field, generalized, standardized evaluation of FT approaches becomes imperative. In this paper, we first present a detailed overview of existing FT solutions, approaches, and techniques in MASs hosted on agent platforms. From that overview, we derive the commonalities in existing research. Next, we present the main contribution of our paper: an evaluation methodology, with a set of metrics, for comparing FT approaches in MASs. We adopt an engineering perspective on the problem, defining a methodology and metrics that are both implementation- and domain-independent. The metrics are formalized with a directed acyclic graph.
By using our methodology, evaluators can select an appropriate FT approach for a targeted MAS application, thus improving MAS usability, stability, and development speed. To show the viability of our approach, we present a case study that compares two FT approaches for a targeted MAS. The case study results show that our methodology can be used to select an appropriate FT approach for the targeted MAS.
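
The abstract does not spell out the metrics or how they are combined, so the following is only an illustrative sketch of what formalizing metrics with a directed acyclic graph and comparing two FT approaches could look like. Every metric name, weight, and score below is a hypothetical placeholder and not taken from the paper; the weighted-sum aggregation is likewise an assumption.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical metric hierarchy: composite metric -> {child metric: weight}.
# Names and weights are illustrative assumptions, not the paper's metrics.
METRIC_DAG = {
    "overall": {"recovery": 0.6, "overhead": 0.4},
    "recovery": {"detection_time": 0.5, "recovery_time": 0.5},
    "overhead": {"cpu_overhead": 0.7, "message_overhead": 0.3},
}


def evaluate(leaf_scores, dag=METRIC_DAG):
    """Propagate normalized leaf scores (0..1, higher is better) up the DAG."""
    # TopologicalSorter expects node -> predecessors; treating children as
    # predecessors guarantees each child is scored before its parent.
    # It raises graphlib.CycleError if the metric graph is not acyclic.
    order = TopologicalSorter(
        {parent: set(children) for parent, children in dag.items()}
    ).static_order()
    scores = dict(leaf_scores)
    for node in order:
        if node in dag:  # composite metric: weighted sum of its children
            scores[node] = sum(w * scores[c] for c, w in dag[node].items())
    return scores


# Hypothetical measurements for two FT approaches on the same target MAS.
approach_a = {"detection_time": 0.8, "recovery_time": 0.6,
              "cpu_overhead": 0.5, "message_overhead": 0.9}
approach_b = {"detection_time": 0.6, "recovery_time": 0.9,
              "cpu_overhead": 0.7, "message_overhead": 0.4}

results = {"A": evaluate(approach_a), "B": evaluate(approach_b)}
best = max(results, key=lambda name: results[name]["overall"])
print(f"preferred approach under these assumed weights: {best}")
```

Because the graph is acyclic, every composite metric is computed exactly once after all of its inputs, and a leaf metric may feed more than one composite without being measured twice. Changing the weights to reflect the priorities of the target application would change which approach is preferred.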

Author information

Authors and Affiliations

Siemens CVC, Put Brodarice 6, 21000, Split, Croatia
Rade Stanković
Faculty of Electrical Engineering, Mechanical Engineering and Naval Architecture, FESB, University of Split, R Boškovića 32, 21000, Split, Croatia
Maja Štula & Josip Maras

Corresponding author

Correspondence to Rade Stanković.

About this article

Cite this article

Stanković, R., Štula, M. & Maras, J. Evaluating fault tolerance approaches in multi-agent systems. Auton Agent Multi-Agent Syst 31, 151–177 (2017). https://doi.org/10.1007/s10458-015-9320-6

Issue Date: January 2017
DOI: https://doi.org/10.1007/s10458-015-9320-6
