Abstract
Emergent effects can arise in multi-agent systems (MAS) where execution is decentralized and reliant on local information. These effects may range from minor behavioral deviations to catastrophic system failures. To formally define these effects, we identify misalignments between the global inherent specification (the true specification) and its local approximation (such as the configuration of different reward components or observations). Using established safety terminology, we develop a framework to understand these emergent effects. To showcase the resulting implications, we use two broadly configurable exemplary gridworld scenarios, where an insufficient specification, derived independently for each agent, leads to unintended behavior deviations. Recognizing that a global adaptation might not always be feasible, we propose adjusting the underlying parameterizations to mitigate these issues, thereby improving the system’s alignment and reducing the risk of emergent failures.
Notes
- 1.
In practice, AI components might be prone to solving simple tasks in exceedingly complex ways, which is directly associated with their perceived creativity and causes unconventional solutions that surprise the user [19].
- 2.
Note that instead of treating the environment as a different entity, we might as well model the environment as another agent in this formalism.
- 3.
We exemplify this process in Sect. 4 with a simple gridworld navigation task.
- 4.
Note that in most settings, the joint policy returns a tuple of the actions returned by all individual policies. However, since in the general (partially observable) case we also need to adjust the observations passed on to each individual policy from the global state, we again use the \(\otimes \) operator here and overload it to not only handle component composition but also state information decomposition and action composition, which are not inherently identical tasks but—as we argue—closely related tasks nonetheless.
- 5.
Note that the perhaps even more common issue in developing any system is that we rarely have a perfect specification. We thus require an even earlier approximation at this step, i.e., we approximate the system we think we want via the specification we can actually write down. However, the inaccuracies of this approximation are again left to the various subfields concerned with the many facets of system design.
- 6.
Refer to https://github.com/philippaltmann/EMAS for our full implementation.
- 7.
Note that partial observability, besides being a common assumption in multi-agent reinforcement learning, has been shown to improve the agents’ generalization to shifting environments [1] and is commonly used for continuous robotic control tasks [26]. Therefore, it could be considered a generally preferred implementation in practice.
- 8.
Of the initial ten random seeds, training converged for only six, indicating the need for more sophisticated training algorithms in the future.
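The composition operator \(\otimes\) described in note 4 can be sketched in code: a joint policy that both decomposes global state information into per-agent observations and composes the resulting local actions into a joint action tuple. This is a minimal illustrative sketch, not the paper's implementation; all names (`JointPolicy`, `observe_fns`) are hypothetical.

```python
from typing import Callable, Sequence, Tuple

State = Tuple[int, ...]          # e.g. concatenated gridworld coordinates
Observation = Tuple[int, ...]
Action = int


class JointPolicy:
    """Sketch of the overloaded composition operator from note 4."""

    def __init__(
        self,
        policies: Sequence[Callable[[Observation], Action]],
        observe_fns: Sequence[Callable[[State], Observation]],
    ):
        assert len(policies) == len(observe_fns)
        self.policies = policies
        self.observe_fns = observe_fns

    def __call__(self, state: State) -> Tuple[Action, ...]:
        # State decomposition: each agent receives only its local observation.
        observations = [obs(state) for obs in self.observe_fns]
        # Action composition: local decisions form the joint action tuple.
        return tuple(pi(o) for pi, o in zip(self.policies, observations))


# Two agents on a 1-D grid: each observes only its own coordinate and
# moves toward the origin (-1) or stays put (0).
joint = JointPolicy(
    policies=[lambda o: -1 if o[0] > 0 else 0] * 2,
    observe_fns=[lambda s: (s[0],), lambda s: (s[1],)],
)
print(joint((3, 0)))  # -> (-1, 0)
```

Under partial observability (note 7), the `observe_fns` are where local information loss enters: each agent's view of the global state is restricted, which is precisely where the local approximation can diverge from the global specification.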
References
Altmann, P., Ritz, F., Feuchtinger, L., Nüßlein, J., Linnhoff-Popien, C., Phan, T.: CROP: towards distributional-shift robust reinforcement learning using compact reshaped observation processing. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3414–3422 (2023)
Amodei, D., Olah, C., Steinhardt, J., Christiano, P., Schulman, J., Mané, D.: Concrete problems in AI safety. arXiv preprint arXiv:1606.06565 (2016)
Anderson, P.W.: More is different. Science 177(4047), 393–396 (1972)
Bresenham, J.E.: Algorithm for computer control of a digital plotter. In: Seminal graphics: pioneering efforts that shaped the field, pp. 1–6. ACM, New York (1998)
Burton, S., Herd, B.: Addressing uncertainty in the safety assurance of machine-learning. Front. Comput. Sci. 5, 1132580 (2023)
Christiano, P.F., Leike, J., Brown, T., Martic, M., Legg, S., Amodei, D.: Deep reinforcement learning from human preferences. Adv. Neural Inf. Process. Syst. 30, 4302–4310 (2017)
Felten, F., et al.: A toolkit for reliable benchmarking and research in multi-objective reinforcement learning. In: Proceedings of the 37th Conference on Neural Information Processing Systems (NeurIPS 2023) (2023)
Foerster, J.N., Farquhar, G., Afouras, T., Nardelli, N., Whiteson, S.: Counterfactual multi-agent policy gradients. In: Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, pp. 2974–2982 (2018)
Fromm, J.: Types and forms of emergence. arXiv preprint arXiv:nlin/0506028 (2005)
Gabriel, I.: Artificial intelligence, values, and alignment. Minds Mach. 30(3), 411–437 (2020)
Haider, T., Roscher, K., Herd, B., Schmoeller Roza, F., Burton, S.: Can you trust your agent? The effect of out-of-distribution detection on the safety of reinforcement learning systems. In: Proceedings of the 39th ACM/SIGAPP Symposium on Applied Computing, pp. 1569–1578 (2024)
Haider, T., Roscher, K., Schmoeller da Roza, F., Günnemann, S.: Out-of-distribution detection for reinforcement learning agents with probabilistic dynamics models. In: Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems, pp. 851–859 (2023)
Hayes, C.F., et al.: A practical guide to multi-objective reinforcement learning and planning. Auton. Agent. Multi-Agent Syst. 36(1), 26 (2022)
Hayes, C.F., et al.: A brief guide to multi-objective reinforcement learning and planning. In: Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems, pp. 1988–1990 (2023)
Hendrycks, D., Carlini, N., Schulman, J., Steinhardt, J.: Unsolved problems in ML safety. arXiv preprint arXiv:2109.13916 (2021)
Honderich, T.: The Oxford Companion to Philosophy. OUP, Oxford (1995)
ISO 21448:2022(E): Road Vehicles – Safety of the Intended Functionality. Standard, International Organization for Standardization (2022)
Krakovna, V., et al.: Specification gaming: the flip side of AI ingenuity. DeepMind Blog 3 (2020)
Lehman, J., et al.: The surprising creativity of digital evolution: a collection of anecdotes from the evolutionary computation and artificial life research communities. Artif. Life 26(2), 274–306 (2020)
Leike, J., Krueger, D., Everitt, T., Martic, M., Maini, V., Legg, S.: Scalable agent alignment via reward modeling: a research direction. arXiv preprint arXiv:1811.07871 (2018)
Leike, J., et al.: AI safety gridworlds. arXiv preprint arXiv:1711.09883 (2017)
Littman, M.L.: Markov games as a framework for multi-agent reinforcement learning. In: Machine Learning Proceedings 1994, pp. 157–163. Elsevier (1994)
Mnih, V., et al.: Asynchronous methods for deep reinforcement learning. In: International Conference on Machine Learning, pp. 1928–1937. PMLR (2016)
Ng, A., Harada, D., Russell, S.: Policy invariance under reward transformations: theory and application to reward shaping. In: Proceedings of the Sixteenth International Conference on Machine Learning (ICML), pp. 278–287 (1999)
Ouyang, L., et al.: Training language models to follow instructions with human feedback. Adv. Neural Inf. Process. Syst. 35, 27730–27744 (2022)
Plappert, M., et al.: Multi-goal reinforcement learning: challenging robotics environments and request for research. arXiv preprint arXiv:1802.09464 (2018)
Puterman, M.L.: Markov decision processes. Handbooks Oper. Res. Manag. Sci. 2, 331–434 (1990)
Reymond, M., Bargiacchi, E., Nowé, A.: Pareto conditioned networks. arXiv preprint arXiv:2204.05036 (2022)
Ritz, F., et al.: Specification aware multi-agent reinforcement learning. In: Rocha, A.P., Steels, L., van den Herik, J. (eds.) ICAART 2021, pp. 3–21. Springer, Cham (2022). https://doi.org/10.1007/978-3-031-10161-8_1
Roijers, D.M., Vamplew, P., Whiteson, S., Dazeley, R.: A survey of multi-objective sequential decision-making. J. Artif. Intell. Res. 48, 67–113 (2013)
Russell, S.: Human Compatible: Artificial Intelligence and the Problem of Control. Allen Lane (2019)
Salimibeni, M., Mohammadi, A., Malekzadeh, P., Plataniotis, K.N.: Multi-agent reinforcement learning via adaptive Kalman temporal difference and successor representation. Sensors 22(4), 1393 (2022)
Sedlmeier, A., Gabor, T., Phan, T., Belzner, L., Linnhoff-Popien, C.: Uncertainty-based out-of-distribution detection in deep reinforcement learning. arXiv preprint arXiv:1901.02219 (2019)
Sutton, R.S., Barto, A.G.: Reinforcement Learning: An Introduction. MIT Press (2018)
Tisue, S., Wilensky, U.: Netlogo: a simple environment for modeling complexity. In: International Conference on Complex Systems, vol. 21, pp. 16–21. Citeseer (2004)
Towers, M., et al.: Gymnasium (2023). https://github.com/Farama-Foundation/Gymnasium
Van Moffaert, K., Nowé, A.: Multi-objective reinforcement learning using sets of Pareto dominating policies. J. Mach. Learn. Res. 15(1), 3483–3512 (2014)
Wilensky, U., Rand, W.: An Introduction to Agent-Based Modeling: Modeling Natural, Social, and Engineered Complex Systems with NetLogo. MIT Press (2015)
Wirsing, M., Hölzl, M., Tribastone, M., Zambonelli, F.: ASCENS: engineering autonomic service-component ensembles. In: International Symposium on Formal Methods for Components and Objects, pp. 1–24 (2011)
Wolpert, D.H., Tumer, K.: Optimal payoff functions for members of collectives. Adv. Complex Syst. 04(02n03), 265–279 (2001)
Acknowledgments
This work was funded by the Bavarian Ministry for Economic Affairs, Regional Development and Energy as part of a project to support the thematic development of the Institute for Cognitive Systems.
Copyright information
© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Altmann, P. et al. (2025). Emergence in Multi-agent Systems: A Safety Perspective. In: Margaria, T., Steffen, B. (eds) Leveraging Applications of Formal Methods, Verification and Validation. Rigorous Engineering of Collective Adaptive Systems. ISoLA 2024. Lecture Notes in Computer Science, vol 15220. Springer, Cham. https://doi.org/10.1007/978-3-031-75107-3_7
DOI: https://doi.org/10.1007/978-3-031-75107-3_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-75106-6
Online ISBN: 978-3-031-75107-3