Abstract
In this paper we present an interaction technique for coordinating agents whose rewards are generated by Reinforcement Learning algorithms. Agents that coordinate by exchanging rewards need mechanisms that support them while they interact to discover action policies. Because of the peculiarities of the environment and the objectives of each agent, no single coordination model is guaranteed to converge to an optimal policy. One possibility is to combine existing models so that a mechanism less sensitive to the system variables emerges. The technique described here builds on three previously studied models, in which agents (i) share learning in a predefined cycle of interactions, (ii) cooperate at every interaction, and (iii) cooperate when an agent reaches the goal state. Traffic scenarios were generated to validate the proposed technique. The results show that, even though the computational complexity increases, the gains in convergence make the technique superior to classical Reinforcement Learning approaches.
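The reward-exchange idea behind model (iii) can be illustrated with a minimal sketch. This is not the authors' implementation: the corridor environment, the team size, and all parameter values are illustrative assumptions. Two tabular Q-learning agents learn independently, and whenever one reaches the goal state it shares its value estimates with its teammate.

```python
import random

GOAL, ACTIONS = 5, (-1, 1)        # toy corridor: states 0..GOAL, move left/right
ALPHA, GAMMA, EPS = 0.5, 0.9, 0.1  # illustrative learning parameters

class Agent:
    def __init__(self):
        # tabular Q-values, initialised to zero
        self.q = {(s, a): 0.0 for s in range(GOAL + 1) for a in ACTIONS}

    def act(self, s):
        # epsilon-greedy action selection with random tie-breaking
        if random.random() < EPS:
            return random.choice(ACTIONS)
        best = max(self.q[(s, a)] for a in ACTIONS)
        return random.choice([a for a in ACTIONS if self.q[(s, a)] == best])

    def update(self, s, a, r, nxt):
        # standard Q-learning update
        target = r + GAMMA * max(self.q[(nxt, b)] for b in ACTIONS)
        self.q[(s, a)] += ALPHA * (target - self.q[(s, a)])

def episode(agent, team, max_steps=200):
    s = 0
    for _ in range(max_steps):
        a = agent.act(s)
        nxt = min(max(s + a, 0), GOAL)
        r = 1.0 if nxt == GOAL else 0.0
        agent.update(s, a, r, nxt)
        if nxt == GOAL:
            # model (iii): on reaching the goal state, share learned
            # values with teammates (here: keep each teammate's best estimate)
            for other in team:
                if other is not agent:
                    for k in other.q:
                        other.q[k] = max(other.q[k], agent.q[k])
            return
        s = nxt

random.seed(0)
team = [Agent(), Agent()]
for _ in range(50):
    for ag in team:
        episode(ag, team)
print(team[1].q[(GOAL - 1, 1)])  # value learned partly via shared rewards
```

Merging with `max` is just one possible exchange rule; the paper's point is that different exchange moments (every interaction, fixed cycles, or goal attainment) trade convergence speed against communication cost.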
Acknowledgment
This research was supported by the Araucária Foundation and the National Council for Scientific and Technological Development (CNPq) under grant numbers 378/2014 and 484859/2013-7, respectively.
Copyright information
© 2017 Springer International Publishing AG
Cite this paper
Ribeiro, R., Guisi, D.M., Teixeira, M., Dosciatti, E.R., Borges, A.P., Enembreck, F. (2017). Combination of Interaction Models for Multi-Agents Systems. In: Hammoudi, S., Maciaszek, L., Missikoff, M., Camp, O., Cordeiro, J. (eds) Enterprise Information Systems. ICEIS 2016. Lecture Notes in Business Information Processing, vol 291. Springer, Cham. https://doi.org/10.1007/978-3-319-62386-3_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-62385-6
Online ISBN: 978-3-319-62386-3
eBook Packages: Computer Science (R0)