
A layered approach to learning coordination knowledge in multiagent environments


Abstract

Multiagent learning involves the acquisition of cooperative behavior among intelligent agents in order to satisfy joint goals. Reinforcement Learning (RL) is a promising machine learning technique, inspired by early studies of animal learning, in which agents learn from reward feedback rather than explicit supervision. In this paper, we propose a new RL technique, the Two-Level Reinforcement Learning with Communication (2LRL) method, to provide cooperative action selection in a multiagent environment. In 2LRL, learning takes place at two hierarchical levels: at the first level, agents learn to select their target, and at the second level, they learn to select the action directed toward that target. The agents communicate their perceptions to their neighbors and use this information in their decision making. We applied the 2LRL method in a hunter-prey environment and observed satisfactory cooperative behavior.
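
The two-level scheme described above maps naturally onto two tabular Q-functions, one over targets and one over target-directed actions. The Python sketch below is only illustrative: the class name TwoLevelAgent, the epsilon-greedy policies, the single shared reward, and the idea of folding communicated neighbor perceptions into the state are our assumptions, not the paper's exact formulation.

    import random
    from collections import defaultdict

    # A minimal sketch of the two-level idea (2LRL) using plain tabular
    # Q-learning. All names and hyperparameters are illustrative
    # assumptions, not the authors' implementation.

    class TwoLevelAgent:
        def __init__(self, targets, actions, alpha=0.1, gamma=0.9, epsilon=0.1):
            self.q_target = defaultdict(float)  # level 1: value of (state, target)
            self.q_action = defaultdict(float)  # level 2: value of (state, target, action)
            self.targets = targets
            self.actions = actions
            self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon

        def select_target(self, state):
            # Level 1: choose which prey to pursue (epsilon-greedy).
            if random.random() < self.epsilon:
                return random.choice(self.targets)
            return max(self.targets, key=lambda t: self.q_target[(state, t)])

        def select_action(self, state, target):
            # Level 2: choose the primitive move directed at the chosen target.
            if random.random() < self.epsilon:
                return random.choice(self.actions)
            return max(self.actions, key=lambda a: self.q_action[(state, target, a)])

        def update(self, state, target, action, reward, next_state):
            # One-step Q-learning backups at both levels; using a single
            # scalar reward for both levels is a simplifying assumption.
            best_t = max(self.q_target[(next_state, t)] for t in self.targets)
            self.q_target[(state, target)] += self.alpha * (
                reward + self.gamma * best_t - self.q_target[(state, target)])
            best_a = max(self.q_action[(next_state, target, a)] for a in self.actions)
            self.q_action[(state, target, action)] += self.alpha * (
                reward + self.gamma * best_a - self.q_action[(state, target, action)])

    # Illustrative usage: two prey identifiers and four grid moves. The state
    # tuple stands in for the agent's own percept plus communicated ones.
    agent = TwoLevelAgent(targets=[0, 1], actions=["N", "S", "E", "W"])
    s, s2 = ("own-percept", "neighbor-msg"), ("own-percept-2", "neighbor-msg-2")
    t = agent.select_target(s)
    a = agent.select_action(s, t)
    agent.update(s, t, a, reward=-1.0, next_state=s2)

In a grid-world hunter-prey setting, the state passed to such an agent could combine the hunter's own percept with whatever its neighbors communicated; the paper's precise state encoding and reward shaping are not reproduced here.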



Author information


Corresponding author

Correspondence to Faruk Polat.

Additional information

Guray Erus received the B.S. degree in computer engineering in 1999 and the M.S. degree in cognitive sciences in 2002, both from Middle East Technical University (METU), Ankara, Turkey. He is currently a teaching and research assistant at René Descartes University, Paris, France, where he is preparing a doctoral dissertation on object detection in satellite images as a member of the Intelligent Perception Systems group (SIP-CRIP5). His research interests include multi-agent systems and image understanding.

Faruk Polat is a professor in the Department of Computer Engineering at Middle East Technical University, Ankara, Turkey. He received his B.Sc. in computer engineering from Middle East Technical University, Ankara, in 1987, and his M.S. and Ph.D. degrees in computer engineering from Bilkent University, Ankara, in 1989 and 1993, respectively. He conducted research as a visiting NATO science scholar at the Computer Science Department of the University of Minnesota, Minneapolis, in 1992–93. His research interests include artificial intelligence, multi-agent systems, and object-oriented data models.


Cite this article

Erus, G., Polat, F. A layered approach to learning coordination knowledge in multiagent environments. Appl Intell 27, 249–267 (2007). https://doi.org/10.1007/s10489-006-0034-y

