Using Reinforcement Learning for Multi-policy Optimization in Decentralized Autonomic Systems – An Experimental Evaluation

Dusparic, Ivana; Cahill, Vinny

doi:10.1007/978-3-642-02704-8_9

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 5586)

Included in the following conference series:

International Conference on Autonomic and Trusted Computing

Abstract

Large-scale autonomic systems are required to self-optimize with respect to high-level policies that can differ in their priority as well as in their spatial and temporal scope. Decentralized multi-agent systems represent one approach to implementing the required self-optimization capabilities. However, the presence of multiple heterogeneous policies leads to heterogeneity of the agents that implement them. In this paper we evaluate the use of Reinforcement Learning techniques to support the self-optimization of heterogeneous agents towards multiple policies in decentralized systems. We evaluate these techniques in an Urban Traffic Control simulation and compare two approaches to supporting multiple policies. Our results suggest that approaches based on W-learning, which learn separately for each policy and then select between nominated actions based on current action importance, perform better than combining policies into a single learning process over a single state space. The results also indicate that explicitly supporting multiple policies simultaneously can improve waiting times over policies dedicated to optimizing for a single vehicle type.
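
The selection step that the abstract attributes to W-learning-based approaches can be illustrated with a minimal sketch. It is not the authors' implementation: the class `PolicyLearner`, the function `select_action`, the state and action names, and all numeric values are hypothetical, and the learning updates for the Q-values and W-values are omitted. The sketch assumes each policy keeps its own Q-values over its own state space plus a per-state importance value W, nominates its greedy action, and the nomination with the highest W is executed.

```python
from typing import Dict, Hashable, Iterable, List, Tuple

State = Hashable
Action = str


class PolicyLearner:
    """One learner per policy, with its own state space (hypothetical structure)."""

    def __init__(self, name: str, actions: Iterable[Action]) -> None:
        self.name = name
        self.actions: List[Action] = list(actions)
        self.q: Dict[Tuple[State, Action], float] = {}  # learned action values
        self.w: Dict[State, float] = {}                 # learned importance per state

    def nominate(self, state: State) -> Tuple[Action, float]:
        """Return this policy's greedy action and how important it is right now."""
        best = max(self.actions, key=lambda a: self.q.get((state, a), 0.0))
        return best, self.w.get(state, 0.0)


def select_action(learners: List[PolicyLearner],
                  states: Dict[str, State]) -> Tuple[str, Action]:
    """Execute the nominated action of the policy with the highest W-value."""
    nominations = [(l.name, *l.nominate(states[l.name])) for l in learners]
    winner, action, _ = max(nominations, key=lambda n: n[2])
    return winner, action


# Illustrative values only: a public-transport policy and a general-traffic policy
# at one junction, each observing its own state.
bus = PolicyLearner("bus", ["ns_green", "ew_green"])
car = PolicyLearner("car", ["ns_green", "ew_green"])
bus.q[("bus_waiting_ew", "ew_green")] = 5.0
bus.w["bus_waiting_ew"] = 2.0
car.q[("queue_ns", "ns_green")] = 3.0
car.w["queue_ns"] = 0.5

winner, action = select_action([bus, car], {"bus": "bus_waiting_ew", "car": "queue_ns"})
```

Keeping a separate table per policy means no learner has to represent the joint state space, which is the contrast the abstract draws with combining all policies into a single learning process.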

Author information

Authors and Affiliations

Lero – The Irish Software Engineering Research Centre, Distributed Systems Group, School of Computer Science and Statistics, Trinity College Dublin, Ireland
Ivana Dusparic & Vinny Cahill

Editor information

Editors and Affiliations

Information Security Institute, Queensland University of Technology, GPO Box 2434, QLD 4001, Brisbane, Australia
Juan González Nieto
Department of Software Engineering and Programming Languages, Institute of Computer Science, University of Augsburg, 86135, Augsburg, Germany
Wolfgang Reif
School of Information Science and Engineering, Central South University, 410083, Changsha, Hunan Province, P. R. China
Guojun Wang
School of Information Technology and Electrical Engineering, The University of Queensland, QLD 4072, Brisbane, Australia
Jadwiga Indulska

About this paper

Cite this paper

Dusparic, I., Cahill, V. (2009). Using Reinforcement Learning for Multi-policy Optimization in Decentralized Autonomic Systems – An Experimental Evaluation. In: González Nieto, J., Reif, W., Wang, G., Indulska, J. (eds) Autonomic and Trusted Computing. ATC 2009. Lecture Notes in Computer Science, vol 5586. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-02704-8_9

DOI: https://doi.org/10.1007/978-3-642-02704-8_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-02703-1
Online ISBN: 978-3-642-02704-8
eBook Packages: Computer Science, Computer Science (R0)
