Abstract
Large-scale autonomic systems are required to self-optimize with respect to high-level policies that can differ in their priority as well as in their spatial and temporal scope. Decentralized multi-agent systems represent one approach to implementing the required self-optimization capabilities. However, the presence of multiple heterogeneous policies leads to heterogeneity of the agents that implement them. In this paper we evaluate the use of Reinforcement Learning techniques to support the self-optimization of heterogeneous agents towards multiple policies in decentralized systems. We evaluate these techniques in an Urban Traffic Control simulation and compare two approaches to supporting multiple policies. Our results suggest that approaches based on W-learning, which learn separately for each policy and then select between nominated actions based on current action importance, perform better than combining policies into a single learning process over a single state space. The results also indicate that explicitly supporting multiple policies simultaneously can improve waiting times compared with policies dedicated to optimizing for a single vehicle type.
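The W-learning scheme summarized in the abstract (one learning process per policy, with arbitration between the nominated actions) can be illustrated with a minimal sketch. It assumes tabular Q-learning per policy and the W-value update commonly attributed to Humphrys, in which a policy whose nomination was not executed moves its W-value towards the loss it incurred. The class and function names, the learning rate and the discount factor are hypothetical and are not taken from the paper's implementation.

```python
from collections import defaultdict


class PolicyLearner:
    """One tabular Q-learner per policy, plus a W-value per state (names are illustrative)."""

    def __init__(self, actions, alpha=0.1, gamma=0.9):
        self.actions = list(actions)
        self.alpha = alpha            # learning rate (assumed value)
        self.gamma = gamma            # discount factor (assumed value)
        self.q = defaultdict(float)   # (state, action) -> Q-value
        self.w = defaultdict(float)   # state -> W-value (importance of being obeyed)

    def nominate(self, state):
        # Greedy action for this policy; ties go to the first action listed.
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def best_q(self, state):
        return max(self.q[(state, a)] for a in self.actions)

    def update_q(self, state, action, reward, next_state):
        # Standard Q-learning update for the action that was actually executed.
        target = reward + self.gamma * self.best_q(next_state)
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])

    def update_w(self, state, reward, next_state):
        # Loss = what this policy expected from its own nomination
        # minus what it actually obtained after another policy's action ran.
        loss = self.best_q(state) - (reward + self.gamma * self.best_q(next_state))
        self.w[state] += self.alpha * (loss - self.w[state])


def select_action(learners, states):
    """Return (winner index, action): the nomination with the highest W-value wins."""
    winner = max(range(len(learners)), key=lambda i: learners[i].w[states[i]])
    return winner, learners[winner].nominate(states[winner])


def step(learners, states, winner, action, rewards, next_states):
    """After the action is executed, update every policy's learner."""
    for i, learner in enumerate(learners):
        if i != winner:
            # Only policies that were not obeyed adjust their W-value.
            learner.update_w(states[i], rewards[i], next_states[i])
        learner.update_q(states[i], action, rewards[i], next_states[i])
```

Each policy keeps its own state space and reward signal, so in this sketch `states` and `rewards` are lists with one entry per policy. The alternative evaluated in the paper, a single learning process over a combined state space, would instead use one learner whose state is the combination of all per-policy states.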
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Dusparic, I., Cahill, V. (2009). Using Reinforcement Learning for Multi-policy Optimization in Decentralized Autonomic Systems – An Experimental Evaluation. In: González Nieto, J., Reif, W., Wang, G., Indulska, J. (eds) Autonomic and Trusted Computing. ATC 2009. Lecture Notes in Computer Science, vol 5586. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-02704-8_9
DOI: https://doi.org/10.1007/978-3-642-02704-8_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-02703-1
Online ISBN: 978-3-642-02704-8
eBook Packages: Computer Science, Computer Science (R0)