Autonomic multi-policy optimization in pervasive systems: Overview and evaluation

Published: 4 May 2012

Abstract

This article describes Distributed W-Learning (DWL), a reinforcement learning-based algorithm for collaborative agent-based optimization of pervasive systems. DWL supports optimization towards multiple heterogeneous policies and addresses the challenges arising from the heterogeneity of the agents that are charged with implementing them. DWL learns and exploits the dependencies between agents and between policies to improve overall system performance. Instead of always executing the locally-best action, agents learn how their actions affect their immediate neighbors and execute actions suggested by neighboring agents if their importance exceeds the local action's importance when scaled using a predefined or learned collaboration coefficient. We have evaluated DWL in a simulation of an Urban Traffic Control (UTC) system, a canonical example of the large-scale pervasive systems that we are addressing. We show that DWL outperforms widely deployed fixed-time and simple adaptive UTC controllers under a variety of traffic loads and patterns. Our results also confirm that enabling collaboration between agents is beneficial as is the ability for agents to learn the degree to which it is appropriate for them to collaborate. These results suggest that DWL is a suitable basis for optimization in other large-scale systems with similar characteristics.
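The action-selection rule described above — execute the locally-best action unless a neighbour's suggestion, scaled by a collaboration coefficient, is more important — can be sketched in a few lines. This is an illustrative reading of the abstract only, not the authors' implementation; the names `Suggestion`, `select_action`, and the parameter `c` are hypothetical.

```python
# Illustrative sketch of DWL-style action selection: an agent compares the
# importance (W-value) of its locally-best action against suggestions from
# neighbouring agents, each scaled by a collaboration coefficient c that
# may be predefined or learned. All identifiers are assumptions for this
# sketch, not the paper's API.

from dataclasses import dataclass

@dataclass
class Suggestion:
    action: str
    w: float  # importance (W-value) the suggesting policy assigns


def select_action(local: Suggestion,
                  remote: list[Suggestion],
                  c: float) -> str:
    """Return the local action unless some neighbour's suggestion,
    scaled by the collaboration coefficient c, exceeds its importance."""
    best_action, best_w = local.action, local.w
    for s in remote:
        if c * s.w > best_w:
            best_action, best_w = s.action, c * s.w
    return best_action


# Example: a neighbour's suggestion wins when its scaled importance
# (0.8 * 3.5 = 2.8) exceeds the local W-value (2.0).
print(select_action(Suggestion("extend_green", 2.0),
                    [Suggestion("clear_queue", 3.5)], 0.8))  # clear_queue
```

With a smaller coefficient (say `c = 0.5`, so `0.5 * 3.5 = 1.75 < 2.0`) the same agent keeps its local action, which is how the coefficient governs the degree of collaboration.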



• Published in

  ACM Transactions on Autonomous and Adaptive Systems, Volume 7, Issue 1
  Special section on formal methods in pervasive computing, pervasive adaptation, and self-adaptive systems: Models and algorithms
  April 2012, 365 pages
  ISSN: 1556-4665
  EISSN: 1556-4703
  DOI: 10.1145/2168260
            Copyright © 2012 ACM

            Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery, New York, NY, United States

Publication History

• Received: 1 January 2010
• Revised: 1 November 2010
• Accepted: 1 May 2011
• Published: 4 May 2012

