Multi-policy Optimization in Self-organizing Systems

Dusparic, Ivana; Cahill, Vinny

doi:10.1007/978-3-642-14412-7_6

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 6090)

Included in the following conference series:

International Workshop on Self-Organizing Architectures

Abstract

Self-organizing systems are often implemented as collections of collaborating agents. Such agents may need to optimize their own performance according to multiple policies as well as contribute to the optimization of overall system performance towards a potentially different set of policies. These policies can be heterogeneous, i.e., be implemented on different sets of agents, be active at different times and have different levels of priority, leading to the heterogeneity of the agents of which the system is composed. Numerous biologically-inspired techniques as well as techniques from artificial intelligence have been used to implement such self-organizing systems. In this paper we review the most commonly used techniques for multi-policy optimization in such systems, specifically, those based on ant colony optimization, evolutionary algorithms, particle swarm optimization and reinforcement learning (RL). We analyze the characteristics and existing applications of the reviewed algorithms, assessing their suitability for particular types of optimization problems, based on the environment and policy characteristics. We focus on RL, as it is considered particularly suitable for large-scale self-organizing systems due to its ability to take into account the long-term consequences of the actions executed. Therefore, RL enables the system to learn not only the immediate payoffs of its actions, but also the best actions for the long-term performance of the system. Existing RL implementations mostly focus on optimization towards a single system policy, while most multi-policy RL-based optimization techniques have so far been implemented only on a single agent. We argue that, in order to be more widely utilized as a technique for self-optimization, RL needs to address both multiple policies and multiple agents simultaneously, and analyze the challenges associated with extending existing or developing new RL optimization techniques.
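The claim that RL weighs long-term consequences rather than only immediate payoffs can be illustrated with a minimal sketch. It applies the standard Q-learning update rule to a tiny two-step task. The states, actions, rewards, learning rate and discount factor below are hypothetical values chosen for illustration and do not come from the chapter. For determinism the update is applied in repeated sweeps over every state-action pair instead of over sampled experience.

```python
# Hypothetical two-step task; all names and numbers are illustrative assumptions.
GAMMA = 0.9  # discount factor: how much future reward counts
ALPHA = 0.5  # learning rate

# TRANSITIONS[state][action] = (immediate_reward, next_state)
# A next_state of None means the episode ends.
TRANSITIONS = {
    "start": {"grab": (1.0, None), "wait": (0.0, "later")},
    "later": {"grab": (5.0, None), "wait": (0.0, None)},
}


def q_learning(sweeps=50):
    """Apply the Q-learning update to every (state, action) pair, repeatedly."""
    q = {s: {a: 0.0 for a in acts} for s, acts in TRANSITIONS.items()}
    for _ in range(sweeps):
        for state, acts in TRANSITIONS.items():
            for action, (reward, nxt) in acts.items():
                # Best value obtainable from the successor state (0 if terminal).
                future = max(q[nxt].values()) if nxt is not None else 0.0
                target = reward + GAMMA * future
                q[state][action] += ALPHA * (target - q[state][action])
    return q


def greedy_action(q, state):
    """Return the action with the highest learned value in the given state."""
    return max(q[state], key=q[state].get)


def myopic_action(state):
    """Return the action with the highest immediate reward only."""
    acts = TRANSITIONS[state]
    return max(acts, key=lambda a: acts[a][0])


q = q_learning()
print("myopic choice at start: ", myopic_action("start"))
print("learned choice at start:", greedy_action(q, "start"))
```

In this toy setting the immediate payoff favours "grab" in the start state, while the learned values favour "wait", because the discounted later reward (0.9 × 5 = 4.5) exceeds the immediate reward of 1.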

Author information

Authors and Affiliations

Lero – The Irish Software Engineering Research Centre, Distributed Systems Group, School of Computer Science and Statistics, Trinity College Dublin
Ivana Dusparic & Vinny Cahill

Editor information

Editors and Affiliations

Katholieke Universiteit Leuven, Belgium
Danny Weyns
George Mason University, Fairfax, USA
Sam Malek
Computing Laboratory, University of Kent, CT2 7NF, Canterbury, Kent, UK
Rogério de Lemos
Linnaeus University, Växjö, Sweden
Jesper Andersson

Cite this paper

Dusparic, I., Cahill, V. (2010). Multi-policy Optimization in Self-organizing Systems. In: Weyns, D., Malek, S., de Lemos, R., Andersson, J. (eds) Self-Organizing Architectures. SOAR 2009. Lecture Notes in Computer Science, vol 6090. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-14412-7_6

DOI: https://doi.org/10.1007/978-3-642-14412-7_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-14411-0
Online ISBN: 978-3-642-14412-7
eBook Packages: Computer Science, Computer Science (R0)
