Abstract
Self-organizing systems are often implemented as collections of collaborating agents. Such agents may need to optimize their own performance according to multiple policies while also contributing to the optimization of overall system performance towards a potentially different set of policies. These policies can be heterogeneous, i.e., they can be implemented on different sets of agents, be active at different times and have different levels of priority, which in turn makes the agents that compose the system heterogeneous. Numerous biologically inspired techniques, as well as techniques from artificial intelligence, have been used to implement such self-organizing systems. In this paper we review the most commonly used techniques for multi-policy optimization in such systems, specifically those based on ant colony optimization, evolutionary algorithms, particle swarm optimization and reinforcement learning (RL). We analyze the characteristics and existing applications of the reviewed algorithms and assess their suitability for particular types of optimization problems, based on the characteristics of the environment and of the policies. We focus on RL, as it is considered particularly suitable for large-scale self-organizing systems due to its ability to take into account the long-term consequences of the actions executed: RL enables the system to learn not only the immediate payoffs of its actions, but also the best actions for the long-term performance of the system. Existing RL implementations mostly focus on optimization towards a single system policy, while most multi-policy RL-based optimization techniques have so far been implemented only on a single agent. We argue that, in order to be more widely utilized as a technique for self-optimization, RL needs to address both multiple policies and multiple agents simultaneously, and we analyze the challenges associated with extending existing RL optimization techniques or developing new ones.
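The distinction the abstract draws between immediate payoffs and long-term performance can be illustrated with a minimal sketch. It is not taken from the chapter: it applies the standard tabular Q-learning update to a toy two-state problem, and the states, actions, rewards, discount factor and learning rate are all assumptions chosen for illustration. For determinism it sweeps over every state-action pair instead of sampling experience, which a real agent would have to do.

```python
# Toy deterministic problem (hypothetical, for illustration only).
# TRANSITIONS[state][action] = (reward, next_state); next_state None ends the episode.
TRANSITIONS = {
    0: {0: (1.0, None),    # action 0: small payoff now, episode ends
        1: (0.0, 1)},      # action 1: no payoff now, but moves to state 1
    1: {0: (10.0, None),   # from state 1 either action yields a large payoff
        1: (10.0, None)},
}


def q_learning(transitions, gamma=0.9, alpha=0.5, sweeps=200):
    """Tabular Q-learning update applied in systematic sweeps (assumed schedule)."""
    q = {s: {a: 0.0 for a in acts} for s, acts in transitions.items()}
    for _ in range(sweeps):
        for s, acts in transitions.items():
            for a, (reward, nxt) in acts.items():
                # Discounted value of the best action available afterwards.
                future = max(q[nxt].values()) if nxt is not None else 0.0
                target = reward + gamma * future
                q[s][a] += alpha * (target - q[s][a])
    return q


def greedy_action(values):
    """Return the action with the highest value."""
    return max(values, key=values.get)


q = q_learning(TRANSITIONS)
# Choice based only on the immediate reward in state 0.
myopic = greedy_action({a: r for a, (r, _) in TRANSITIONS[0].items()})
# Choice based on the learned long-term values in state 0.
learned = greedy_action(q[0])
print("immediate-reward choice:", myopic, "| learned choice:", learned)
```

In this toy setting the immediate-reward rule and the learned values disagree in state 0, because the learned value of the second action includes the discounted payoff that only becomes available one step later.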
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Dusparic, I., Cahill, V. (2010). Multi-policy Optimization in Self-organizing Systems. In: Weyns, D., Malek, S., de Lemos, R., Andersson, J. (eds) Self-Organizing Architectures. SOAR 2009. Lecture Notes in Computer Science, vol 6090. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-14412-7_6
DOI: https://doi.org/10.1007/978-3-642-14412-7_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-14411-0
Online ISBN: 978-3-642-14412-7
eBook Packages: Computer Science (R0)