Abstract
Decentralized partially observable Markov decision process (DEC-POMDP) is an approach to model multi-robot decision making problems under uncertainty. Since it is NEXP-complete there is no efficient exact algorithm to solve these problems and in spite of the attention it has taken recently, so far only a few approximate solutions that can solve small problems have been proposed. In this study, we offer a novel approximate solution algorithm for DEC-POMDP problems using evolution strategies, and a novel approach to approximately calculate the fitness of the chromosomes which correspond to the expected reward. We also propose a new problem which is a more complex, modified version of the grid meeting problem and solve it. Our results show that our algorithm is scalable and we can solve problems that have more states than the problems attempted in previous studies.
Similar content being viewed by others
References
Akin HL (1994) Evolutionary computation: a natural answer to artificial questions. In: Proceedings of ANNAL: hints from life to artificial intelligence, METU, pp 41-52
Amato C, Bernstein DS, Zilberstein S (2006) Optimal fixed-size controllers for decentralized POMDPs, AAMAS 2006 workshop on multi-agent sequential decision making in uncertain domains, Hakodate, Japan
Back T, Schwefel HP (1993) An overview of evolutionary algorithms for parameter optimization. Evol Comput 1:1–24
Becker E, Zilberstein S, Lesser V, Goldman CV (2004) Solving transition independent decentralized Markov decision processes. J Artif Intell Res 22:423–255
Bernstein D, Zilberstein S, Immerman N (2000) The complexity of decentralized control of markov decision processes. In: Proceedings of the 16th conference on uncertainty in artificial intelligence
Bernstein D, Hansen EA, Zilberstein S (2005) Bounded policy iteration for decentralized POMDPs. In: Proceedings of the nineteenth international joint conference on artificial intelligence (IJCAI), Edinburgh, Scotland
Cassandra AR (1998) A survey of POMDP applications, AAAI fall symposium
Cogill R, Rotkowitz M, Van Roy B, Lall S (2004) An approximate dynamic programming approach to decentralized control of stochastic systems. In: Proceedings of the allerton conference on communication, control, and computing
Cohen PR (1995) Empirical methods for artificial intelligence. MIT Press, Cambridge
Fogel IJ, Owens AJ, Walsh MJ (1966) Artificial intelligence through simulated evolution. Wiley, New York
Goldman CV, Zilberstein S (2004) Decentralized control of cooperative systems: categorization and complexity analysis. J Art Intell Res 22:143–174
Hansen EA, Bernstein DS, Zilberstein S (2004) Dynamic programming for partially observable stochastic games. In: Proceedings of the nineteenth national conference on artificial intelligence (AAAI), San Jose, CA, pp 709-715
Holland J (1975) Adaptation in natural and artificial systems. University of Michigan Press
Ignat DB (1998) Genetic algorithm with punctuated equilibria: analysis of the traveling salesperson problem instance. A thesis in TCC402. School of Engineering and Applied Science University of Virginia
Jimenez F, Sanchez G, Vasant P, Verdegay J (2006) A multi-objective evolutionary approach for fuzzy optimization in production planning. 2006 IEEE international conference on systems, man, and cybernetics, Taipei, Taiwan, pp 3120–3125
Jin Y (2005) A comprehensive survey of fitness approximation in evolutionary computation. Soft Comput 9(1):3–12
Koza JR (1992) Genetic programming: on the programming of computers by means of natural selection. MIT Press, Cambridge. ISBN 0-262-11170-5
LaValle SM (2006) Planning algorithms. Cambridge University Press, Cambridge
Lin AZ, Bean JC, White CC (1999) A hybrid genetic/optimization algorithm for finite horizon partially observed Markov decision processes. In: Proceedings of the congress on evolutionary computation
Lopez E, Barea R, Bergasa LM, Escudero M (2003) Visually augmented POMDP for indoor robot navigation. In: 21th IASTED international multi-conference on applied informatics
Nair R, Pynadath D, Yokoo M, Tambe M, Marsella S (2003) Taming decentralized POMDPs: towards efficient policy computation for multiagent settings. In: Proceedings of the eighteenth international joint conference on artificial intelligence (IJCAI-03)
Pynadath DV, Tambe M (2002) The communicative multiagent team decision problem: analyzing teamwork theories and models. J Art Intell Res
Rabinovich Z, Goldman CV, Rosenschein JS (2003) The complexity of multiagent systems: the price of silence, In: Proceedings of the second international joint conference on autonomous agents and multiagent systems (AAMAS), Melbourne, Australia, pp 1102–1103
Rechenberg I (1973) Evolutionsstrategie: Optimierung technischer Systeme nach Prinzipien der biologischen Evolution. Fromman-Holzboog Verlag, Stuttgart
Sanchez G, Jimenez F, Vasant P (2007) Fuzzy optimization with multi-objective evolutionary algorithms: a case study. In: Proceedings of the 2007 IEEE symposium on computational intelligence in multicriteria decision making (MCDM). Honolulu, Hawaii, USA, pp 58–64
Sendhoff B, Kreutz M, von Seelen W (1997) Causality and the analysis of local search in evolutionary algorithms. Internal report IRINI 97-16, Institut für Neuroinformatik, Ruhr-Universität Bochum, Germany
Seuken S, Zilberstein S (2005) Formal models and algorithms for decentralized control of multiple agents. Technical report UM-CS-2005-068, Computer Science Department, University of Massachusetts
Seuken S, Zilberstein S (2007) Memory-bounded dynamic programming for DEC-POMDPs, IJCAI
Spaan M, Vlassis N (2004) A point-based POMDP algorithm for robot planning. In: Proceedings of the IEEE international conference on robotics and automation. New Orleans, Louisiana, pp 2399–2404
Szer D, Charpillet F, Zilberstein S (2005) MAA*: a heuristic search algorithm for solving decentralized POMDPs. In: Proceedings of the twenty-first conference on uncertainty in artificial intelligence (UAI), Edinburgh, Scotland
Wang FK (2001) Confidence interval for the mean of non-normal data. Qual Reliab Eng Int 17:257–267
Author information
Authors and Affiliations
Corresponding author
Additional information
This work is supported by Boğaziçi University Scientific Research Projects Grant 06HA102 and by TUBITAK with Project Grant 106E172.
Rights and permissions
About this article
Cite this article
Eker, B., Akın, H.L. Using evolution strategies to solve DEC-POMDP problems. Soft Comput 14, 35–47 (2010). https://doi.org/10.1007/s00500-008-0388-7
Published:
Issue Date:
DOI: https://doi.org/10.1007/s00500-008-0388-7