Abstract
For combinatorial search in single-player games, nested Monte-Carlo search is an apparent alternative to algorithms like UCT that are applied in two-player and general games. To trade off exploration against exploitation, the randomized search procedure intensifies the search with increasing recursion depth. If a concise mapping from states to actions is available, the integration of policy learning yields nested rollout policy adaptation (NRPA), while Beam-NRPA keeps a bounded number of solutions in each recursion level. In this paper we propose refinements for Beam-NRPA that improve the runtime and the solution diversity.
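To make the recursion concrete, the following is a minimal sketch of plain NRPA in the style of Rosin's original formulation. It is not the Beam-NRPA variant and does not include the refinements proposed in this paper. The toy domain (a six-city tour starting at city 0), the names `CITIES`, `legal_moves`, `code`, `score`, `rollout`, `adapt` and `nrpa`, and the parameter values `alpha=1.0` and `iterations=20` are all assumptions chosen for illustration. The function `code` plays the role of the concise mapping from states to actions mentioned in the abstract.

```python
import math
import random

# Hypothetical toy domain: visit every city exactly once, starting at city 0.
CITIES = [(0, 0), (3, 0), (3, 4), (0, 4), (1, 2), (5, 1)]


def legal_moves(seq):
    """Cities not yet visited; seq lists the cities visited so far."""
    return [c for c in range(len(CITIES)) if c not in seq]


def code(seq, move):
    """Concise state-to-action key: (current city, next city)."""
    return (seq[-1], move)


def score(seq):
    """Negative length of the closed tour, so that larger is better."""
    tour = seq + [seq[0]]
    return -sum(
        math.hypot(CITIES[a][0] - CITIES[b][0], CITIES[a][1] - CITIES[b][1])
        for a, b in zip(tour, tour[1:])
    )


def rollout(policy, rng):
    """Level 0: sample one complete sequence with softmax weights."""
    seq = [0]
    while True:
        moves = legal_moves(seq)
        if not moves:
            return score(seq), seq
        weights = [math.exp(policy.get(code(seq, m), 0.0)) for m in moves]
        seq.append(rng.choices(moves, weights=weights)[0])


def adapt(policy, seq, alpha=1.0):
    """Return a copy of policy shifted towards the moves of seq."""
    new = dict(policy)
    prefix = [seq[0]]
    for move in seq[1:]:
        moves = legal_moves(prefix)
        z = sum(math.exp(policy.get(code(prefix, m), 0.0)) for m in moves)
        for m in moves:
            k = code(prefix, m)
            new[k] = new.get(k, 0.0) - alpha * math.exp(policy.get(k, 0.0)) / z
        k = code(prefix, move)
        new[k] = new.get(k, 0.0) + alpha
        prefix.append(move)
    return new


def nrpa(level, policy, rng, iterations=20):
    """Each level runs the level below repeatedly and adapts its own policy."""
    if level == 0:
        return rollout(policy, rng)
    best_score, best_seq = -math.inf, None
    for _ in range(iterations):
        result, seq = nrpa(level - 1, policy, rng, iterations)
        if result >= best_score:
            best_score, best_seq = result, seq
        # adapt returns a new dict, so the caller's policy is left untouched
        policy = adapt(policy, best_seq)
    return best_score, best_seq


if __name__ == "__main__":
    best, tour = nrpa(level=2, policy={}, rng=random.Random(0))
    print(tour, round(-best, 3))
```

Each recursion level keeps only its single best sequence here. A beam variant would instead carry a bounded set of sequences, each with its own policy, from one iteration to the next.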
Notes
- 1.
We used one core of an Intel® Core™ i5-2520M CPU @ 2.50 GHz × 4. The computer has 8 GB of RAM, but every invocation of the algorithm on any problem instance used less than 10 MB of main memory. The software infrastructure was as follows. Operating system: Ubuntu 14.04 LTS; Linux kernel: 3.13.0-74-generic; compiler: g++ version 4.8.4; compiler options: -O3 -march=native -funroll-loops -std=c++11 -Wall.
- 2.
- 3.
- 4.
The sequence of cities we found was 73, 22, 72, 54, 24, 80, 12, 0, 65, 71, 71, 20, 32, 70, 0, 92, 37, 98, 91, 16, 86, 85, 97, 13, 0, 83, 45, 61, 84, 5, 60, 89, 0, 94, 96, 99, 6, 0, 50, 33, 30, 51, 9, 67, 1, 0, 14, 44, 38, 43, 100, 95, 0, 27, 69, 76, 79, 68, 0, 52, 7, 11, 19, 49, 48, 82, 0, 28, 29, 78, 34, 35, 3, 77, 0, 62, 88, 8, 46, 17, 93, 59, 0, 36, 47, 18, 0, 39, 23, 67, 55, 4, 25, 26, 0, 63, 64, 90, 10, 31, 0, 87, 57, 2, 58, 0, 40, 53, 0, 42, 15, 41, 75, 56, 74, 21, 0.
Copyright information
© 2016 Springer International Publishing AG
About this paper
Cite this paper
Edelkamp, S., Cazenave, T. (2016). Improved Diversity in Nested Rollout Policy Adaptation. In: Friedrich, G., Helmert, M., Wotawa, F. (eds) KI 2016: Advances in Artificial Intelligence. KI 2016. Lecture Notes in Computer Science, vol 9904. Springer, Cham. https://doi.org/10.1007/978-3-319-46073-4_4
DOI: https://doi.org/10.1007/978-3-319-46073-4_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-46072-7
Online ISBN: 978-3-319-46073-4
eBook Packages: Computer Science, Computer Science (R0)