Abstract
Reasoning and learning for awareness and adaptation are challenging endeavors because cogitation has to be tightly integrated with action execution and reaction to unforeseen contingencies. After discussing the notion of awareness and presenting a classification scheme for awareness mechanisms, we introduce Extended Behavior Trees (XBTs), a novel modeling method for hierarchical, concurrent behaviors that allows the interleaving of reasoning, learning and actions. The semantics of XBTs are defined by a transformation to SCEL, so that sophisticated synchronization strategies are straightforward to realize and different kinds of distributed, hierarchical learning and reasoning (from centrally coordinated to fully autonomic) can easily be expressed. We propose novel hierarchical reinforcement-learning strategies, called Hierarchical (Lenient) Frequency-Adjusted Q-learning, that can be implemented using XBTs. Finally, we discuss how XBTs can be used to define a multi-layer approach to learning, called teacher-student learning, that combines centralized and distributed learning in a seamless way.
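To give a rough idea of the learning rule that the hierarchical strategies build on, the following is a minimal sketch of a flat, single-state Frequency-Adjusted Q-learning update with a Boltzmann policy. It is not the hierarchical or lenient variant proposed in the chapter, and it does not involve XBTs or SCEL; the function names, parameter names and default values are assumptions chosen for illustration only.

```python
import math


def boltzmann_policy(q_values, temperature):
    """Return softmax (Boltzmann) action probabilities for a list of Q-values."""
    highest = max(q_values)  # subtract the maximum for numerical stability
    weights = [math.exp((q - highest) / temperature) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]


def faq_update(q_values, action, reward,
               alpha=0.1, beta=0.01, gamma=0.0, temperature=0.1):
    """Return a new Q-value list after one frequency-adjusted update.

    The learning rate is scaled by min(beta / x_a, 1), where x_a is the
    probability of the chosen action under the current policy, so rarely
    chosen actions receive proportionally larger updates.
    All parameter defaults are illustrative assumptions.
    """
    policy = boltzmann_policy(q_values, temperature)
    frequency = policy[action]
    # Equivalent to min(beta / frequency, 1) but safe when frequency is 0.
    scale = 1.0 if frequency <= beta else beta / frequency
    target = reward + gamma * max(q_values)
    updated = list(q_values)  # leave the caller's list untouched
    updated[action] += scale * alpha * (target - q_values[action])
    return updated
```

With two equally valued actions, each is chosen with probability 0.5, so a call such as `faq_update([0.0, 0.0], 0, 1.0, alpha=0.5, beta=0.25)` scales the learning rate by 0.25 / 0.5 before applying the usual temporal-difference step.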
This research was supported by the European project IP 257414 (ASCENS).
Copyright information
© 2015 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Hölzl, M., Gabor, T. (2015). Reasoning and Learning for Awareness and Adaptation. In: Wirsing, M., Hölzl, M., Koch, N., Mayer, P. (eds) Software Engineering for Collective Autonomic Systems. Lecture Notes in Computer Science, vol 8998. Springer, Cham. https://doi.org/10.1007/978-3-319-16310-9_7
DOI: https://doi.org/10.1007/978-3-319-16310-9_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-16309-3
Online ISBN: 978-3-319-16310-9
eBook Packages: Computer Science (R0)