Reasoning and Learning for Awareness and Adaptation

Hölzl, Matthias; Gabor, Thomas

doi:10.1007/978-3-319-16310-9_7

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 8998)

Abstract

Reasoning and learning for awareness and adaptation are challenging endeavors, since cogitation has to be tightly integrated with action execution and with reaction to unforeseen contingencies. After discussing the notion of awareness and presenting a classification scheme for awareness mechanisms, we introduce Extended Behavior Trees (XBTs), a novel modeling method for hierarchical, concurrent behaviors that allows the interleaving of reasoning, learning and actions. The semantics of XBTs is defined by a transformation to SCEL, so that sophisticated synchronization strategies are straightforward to realize and different kinds of distributed, hierarchical learning and reasoning (from centrally coordinated to fully autonomic) can easily be expressed. We propose novel hierarchical reinforcement-learning strategies, called Hierarchical (Lenient) Frequency-Adjusted Q-learning, that can be implemented using XBTs. Finally, we discuss how XBTs can be used to define a multi-layer approach to learning, called teacher-student learning, that combines centralized and distributed learning in a seamless way.
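
For readers unfamiliar with the base technique named in the abstract, the following is a minimal sketch of a plain (non-hierarchical, non-lenient) frequency-adjusted Q-learning step in a single-state setting. It assumes the update form commonly given in the multi-agent learning literature, in which the learning step is scaled by min(beta / x_a, 1), where x_a is the current probability of choosing action a. It is not the chapter's hierarchical variant and does not involve XBTs; all function names, parameter defaults and the toy bandit are illustrative assumptions.

```python
import math
import random


def boltzmann_policy(q_values, temperature):
    """Return softmax (Boltzmann) action probabilities for the given Q-values."""
    m = max(q_values)  # subtract the maximum for numerical stability
    weights = [math.exp((q - m) / temperature) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]


def faq_update(q_values, action, reward, policy,
               alpha=0.1, beta=0.01, gamma=0.0):
    """Apply one frequency-adjusted Q-learning step in place (hypothetical helper).

    The step size is scaled inversely to the probability of the chosen
    action and capped at 1, so that rarely chosen actions are not updated
    more slowly on average than frequently chosen ones.
    """
    adjustment = min(beta / policy[action], 1.0)
    target = reward + gamma * max(q_values)
    q_values[action] += adjustment * alpha * (target - q_values[action])
    return q_values


def run_bandit(payoffs, steps=5000, temperature=0.1, seed=0):
    """Run the sketch on a toy bandit with fixed payoffs (assumed setup)."""
    rng = random.Random(seed)
    q = [0.0] * len(payoffs)
    for _ in range(steps):
        pi = boltzmann_policy(q, temperature)
        a = rng.choices(range(len(q)), weights=pi)[0]
        faq_update(q, a, payoffs[a], pi)
    return q
```

Calling `run_bandit([0.2, 1.0])` returns the learned Q-values for the two actions; with fixed payoffs and `gamma=0.0`, each value moves monotonically from zero toward its action's payoff.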

This research was supported by the European project IP 257414 (ASCENS).

Author information

Authors and Affiliations

Ludwig-Maximilians-Universität München, Germany
Matthias Hölzl & Thomas Gabor


Editor information

Editors and Affiliations

Institut für Informatik, Ludwig-Maximilians-Universität, Oettingenstraße 67, 80538, München, Germany
Martin Wirsing, Matthias Hölzl, Nora Koch & Philip Mayer

About this chapter

Cite this chapter

Hölzl, M., Gabor, T. (2015). Reasoning and Learning for Awareness and Adaptation. In: Wirsing, M., Hölzl, M., Koch, N., Mayer, P. (eds) Software Engineering for Collective Autonomic Systems. Lecture Notes in Computer Science, vol 8998. Springer, Cham. https://doi.org/10.1007/978-3-319-16310-9_7

DOI: https://doi.org/10.1007/978-3-319-16310-9_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-16309-3
Online ISBN: 978-3-319-16310-9
eBook Packages: Computer Science (R0)
