Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 8998)

Abstract

Reasoning and learning for awareness and adaptation are challenging endeavors, since cogitation has to be tightly integrated with action execution and reaction to unforeseen contingencies. After discussing the notion of awareness and presenting a classification scheme for awareness mechanisms, we introduce Extended Behavior Trees (XBTs), a novel modeling method for hierarchical, concurrent behaviors that allows the interleaving of reasoning, learning and actions. The semantics of XBTs are defined by a transformation to SCEL, so that sophisticated synchronization strategies are straightforward to realize and different kinds of distributed, hierarchical learning and reasoning (from centrally coordinated to fully autonomic) can easily be expressed. We propose novel hierarchical reinforcement-learning strategies, called Hierarchical (Lenient) Frequency-Adjusted Q-learning, that can be implemented using XBTs. Finally, we discuss how XBTs can be used to define a multi-layer approach to learning, called teacher-student learning, that combines centralized and distributed learning in a seamless way.
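
As background for the learning strategies named above, the following is a minimal sketch of a single flat Frequency-Adjusted Q-learning step in the style of Kaisers and Tuyls (2010), on which the hierarchical variants build. It is not the chapter's hierarchical or lenient algorithm, and it does not involve XBTs or SCEL. The function names `boltzmann_policy` and `faq_update` and all parameter names are assumptions chosen for illustration.

```python
import math


def boltzmann_policy(q_values, temperature):
    """Return Boltzmann (softmax) action probabilities for a list of Q-values."""
    # Subtract the maximum before exponentiating for numerical stability.
    q_max = max(q_values)
    weights = [math.exp((q - q_max) / temperature) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]


def faq_update(q_values, action, reward, next_q_max,
               alpha, beta, gamma, temperature):
    """Apply one frequency-adjusted Q-learning step in place.

    The learning rate alpha is scaled by min(beta / p, 1), where p is the
    probability of selecting `action` under the current Boltzmann policy.
    Returns the updated Q-value of `action`.
    """
    probs = boltzmann_policy(q_values, temperature)
    # Rarely selected actions receive a larger step, capped at alpha.
    step = min(beta / probs[action], 1.0) * alpha
    td_error = reward + gamma * next_q_max - q_values[action]
    q_values[action] += step * td_error
    return q_values[action]
```

Under these assumptions, an agent would call `faq_update` after each observed reward. Scaling the step by the inverse selection probability is meant to compensate for actions that are updated less often simply because they are chosen less often.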

This research was supported by the European project IP 257414 (ASCENS).

Copyright information

© 2015 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Hölzl, M., Gabor, T. (2015). Reasoning and Learning for Awareness and Adaptation. In: Wirsing, M., Hölzl, M., Koch, N., Mayer, P. (eds) Software Engineering for Collective Autonomic Systems. Lecture Notes in Computer Science, vol 8998. Springer, Cham. https://doi.org/10.1007/978-3-319-16310-9_7

  • DOI: https://doi.org/10.1007/978-3-319-16310-9_7

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-16309-3

  • Online ISBN: 978-3-319-16310-9

  • eBook Packages: Computer Science (R0)
