Abstract
Companion systems are composed of different modules that have to share a single, sound estimate of the current situation. While the long-term decision making of automated planning requires knowledge about the user's goals, short-term decisions, such as choosing among modes of user interaction, depend on properties like lighting conditions. In addition to the diverse scopes of the involved models, a large portion of the information required within such a system cannot be observed directly but has to be inferred from background knowledge and sensory data, sometimes via a cascade of abstraction layers, and often with uncertain predictions as the result. In this contribution, we interpret an existing cognitive technical system under the assumption that it solves a factored, partially observable Markov decision process (POMDP). Our interpretation draws heavily on the concepts of probabilistic graphical models and hierarchical reinforcement learning, and fosters a view that cleanly separates inference from decision making. The results are discussed and compared to those of existing approaches from other application domains.
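The separation between inference and decision making that the abstract advocates can be illustrated with a toy example. The following sketch implements a Bayes-filter belief update for a two-state POMDP; the transition matrix `T`, observation matrix `O`, and all numbers are invented for illustration and are not taken from the chapter. The belief update (inference) is a standalone function; choosing an action based on the resulting belief (decision making) would be a separate step.

```python
import numpy as np

# Toy two-state POMDP (illustrative values only, not from the chapter).
T = np.array([[0.9, 0.1],   # T[s, s'] = P(s' | s)
              [0.2, 0.8]])
O = np.array([[0.8, 0.2],   # O[s', o] = P(o | s')
              [0.3, 0.7]])

def belief_update(b, o):
    """One Bayes-filter step: predict with T, then correct with observation o."""
    predicted = b @ T                # prediction: sum_s b(s) * P(s' | s)
    corrected = predicted * O[:, o]  # correction: weight by P(o | s')
    return corrected / corrected.sum()

b = np.array([0.5, 0.5])       # uniform prior belief
b = belief_update(b, o=0)      # observe o = 0; belief shifts toward state 0
```

Any decision rule, from a greedy policy to a full POMDP planner, would then consume `b` without needing to know how it was computed, which is exactly the clean separation the chapter argues for.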
Notes
1. \(\mathcal{P}(X)\) denotes the set of all probability mass functions (or density functions) over the set X.
2. This is also called model-free reinforcement learning, as no model is given in advance; any required model has to be learned together with a policy.
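The model-free setting mentioned in Note 2 can be sketched with tabular Q-learning on a tiny chain environment (all names and numbers below are invented for illustration, not taken from the chapter). The agent never represents transition probabilities explicitly; it updates action values directly from sampled transitions.

```python
import random

random.seed(0)
n_states, actions = 3, [0, 1]   # action 0 = left, 1 = right on a small chain
Q = {(s, a): 0.0 for s in range(n_states) for a in actions}

def step(s, a):
    """Environment: reaching the last state from its left neighbor yields reward 1."""
    if a == 1:
        return min(s + 1, n_states - 1), (1.0 if s == n_states - 2 else 0.0)
    return max(s - 1, 0), 0.0

alpha, gamma, eps = 0.5, 0.9, 0.1
for _ in range(500):            # episodes
    s = 0
    for _ in range(10):         # steps per episode
        # epsilon-greedy action selection
        if random.random() < eps:
            a = random.choice(actions)
        else:
            a = max(actions, key=lambda act: Q[(s, act)])
        s2, r = step(s, a)
        # Q-learning update: bootstrap from the best value of the successor state
        Q[(s, a)] += alpha * (r + gamma * max(Q[(s2, a2)] for a2 in actions) - Q[(s, a)])
        s = s2
```

After training, the learned values favor moving right toward the rewarding transition, even though the agent was never given the transition function in explicit form.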
Acknowledgements
This work was done within the Transregional Collaborative Research Centre SFB/TRR 62 “Companion-Technology for Cognitive Technical Systems” funded by the German Research Foundation (DFG).
Copyright information
© 2017 Springer International Publishing AG
About this chapter
Cite this chapter
Geier, T., Biundo, S. (2017). Multi-level Knowledge Processing in Cognitive Technical Systems. In: Biundo, S., Wendemuth, A. (eds) Companion Technology. Cognitive Technologies. Springer, Cham. https://doi.org/10.1007/978-3-319-43665-4_2
DOI: https://doi.org/10.1007/978-3-319-43665-4_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-43664-7
Online ISBN: 978-3-319-43665-4