Explaining the Influence of Prior Knowledge on POMCP Policies

Castellini, Alberto; Marchesini, Enrico; Mazzi, Giulio; Farinelli, Alessandro

doi:10.1007/978-3-030-66412-1_17

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12520)

Abstract

Partially Observable Monte Carlo Planning (POMCP) is a recently proposed online planning algorithm that uses Monte Carlo Tree Search to solve Partially Observable Markov Decision Processes (POMDPs). The solver owes its success to its ability to scale to large uncertain environments, an important property for current real-world planning problems. In this work we make three main contributions related to POMCP usage and interpretability. First, we introduce a new planning problem concerning mobile robot collision avoidance in paths with uncertain segment difficulties, and we show how POMCP performance in this context can take advantage of prior knowledge about segment difficulty relationships. This problem has direct real-world applications, such as safety management in industrial environments where human-robot interaction is a crucial issue. Second, we present an experimental analysis of the relationship between the prior knowledge provided to the algorithm and the resulting performance improvement, showing that in our case study prior knowledge affects two main properties, namely the distance between the belief and the real state, and the mutual information between segment difficulty and the action taken in the segment. This analysis aims to improve POMCP explainability, in line with recently proposed work on eXplainable AI and, in particular, eXplainable planning. Finally, we analyze results on a synthetic case study and show how the proposed measures can improve the understanding of internal planning mechanisms.
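
To make the second measure named in the abstract more concrete, the following is a minimal sketch of how mutual information between a discrete segment difficulty and the action taken in that segment could be estimated from logged pairs. It uses the standard plug-in estimator over empirical frequencies; the function name `mutual_information`, the difficulty labels, and the action labels are illustrative assumptions, and the abstract does not say which estimator the chapter itself uses.

```python
from collections import Counter
from math import log2


def mutual_information(pairs):
    """Plug-in estimate of I(X; Y) in bits from a list of (x, y) samples.

    Here x is assumed to be a segment difficulty and y the action taken in
    that segment; both are treated as discrete labels (an assumption).
    """
    n = len(pairs)
    if n == 0:
        raise ValueError("need at least one (difficulty, action) pair")

    joint = Counter(pairs)                  # counts of each (x, y)
    count_x = Counter(x for x, _ in pairs)  # marginal counts of x
    count_y = Counter(y for _, y in pairs)  # marginal counts of y

    mi = 0.0
    for (x, y), c in joint.items():
        # p(x,y) * log2( p(x,y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * log2(c * n / (count_x[x] * count_y[y]))
    return mi


# Hypothetical logs: the action fully determined by difficulty, and
# the action chosen independently of difficulty.
dependent = [("low", "fast"), ("high", "slow")] * 10
independent = [("low", "fast"), ("low", "slow"),
               ("high", "fast"), ("high", "slow")] * 5

print(mutual_information(dependent))
print(mutual_information(independent))
```

Under this estimator, a policy whose action is fully determined by a binary, equally likely difficulty yields one bit, while a policy that ignores difficulty yields zero, which is the sense in which the measure can indicate how strongly a policy adapts to segment difficulty.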

Acknowledgments

This research has been partially supported by the projects “Dipartimenti di Eccellenza 2018–2022”, funded by the Italian Ministry of Education, Universities and Research (MIUR), and “GHOTEM/CORE-WOOD, POR-FESR 2014–2020”, funded by Regione del Veneto.

Author information

Authors and Affiliations

Department of Computer Science, Verona University, Strada Le Grazie 15, 37134, Verona, Italy
Alberto Castellini, Enrico Marchesini, Giulio Mazzi & Alessandro Farinelli

Corresponding author

Correspondence to Alberto Castellini.

Editor information

Editors and Affiliations

Aristotle University of Thessaloniki, Thessaloniki, Greece
Nick Bassiliades
Technical University of Crete, Chania, Greece
Georgios Chalkiadakis
IIIA-CSIC, Bellaterra, Spain
Dave de Jonge

About this paper

Cite this paper

Castellini, A., Marchesini, E., Mazzi, G., Farinelli, A. (2020). Explaining the Influence of Prior Knowledge on POMCP Policies. In: Bassiliades, N., Chalkiadakis, G., de Jonge, D. (eds) Multi-Agent Systems and Agreement Technologies. EUMAS AT 2020. Lecture Notes in Computer Science (LNAI), vol 12520. Springer, Cham. https://doi.org/10.1007/978-3-030-66412-1_17

DOI: https://doi.org/10.1007/978-3-030-66412-1_17
Published: 05 January 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-66411-4
Online ISBN: 978-3-030-66412-1
eBook Packages: Computer Science, Computer Science (R0)