Explaining the Influence of Prior Knowledge on POMCP Policies

  • Conference paper
  • In: Multi-Agent Systems and Agreement Technologies (EUMAS 2020, AT 2020)

Abstract

Partially Observable Monte Carlo Planning (POMCP) is a recently proposed online planning algorithm that uses Monte Carlo Tree Search to solve Partially Observable Markov Decision Processes (POMDPs). The solver owes its success to its ability to scale to large uncertain environments, an important property for current real-world planning problems. In this work we make three contributions related to POMCP usage and interpretability. First, we introduce a new planning problem concerning mobile robot collision avoidance on paths with uncertain segment difficulties, and we show how POMCP performance in this context can benefit from prior knowledge about relationships between segment difficulties. This problem has direct real-world applications, such as safety management in industrial environments where human-robot interaction is a crucial issue. Second, we present an experimental analysis of the relationship between the prior knowledge provided to the algorithm and the resulting performance improvement, showing that in our case study prior knowledge affects two main properties: the distance between the belief and the real state, and the mutual information between segment difficulty and the action taken in the segment. This analysis aims to improve POMCP explainability, in line with recent work on eXplainable AI and, in particular, eXplainable planning. Finally, we analyze results on a synthetic case study and show how the proposed measures can improve the understanding of internal planning mechanisms.
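
The two properties named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the representation of a belief as a dictionary from states to probabilities, and the choice of an expected per-segment absolute difference as the distance are all assumptions made for illustration. The mutual information is the standard empirical estimate for two discrete variables, computed from observed (difficulty, action) pairs.

```python
import math
from collections import Counter


def mutual_information(pairs):
    """Empirical mutual information (in bits) between two discrete variables.

    `pairs` is a list of (difficulty, action) observations; both names are
    hypothetical labels for the quantities mentioned in the abstract.
    """
    n = len(pairs)
    joint = Counter(pairs)
    p_x = Counter(x for x, _ in pairs)
    p_y = Counter(y for _, y in pairs)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (p_x[x] * p_y[y]))
    return mi


def belief_distance(belief, true_state):
    """Expected distance between a belief and the true state.

    Assumption: a state is a tuple of per-segment difficulty levels and the
    distance is the sum of absolute per-segment differences, weighted by the
    belief probability of each state. The paper may use a different measure.
    """
    return sum(
        prob * sum(abs(b - t) for b, t in zip(state, true_state))
        for state, prob in belief.items()
    )


if __name__ == "__main__":
    # Action fully determined by difficulty: 1 bit of mutual information.
    dependent = [(0, "fast"), (1, "slow")] * 50
    print(mutual_information(dependent))

    # Belief split evenly between the true state and a state one level off.
    belief = {(0, 0): 0.5, (0, 1): 0.5}
    print(belief_distance(belief, (0, 1)))
```

Under these assumptions, a higher mutual information means the chosen action tracks segment difficulty more closely, and a smaller belief distance means the belief is concentrated nearer the true state.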

Acknowledgments

The research has been partially supported by the projects “Dipartimenti di Eccellenza 2018–2022”, funded by the Italian Ministry of Education, Universities and Research (MIUR), and “GHOTEM/CORE-WOOD, POR-FESR 2014–2020”, funded by Regione del Veneto.

Author information

Corresponding author

Correspondence to Alberto Castellini.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Castellini, A., Marchesini, E., Mazzi, G., Farinelli, A. (2020). Explaining the Influence of Prior Knowledge on POMCP Policies. In: Bassiliades, N., Chalkiadakis, G., de Jonge, D. (eds) Multi-Agent Systems and Agreement Technologies. EUMAS 2020, AT 2020. Lecture Notes in Computer Science, vol 12520. Springer, Cham. https://doi.org/10.1007/978-3-030-66412-1_17

  • DOI: https://doi.org/10.1007/978-3-030-66412-1_17

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-66411-4

  • Online ISBN: 978-3-030-66412-1

  • eBook Packages: Computer Science, Computer Science (R0)
