Abstract
In recent years, advances in deep learning have resulted in a plethora of successes in the use of reinforcement learning (RL) to solve complex sequential decision tasks with high-dimensional inputs. However, existing RL-based systems are essentially competency-unaware: they lack the interpretation mechanisms that would give human operators an insightful, holistic view of their competence. This is an impediment to their adoption, particularly in critical applications where an agent's decisions can have significant consequences. Towards more explainable deep RL (xDRL), we propose a new framework based on analyses of interestingness. Our tool provides various measures of RL agent competence stemming from interestingness analysis and is applicable to a wide range of RL algorithms, natively supporting the popular RLlib toolkit. We showcase the use of our framework by applying the proposed pipeline in a set of scenarios of varying complexity. We empirically assess the capability of the approach in identifying agent behavior patterns and competency-controlling conditions, as well as the task elements mostly responsible for an agent's competence, based on global and local analyses of interestingness. Overall, we show that our framework can provide agent designers with insights about RL agent competence, both their capabilities and limitations, enabling more informed decisions about interventions, additional training, and other interactions in collaborative human-machine settings.
Notes
- 1.
The IxDRL toolkit code is available at: https://github.com/SRI-AIC/ixdrl.
- 2.
Without loss of generality, here we deal with episodic tasks.
- 3.
This corresponds to using the backward finite difference coefficients with accuracy of order 2 [10]; a minimal sketch of this stencil appears after these notes. A higher-order accuracy could be used if we wish to better capture how the value function is changing for the computation of Goal Conduciveness, by using information from timesteps further back in the trace.
- 4.
This quantity is also known as the one-step TD or TD(0) target.
- 5.
Our framework also computes stochasticity from models parameterizing continuous distributions, using an appropriate coefficient of variation in place of Leik's D (an illustrative sketch appears after these notes).
- 6.
Our implementation also computes familiarity from an ensemble of predictive models that parameterize distributions instead of outputting point predictions, in which case we use divergence measures between the predicted distributions in place of d (an illustrative sketch appears after these notes).
- 7.
All configurations used to train the RL agents, as well as the data for each scenario, are available at: https://github.com/SRI-AIC/23-xai-ixdrl-data.
- 8.
- 9.
- 10.
We used the implementation at: https://github.com/JannerM/mbpo.
- 11.
A more detailed description of our SC2 task is provided in [40].
- 12.
For traces of similar length, alternative methods such as Dynamic Time Warping (DTW) [37] could be used to align the traces and compute the distances between them (a minimal DTW sketch appears after these notes).
- 13.
Because these dimensions rely on information from multiple timesteps, a more robust model, making use of past information, is likely required to provide good predictions.
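The snippet below is a minimal sketch, assuming NumPy, of the accuracy-2 backward finite-difference stencil referenced in Note 3 (coefficients [1/2, −2, 3/2] over the last three value estimates [10]); the function name and the fallback behavior for short traces are illustrative and not part of the IxDRL API.

```python
import numpy as np

def value_derivative(values: np.ndarray) -> float:
    """Approximate dV/dt at the last timestep of a trace using the
    backward finite-difference stencil of accuracy 2 [10], i.e.,
    coefficients [1/2, -2, 3/2] applied to V[t-2], V[t-1], V[t]."""
    if len(values) < 3:
        # fall back to a first-order backward difference early in the trace
        return float(values[-1] - values[-2]) if len(values) >= 2 else 0.0
    return float(0.5 * values[-3] - 2.0 * values[-2] + 1.5 * values[-1])

# a positive result indicates the value estimate is increasing (goal-conducive)
print(value_derivative(np.array([0.1, 0.3, 0.8])))
```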
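For Note 5, the following sketch illustrates one way to score stochasticity for a policy that parameterizes a continuous (here, diagonal Gaussian) action distribution via a coefficient of variation; the exact coefficient and normalization used by the toolkit may differ, and the function name is hypothetical.

```python
import numpy as np

def gaussian_stochasticity(mean: np.ndarray, std: np.ndarray) -> float:
    """Coefficient of variation (std / |mean|) averaged over action
    dimensions, as a rough continuous analogue of Leik's D."""
    eps = 1e-8  # avoid division by zero for near-zero means
    return float(np.mean(std / (np.abs(mean) + eps)))

print(gaussian_stochasticity(np.array([0.5, -1.0]), np.array([0.1, 0.5])))
```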
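For Note 6, the sketch below shows one plausible divergence-based disagreement measure for an ensemble whose members predict diagonal-Gaussian distributions (mean symmetric KL over all member pairs); the helper names are hypothetical and only illustrate the idea of replacing the point-prediction distance d with a divergence between predicted distributions.

```python
import itertools
import numpy as np

def kl_diag_gaussians(mu_p, std_p, mu_q, std_q):
    """KL divergence KL(p || q) between two diagonal Gaussians."""
    var_p, var_q = std_p ** 2, std_q ** 2
    return float(np.sum(np.log(std_q / std_p)
                        + (var_p + (mu_p - mu_q) ** 2) / (2.0 * var_q)
                        - 0.5))

def ensemble_disagreement(means, stds):
    """Mean symmetric KL over all pairs of ensemble members' predicted
    distributions; higher disagreement suggests lower familiarity."""
    divs = [0.5 * (kl_diag_gaussians(means[i], stds[i], means[j], stds[j])
                   + kl_diag_gaussians(means[j], stds[j], means[i], stds[i]))
            for i, j in itertools.combinations(range(len(means)), 2)]
    return float(np.mean(divs))
```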
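Finally, for Note 12, a minimal (quadratic-time) Dynamic Time Warping distance between two traces of per-timestep feature vectors is sketched below; it illustrates the alternative mentioned in that note, not the trace-distance computation used in our experiments.

```python
import numpy as np

def dtw_distance(trace_a: np.ndarray, trace_b: np.ndarray) -> float:
    """Plain O(n*m) DTW distance between two traces, each an array of
    per-timestep feature vectors (e.g., interestingness profiles)."""
    n, m = len(trace_a), len(trace_b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(trace_a[i - 1] - trace_b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return float(cost[n, m])
```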
References
Amir, O., Doshi-Velez, F., Sarne, D.: Summarizing agent strategies. Auton. Agent. Multi-Agent Syst. 33(5), 628–644 (2019). https://doi.org/10.1007/s10458-019-09418-w
Bellemare, M.G., Naddaf, Y., Veness, J., Bowling, M.: The arcade learning environment: an evaluation platform for general agents. J. Artif. Intell. Res. 47, 253–279 (2013). https://doi.org/10.1613/jair.3912
Bellemare, M.G., Dabney, W., Munos, R.: A distributional perspective on reinforcement learning. In: Precup, D., Teh, Y.W. (eds.) Proceedings of the 34th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 70, pp. 449–458. PMLR (2017). https://proceedings.mlr.press/v70/bellemare17a.html
Blizzard Entertainment: StarCraft II official game site (2022). https://starcraft2.com. Accessed 23 Aug 2022
Chen, T., Guestrin, C.: XGBoost: a scalable tree boosting system. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 785–794. KDD 2016, Association for Computing Machinery, New York, NY, USA (2016). https://doi.org/10.1145/2939672.2939785
Chua, K., Calandra, R., McAllister, R., Levine, S.: Deep reinforcement learning in a handful of trials using probabilistic dynamics models. In: Bengio, S., Wallach, H., Larochelle, H., Grauman, K., Cesa-Bianchi, N., Garnett, R. (eds.) Advances in Neural Information Processing Systems, vol. 31. Curran Associates, Inc. (2018). https://proceedings.neurips.cc/paper/2018/file/3de568f8597b94bda53149c7d7f5958c-Paper.pdf
Dereszynski, E., Hostetler, J., Fern, A., Dietterich, T., Hoang, T.T., Udarbe, M.: Learning probabilistic behavior models in real-time strategy games. In: Seventh Artificial Intelligence and Interactive Digital Entertainment Conference (2011)
Espeholt, L., Marinier, R., Stanczyk, P., Wang, K., Michalski, M.: SEED RL: scalable and efficient deep-RL with accelerated central inference. In: International Conference on Learning Representations (2020). https://openreview.net/forum?id=rkgvXlrKwH
Espeholt, L., et al.: IMPALA: scalable distributed deep-RL with importance weighted actor-learner architectures. In: International Conference on Machine Learning, pp. 1407–1416. PMLR (2018)
Fornberg, B.: Generation of finite difference formulas on arbitrarily spaced grids. Math. Comput. (1988). https://doi.org/10.1090/S0025-5718-1988-0935077-0
Greydanus, S., Koul, A., Dodge, J., Fern, A.: Visualizing and understanding Atari agents. In: Proceedings of the 35th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 80, pp. 1792–1801. PMLR, Stockholmsmässan, Stockholm Sweden (2018)
Haarnoja, T., Zhou, A., Abbeel, P., Levine, S.: Soft actor-critic: off-policy maximum entropy deep reinforcement learning with a stochastic actor. In: Dy, J., Krause, A. (eds.) Proceedings of the 35th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 80, pp. 1861–1870. PMLR (2018). https://proceedings.mlr.press/v80/haarnoja18b.html
Hayes, B., Shah, J.A.: Improving robot controller transparency through autonomous policy explanation. In: 2017 12th ACM/IEEE International Conference on Human-Robot Interaction (HRI), pp. 303–312 (2017)
Heuillet, A., Couthouis, F., Díaz-Rodríguez, N.: Explainability in deep reinforcement learning. Knowl. Based Syst. 214, 106685 (2021). https://doi.org/10.1016/j.knosys.2020.106685
Hoffman, R.R., Mueller, S.T., Klein, G., Litman, J.: Metrics for explainable AI: challenges and prospects (2018). https://doi.org/10.48550/ARXIV.1812.04608
Hostetler, J., Dereszynski, E., Dietterich, T., Fern, A.: Inferring strategies from limited reconnaissance in real-time strategy games. In: Conference on Uncertainty in Artificial Intelligence (UAI) (2012)
Huang, S.H., Bhatia, K., Abbeel, P., Dragan, A.D.: Establishing appropriate trust via critical states. In: 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 3929–3936 (2018). https://doi.org/10.1109/IROS.2018.8593649
Huang, S.H., Held, D., Abbeel, P., Dragan, A.D.: Enabling robots to communicate their objectives. Auton. Robot. 43(2), 309–326 (2019). https://doi.org/10.1007/s10514-018-9771-0
Janner, M., Fu, J., Zhang, M., Levine, S.: When to trust your model: model-based policy optimization. In: Wallach, H., Larochelle, H., Beygelzimer, A., d’ Alché-Buc, F., Fox, E., Garnett, R. (eds.) Advances in Neural Information Processing Systems, vol. 32. Curran Associates, Inc. (2019). https://proceedings.neurips.cc/paper/2019/file/5faf461eff3099671ad63c6f3f094f7f-Paper.pdf
Kaufman, L., Rousseeuw, P.J.: Agglomerative nesting (program agnes). In: Finding Groups in Data: An Introduction to Cluster Analysis, pp. 199–252. Wiley (1990)
Kostal, L., Marsalek, P.: Neuronal jitter: can we measure the spike timing dispersion differently? Chin. J. Physiol. 53(6), 454–464 (2010). https://doi.org/10.4077/cjp.2010.amm031
Koul, A., Fern, A., Greydanus, S.: Learning finite state representations of recurrent policy networks. In: International Conference on Learning Representations. ICLR 2019 (2019). https://openreview.net/forum?id=S1gOpsCctm
Lage, I., Lifschitz, D., Doshi-Velez, F., Amir, O.: Exploring computational user models for agent policy summarization. In: Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, pp. 1401–1407. International Joint Conferences on Artificial Intelligence Organization, California (2019). https://doi.org/10.24963/ijcai.2019/194
Leik, R.K.: A measure of ordinal consensus. Pac. Sociol. Rev. 9(2), 85–90 (1966). https://doi.org/10.2307/1388242
Liang, E., et al.: RLlib: abstractions for distributed reinforcement learning. In: Dy, J., Krause, A. (eds.) Proceedings of the 35th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 80, pp. 3053–3062. PMLR (2018). https://proceedings.mlr.press/v80/liang18b.html
Lundberg, S.M., Lee, S.I.: A unified approach to interpreting model predictions. In: Guyon, I., Luxburg, U.V., Bengio, S., Wallach, H., Fergus, R., Vishwanathan, S., Garnett, R. (eds.) Advances in Neural Information Processing Systems, vol. 30. Curran Associates, Inc. (2017). https://proceedings.neurips.cc/paper/2017/file/8a20a8621978632d76c43dfd28b67767-Paper.pdf
Madumal, P., Miller, T., Sonenberg, L., Vetere, F.: Explainable reinforcement learning through a causal lens. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 34, no. 03, pp. 2493–2500 (2020). https://doi.org/10.1609/aaai.v34i03.5631
Mnih, V., et al.: Human-level control through deep reinforcement learning. Nature 518(7540), 529–533 (2015). https://doi.org/10.1038/nature14236
Naeem, M., Rizvi, S.T.H., Coronato, A.: A gentle introduction to reinforcement learning and its application in different fields. IEEE Access 8, 209320–209344 (2020). https://doi.org/10.1109/ACCESS.2020.3038605
Olson, M.L., Khanna, R., Neal, L., Li, F., Wong, W.K.: Counterfactual state explanations for reinforcement learning agents via generative deep learning. Artif. Intell. 295, 103455 (2021). https://doi.org/10.1016/j.artint.2021.103455
Berner, C., et al.: Dota 2 with Large Scale Deep Reinforcement Learning (2019). https://doi.org/10.48550/arXiv.1912.06680
Pathak, D., Gandhi, D., Gupta, A.: Self-supervised exploration via disagreement. In: Chaudhuri, K., Salakhutdinov, R. (eds.) Proceedings of the 36th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 97, pp. 5062–5071. PMLR (2019). https://proceedings.mlr.press/v97/pathak19a.html
Pielou, E.: The measurement of diversity in different types of biological collections. J. Theor. Biol. 13, 131–144 (1966). https://doi.org/10.1016/0022-5193(66)90013-0
Puiutta, E., Veith, E.M.S.P.: Explainable reinforcement learning: a survey. In: Holzinger, A., Kieseberg, P., Tjoa, A.M., Weippl, E. (eds.) CD-MAKE 2020. LNCS, vol. 12279, pp. 77–95. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-57321-8_5
Puterman, M.L.: Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1st edn. Wiley, New York (1994)
Rousseeuw, P.J.: Silhouettes: a graphical aid to the interpretation and validation of cluster analysis. J. Comput. Appl. Math. 20, 53–65 (1987). https://doi.org/10.1016/0377-0427(87)90125-7
Salvador, S., Chan, P.: Toward accurate dynamic time warping in linear time and space. Intell. Data Anal. 11(5), 561–580 (2007). https://doi.org/10.3233/IDA-2007-11508
Schaul, T., Quan, J., Antonoglou, I., Silver, D.: Prioritized experience replay (2015). https://doi.org/10.48550/arxiv.1511.05952
Schulman, J., Wolski, F., Dhariwal, P., Radford, A., Klimov, O.: Proximal policy optimization algorithms (2017). https://doi.org/10.48550/arxiv.1707.06347
Sequeira, P., Elenius, D., Hostetler, J., Gervasio, M.: A framework for understanding and visualizing strategies of RL agents (2022). https://doi.org/10.48550/arxiv.2208.08552
Sequeira, P., Gervasio, M.: Interestingness elements for explainable reinforcement learning: understanding agents’ capabilities and limitations. Artif. Intell. 288, 103367 (2020). https://doi.org/10.1016/j.artint.2020.103367
Sequeira, P., Yeh, E., Gervasio, M.: Interestingness elements for explainable reinforcement learning through introspection. In: Joint Proceedings of the ACM IUI 2019 Workshops, p. 7. ACM (2019)
Shyam, P., Jaśkowski, W., Gomez, F.: Model-based active exploration. In: Chaudhuri, K., Salakhutdinov, R. (eds.) Proceedings of the 36th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 97, pp. 5779–5788. PMLR (2019). https://proceedings.mlr.press/v97/shyam19a.html
Silver, D., et al.: A general reinforcement learning algorithm that masters chess, shogi, and go through self-play. Science 362(6419), 1140–1144 (2018). https://doi.org/10.1126/science.aar6404
Sutton, R.S., Barto, A.G.: Reinforcement Learning: An Introduction. Adaptive Computation and Machine Learning, 2nd edn. MIT Press, Cambridge (2018)
Todorov, E., Erez, T., Tassa, Y.: Mujoco: a physics engine for model-based control. In: 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 5026–5033. IEEE (2012). https://doi.org/10.1109/IROS.2012.6386109
Vinyals, O., et al.: Grandmaster level in StarCraft II using multi-agent reinforcement learning. Nature 575(7782), 350–354 (2019). https://doi.org/10.1038/s41586-019-1724-z
Vinyals, O., et al.: StarCraft II: a new challenge for reinforcement learning. arXiv preprint arXiv:1708.04782 (2017)
van der Waa, J., van Diggelen, J., Bosch, K.V.D., Neerincx, M.: Contrastive explanations for reinforcement learning in terms of expected consequences. In: IJCAI Workshop on Explainable AI (2018). https://doi.org/10.48550/arxiv.1807.08706
Yeh, E., Sequeira, P., Hostetler, J., Gervasio, M.: Outcome-guided counterfactuals for reinforcement learning agents from a jointly trained generative latent space (2022). https://doi.org/10.48550/arxiv.2207.07710
Zahavy, T., Ben-Zrihem, N., Mannor, S.: Graying the black box: understanding DQNs. In: Proceedings of The 33rd International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 48, pp. 1899–1908. PMLR, New York, New York, USA (2016)
Acknowledgements
This material is based upon work supported by the Defense Advanced Research Projects Agency (DARPA) under Contract No. HR001119C0112. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of DARPA.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Sequeira, P., Gervasio, M. (2023). IxDRL: A Novel Explainable Deep Reinforcement Learning Toolkit Based on Analyses of Interestingness. In: Longo, L. (eds) Explainable Artificial Intelligence. xAI 2023. Communications in Computer and Information Science, vol 1901. Springer, Cham. https://doi.org/10.1007/978-3-031-44064-9_20
DOI: https://doi.org/10.1007/978-3-031-44064-9_20
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-44063-2
Online ISBN: 978-3-031-44064-9