Abstract
In recent years, advances in deep learning have resulted in a plethora of successes in the use of reinforcement learning (RL) to solve complex sequential decision tasks with high-dimensional inputs. However, existing RL-based systems are essentially competency-unaware: they lack the interpretation mechanisms that would give human operators an insightful, holistic view of their competence. This is an impediment to their adoption, particularly in critical applications where an agent's decisions can have significant consequences. Towards more explainable deep RL (xDRL), we propose a new framework based on analyses of interestingness. Our tool provides various measures of RL agent competence stemming from interestingness analysis and is applicable to a wide range of RL algorithms, natively supporting the popular RLlib toolkit. We showcase the use of our framework by applying the proposed pipeline in a set of scenarios of varying complexity. We empirically assess the capability of the approach in identifying agent behavior patterns and competency-controlling conditions, as well as the task elements mostly responsible for an agent's competence, based on global and local analyses of interestingness. Overall, we show that our framework can provide agent designers with insights about RL agent competence, both their capabilities and limitations, enabling more informed decisions about interventions, additional training, and other interactions in collaborative human-machine settings.
Notes
- 1.
The IxDRL toolkit code is available at: https://github.com/SRI-AIC/ixdrl.
- 2.
Without loss of generality, here we deal with episodic tasks.
- 3.
This corresponds to using the backward finite difference coefficients with accuracy of order 2 [10]; a minimal sketch of this stencil appears after these notes. A higher-order accuracy could be used if we wish to better capture how the value function is changing for the computation of Goal Conduciveness, by using information from timesteps further back in the trace.
- 4.
This quantity is also known as the one-step TD or TD(0) target.
- 5.
Our framework also computes stochasticity from models parameterizing continuous distributions, using an appropriate coefficient of variation in place of Leik's D (an illustrative sketch appears after these notes).
- 6.
Our implementation also computes familiarity from an ensemble of predictive models that parameterize distributions instead of outputting point predictions, in which case we use divergence measures between the predicted distributions in place of d (an illustrative sketch appears after these notes).
- 7.
All configurations used to train the RL agents, as well as the data for each scenario, are available at: https://github.com/SRI-AIC/23-xai-ixdrl-data.
- 8.
- 9.
- 10.
We used the implementation at: https://github.com/JannerM/mbpo.
- 11.
A more detailed description of our SC2 task is provided in [40].
- 12.
For traces of similar length, alternative methods such as Dynamic Time Warping (DTW) [37] could be used to align the traces and compute the distances between them (a minimal DTW sketch appears after these notes).
- 13.
Because these dimensions rely on information from multiple timesteps, a more robust model, making use of past information, is likely required to provide good predictions.
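The snippet below is a minimal sketch, assuming NumPy, of the accuracy-2 backward finite-difference stencil referenced in Note 3 (coefficients [1/2, −2, 3/2] over the last three value estimates [10]); the function name and the fallback behavior for short traces are illustrative and not part of the IxDRL API.

```python
import numpy as np

def value_derivative(values: np.ndarray) -> float:
    """Approximate dV/dt at the last timestep of a trace using the
    backward finite-difference stencil of accuracy 2 [10], i.e.,
    coefficients [1/2, -2, 3/2] applied to V[t-2], V[t-1], V[t]."""
    if len(values) < 3:
        # fall back to a first-order backward difference early in the trace
        return float(values[-1] - values[-2]) if len(values) >= 2 else 0.0
    return float(0.5 * values[-3] - 2.0 * values[-2] + 1.5 * values[-1])

# a positive result indicates the value estimate is increasing (goal-conducive)
print(value_derivative(np.array([0.1, 0.3, 0.8])))
```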
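For Note 5, the following sketch illustrates one way to score stochasticity for a policy that parameterizes a continuous (here, diagonal Gaussian) action distribution via a coefficient of variation; the exact coefficient and normalization used by the toolkit may differ, and the function name is hypothetical.

```python
import numpy as np

def gaussian_stochasticity(mean: np.ndarray, std: np.ndarray) -> float:
    """Coefficient of variation (std / |mean|) averaged over action
    dimensions, as a rough continuous analogue of Leik's D."""
    eps = 1e-8  # avoid division by zero for near-zero means
    return float(np.mean(std / (np.abs(mean) + eps)))

print(gaussian_stochasticity(np.array([0.5, -1.0]), np.array([0.1, 0.5])))
```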
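For Note 6, the sketch below shows one plausible divergence-based disagreement measure for an ensemble whose members predict diagonal-Gaussian distributions (mean symmetric KL over all member pairs); the helper names are hypothetical and only illustrate the idea of replacing the point-prediction distance d with a divergence between predicted distributions.

```python
import itertools
import numpy as np

def kl_diag_gaussians(mu_p, std_p, mu_q, std_q):
    """KL divergence KL(p || q) between two diagonal Gaussians."""
    var_p, var_q = std_p ** 2, std_q ** 2
    return float(np.sum(np.log(std_q / std_p)
                        + (var_p + (mu_p - mu_q) ** 2) / (2.0 * var_q)
                        - 0.5))

def ensemble_disagreement(means, stds):
    """Mean symmetric KL over all pairs of ensemble members' predicted
    distributions; higher disagreement suggests lower familiarity."""
    divs = [0.5 * (kl_diag_gaussians(means[i], stds[i], means[j], stds[j])
                   + kl_diag_gaussians(means[j], stds[j], means[i], stds[i]))
            for i, j in itertools.combinations(range(len(means)), 2)]
    return float(np.mean(divs))
```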
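Finally, for Note 12, a minimal (quadratic-time) Dynamic Time Warping distance between two traces of per-timestep feature vectors is sketched below; it illustrates the alternative mentioned in that note, not the trace-distance computation used in our experiments.

```python
import numpy as np

def dtw_distance(trace_a: np.ndarray, trace_b: np.ndarray) -> float:
    """Plain O(n*m) DTW distance between two traces, each an array of
    per-timestep feature vectors (e.g., interestingness profiles)."""
    n, m = len(trace_a), len(trace_b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(trace_a[i - 1] - trace_b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return float(cost[n, m])
```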
References
Amir, O., Doshi-Velez, F., Sarne, D.: Summarizing agent strategies. Auton. Agent. Multi-Agent Syst. 33(5), 628–644 (2019). https://doi.org/10.1007/s10458-019-09418-w
Bellemare, M.G., Naddaf, Y., Veness, J., Bowling, M.: The arcade learning environment: an evaluation platform for general agents. J. Artif. Intell. Res. 47, 253–279 (2013). https://doi.org/10.1613/jair.3912
Bellemare, M.G., Dabney, W., Munos, R.: A distributional perspective on reinforcement learning. In: Precup, D., Teh, Y.W. (eds.) Proceedings of the 34th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 70, pp. 449–458. PMLR (2017). https://proceedings.mlr.press/v70/bellemare17a.html
Blizzard Entertainment: StarCraft II official game site (2022). https://starcraft2.com. Accessed 23 Aug 2022
Chen, T., Guestrin, C.: XGBoost: a scalable tree boosting system. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 785–794. KDD 2016, Association for Computing Machinery, New York, NY, USA (2016). https://doi.org/10.1145/2939672.2939785
Chua, K., Calandra, R., McAllister, R., Levine, S.: Deep reinforcement learning in a handful of trials using probabilistic dynamics models. In: Bengio, S., Wallach, H., Larochelle, H., Grauman, K., Cesa-Bianchi, N., Garnett, R. (eds.) Advances in Neural Information Processing Systems, vol. 31. Curran Associates, Inc. (2018). https://proceedings.neurips.cc/paper/2018/file/3de568f8597b94bda53149c7d7f5958c-Paper.pdf
Dereszynski, E., Hostetler, J., Fern, A., Dietterich, T., Hoang, T.T., Udarbe, M.: Learning probabilistic behavior models in real-time strategy games. In: Seventh Artificial Intelligence and Interactive Digital Entertainment Conference (2011)
Espeholt, L., Marinier, R., Stanczyk, P., Wang, K., Michalski, M.: SEED RL: scalable and efficient deep-RL with accelerated central inference. In: International Conference on Learning Representations (2020). https://openreview.net/forum?id=rkgvXlrKwH
Espeholt, L., et al.: IMPALA: scalable distributed deep-RL with importance weighted actor-learner architectures. In: International Conference on Machine Learning, pp. 1407–1416. PMLR (2018)
Fornberg, B.: Generation of finite difference formulas on arbitrarily spaced grids. Math. Comput. (1988). https://doi.org/10.1090/S0025-5718-1988-0935077-0
Greydanus, S., Koul, A., Dodge, J., Fern, A.: Visualizing and understanding Atari agents. In: Proceedings of the 35th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 80, pp. 1792–1801. PMLR, Stockholmsmässan, Stockholm Sweden (2018)
Haarnoja, T., Zhou, A., Abbeel, P., Levine, S.: Soft actor-critic: off-policy maximum entropy deep reinforcement learning with a stochastic actor. In: Dy, J., Krause, A. (eds.) Proceedings of the 35th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 80, pp. 1861–1870. PMLR (2018). https://proceedings.mlr.press/v80/haarnoja18b.html
Hayes, B., Shah, J.A.: Improving robot controller transparency through autonomous policy explanation. In: 2017 12th ACM/IEEE International Conference on Human-Robot Interaction (HRI), pp. 303–312 (2017)
Heuillet, A., Couthouis, F., Díaz-Rodríguez, N.: Explainability in deep reinforcement learning. Knowl. Based Syst. 214, 106685 (2021). https://doi.org/10.1016/j.knosys.2020.106685
Hoffman, R.R., Mueller, S.T., Klein, G., Litman, J.: Metrics for explainable AI: challenges and prospects (2018). https://doi.org/10.48550/ARXIV.1812.04608
Hostetler, J., Dereszynski, E., Dietterich, T., Fern, A.: Inferring strategies from limited reconnaissance in real-time strategy games. In: Conference on Uncertainty in Artificial Intelligence (UAI) (2012)
Huang, S.H., Bhatia, K., Abbeel, P., Dragan, A.D.: Establishing appropriate trust via critical states. In: 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 3929–3936 (2018). https://doi.org/10.1109/IROS.2018.8593649
Huang, S.H., Held, D., Abbeel, P., Dragan, A.D.: Enabling robots to communicate their objectives. Auton. Robot. 43(2), 309–326 (2019). https://doi.org/10.1007/s10514-018-9771-0
Janner, M., Fu, J., Zhang, M., Levine, S.: When to trust your model: model-based policy optimization. In: Wallach, H., Larochelle, H., Beygelzimer, A., d’ Alché-Buc, F., Fox, E., Garnett, R. (eds.) Advances in Neural Information Processing Systems, vol. 32. Curran Associates, Inc. (2019). https://proceedings.neurips.cc/paper/2019/file/5faf461eff3099671ad63c6f3f094f7f-Paper.pdf
Kaufman, L., Rousseeuw, P.J.: Agglomerative nesting (program agnes). In: Finding Groups in Data: An Introduction to Cluster Analysis, pp. 199–252. Wiley (1990)
Kostal, L., Marsalek, P.: Neuronal jitter: can we measure the spike timing dispersion differently? Chin. J. Physiol. 53(6), 454–464 (2010). https://doi.org/10.4077/cjp.2010.amm031
Koul, A., Fern, A., Greydanus, S.: Learning finite state representations of recurrent policy networks. In: International Conference on Learning Representations. ICLR 2019 (2019). https://openreview.net/forum?id=S1gOpsCctm
Lage, I., Lifschitz, D., Doshi-Velez, F., Amir, O.: Exploring computational user models for agent policy summarization. In: Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, pp. 1401–1407. International Joint Conferences on Artificial Intelligence Organization, California (2019). https://doi.org/10.24963/ijcai.2019/194
Leik, R.K.: A measure of ordinal consensus. Pac. Sociol. Rev. 9(2), 85–90 (1966). https://doi.org/10.2307/1388242
Liang, E., et al.: RLlib: abstractions for distributed reinforcement learning. In: Dy, J., Krause, A. (eds.) Proceedings of the 35th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 80, pp. 3053–3062. PMLR (2018). https://proceedings.mlr.press/v80/liang18b.html
Lundberg, S.M., Lee, S.I.: A unified approach to interpreting model predictions. In: Guyon, I., Luxburg, U.V., Bengio, S., Wallach, H., Fergus, R., Vishwanathan, S., Garnett, R. (eds.) Advances in Neural Information Processing Systems, vol. 30. Curran Associates, Inc. (2017). https://proceedings.neurips.cc/paper/2017/file/8a20a8621978632d76c43dfd28b67767-Paper.pdf
Madumal, P., Miller, T., Sonenberg, L., Vetere, F.: Explainable reinforcement learning through a causal lens. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 34, no. 03, pp. 2493–2500 (2020). https://doi.org/10.1609/aaai.v34i03.5631
Mnih, V., et al.: Human-level control through deep reinforcement learning. Nature 518(7540), 529–533 (2015). https://doi.org/10.1038/nature14236
Naeem, M., Rizvi, S.T.H., Coronato, A.: A gentle introduction to reinforcement learning and its application in different fields. IEEE Access 8, 209320–209344 (2020). https://doi.org/10.1109/ACCESS.2020.3038605
Olson, M.L., Khanna, R., Neal, L., Li, F., Wong, W.K.: Counterfactual state explanations for reinforcement learning agents via generative deep learning. Artif. Intell. 295, 103455 (2021). https://doi.org/10.1016/j.artint.2021.103455
Berner, C., et al.: Dota 2 with Large Scale Deep Reinforcement Learning (2019). https://doi.org/10.48550/arXiv.1912.06680
Pathak, D., Gandhi, D., Gupta, A.: Self-supervised exploration via disagreement. In: Chaudhuri, K., Salakhutdinov, R. (eds.) Proceedings of the 36th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 97, pp. 5062–5071. PMLR (2019). https://proceedings.mlr.press/v97/pathak19a.html
Pielou, E.: The measurement of diversity in different types of biological collections. J. Theor. Biol. 13, 131–144 (1966). https://doi.org/10.1016/0022-5193(66)90013-0
Puiutta, E., Veith, E.M.S.P.: Explainable reinforcement learning: a survey. In: Holzinger, A., Kieseberg, P., Tjoa, A.M., Weippl, E. (eds.) CD-MAKE 2020. LNCS, vol. 12279, pp. 77–95. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-57321-8_5
Puterman, M.L.: Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1st edn. Wiley, New York (1994)
Rousseeuw, P.J.: Silhouettes: a graphical aid to the interpretation and validation of cluster analysis. J. Comput. Appl. Math. 20, 53–65 (1987). https://doi.org/10.1016/0377-0427(87)90125-7
Salvador, S., Chan, P.: Toward accurate dynamic time warping in linear time and space. Intell. Data Anal. 11(5), 561–580 (2007). https://doi.org/10.3233/IDA-2007-11508
Schaul, T., Quan, J., Antonoglou, I., Silver, D.: Prioritized experience replay (2015). https://doi.org/10.48550/arxiv.1511.05952
Schulman, J., Wolski, F., Dhariwal, P., Radford, A., Klimov, O.: Proximal policy optimization algorithms (2017). https://doi.org/10.48550/arxiv.1707.06347
Sequeira, P., Elenius, D., Hostetler, J., Gervasio, M.: A framework for understanding and visualizing strategies of RL agents (2022). https://doi.org/10.48550/arxiv.2208.08552
Sequeira, P., Gervasio, M.: Interestingness elements for explainable reinforcement learning: understanding agents’ capabilities and limitations. Artif. Intell. 288, 103367 (2020). https://doi.org/10.1016/j.artint.2020.103367
Sequeira, P., Yeh, E., Gervasio, M.: Interestingness elements for explainable reinforcement learning through introspection. In: Joint Proceedings of the ACM IUI 2019 Workshops, p. 7. ACM (2019)
Shyam, P., Jaśkowski, W., Gomez, F.: Model-based active exploration. In: Chaudhuri, K., Salakhutdinov, R. (eds.) Proceedings of the 36th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 97, pp. 5779–5788. PMLR (2019). https://proceedings.mlr.press/v97/shyam19a.html
Silver, D., et al.: A general reinforcement learning algorithm that masters chess, shogi, and go through self-play. Science 362(6419), 1140–1144 (2018). https://doi.org/10.1126/science.aar6404
Sutton, R.S., Barto, A.G.: Reinforcement Learning: An Introduction. Adaptive Computation and Machine Learning, 2nd edn. MIT Press, Cambridge (2018)
Todorov, E., Erez, T., Tassa, Y.: Mujoco: a physics engine for model-based control. In: 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 5026–5033. IEEE (2012). https://doi.org/10.1109/IROS.2012.6386109
Vinyals, O., et al.: Grandmaster level in StarCraft II using multi-agent reinforcement learning. Nature 575(7782), 350–354 (2019). https://doi.org/10.1038/s41586-019-1724-z
Vinyals, O., et al.: StarCraft II: a new challenge for reinforcement learning. arXiv preprint arXiv:1708.04782 (2017)
van der Waa, J., van Diggelen, J., Bosch, K.V.D., Neerincx, M.: Contrastive explanations for reinforcement learning in terms of expected consequences. In: IJCAI Workshop on Explainable AI (2018). https://doi.org/10.48550/arxiv.1807.08706
Yeh, E., Sequeira, P., Hostetler, J., Gervasio, M.: Outcome-guided counterfactuals for reinforcement learning agents from a jointly trained generative latent space (2022). https://doi.org/10.48550/arxiv.2207.07710
Zahavy, T., Ben-Zrihem, N., Mannor, S.: Graying the black box: understanding DQNs. In: Proceedings of The 33rd International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 48, pp. 1899–1908. PMLR, New York, New York, USA (2016)
Acknowledgements
This material is based upon work supported by the Defense Advanced Research Projects Agency (DARPA) under Contract No. HR001119C0112. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of DARPA.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Sequeira, P., Gervasio, M. (2023). IxDRL: A Novel Explainable Deep Reinforcement Learning Toolkit Based on Analyses of Interestingness. In: Longo, L. (eds) Explainable Artificial Intelligence. xAI 2023. Communications in Computer and Information Science, vol 1901. Springer, Cham. https://doi.org/10.1007/978-3-031-44064-9_20
DOI: https://doi.org/10.1007/978-3-031-44064-9_20
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-44063-2
Online ISBN: 978-3-031-44064-9