Uncovering Strategies and Commitment Through Machine Learning System Introspection

  • Original Research
  • Published in: SN Computer Science

Abstract

Deep neural networks are naturally “black boxes”, offering little insight into how or why they make decisions. These limitations diminish the likelihood that such systems will be adopted for important tasks or trusted as teammates. We design and employ an introspective method that abstracts neural activation patterns into human-interpretable strategies and identifies relationships between environmental conditions (why), strategies (how), and performance (result) in a deep reinforcement learning two-dimensional pursuit game application. For example, we found that activation patterns abstracted into “head-on” or “L-shaped” maneuver strategies were successful and intuitively corresponded to favorable initial conditions. Moreover, we characterize machine commitment by introducing a novel measure based on analysis of time-series neural activation patterns over the course of a game, and we reveal significant correlations between machine commitment and performance. By uncovering temporally dependent machine “thought processes” and commitment through introspection, we contribute to the larger explainable artificial intelligence initiative, increasing transparency and trust in machine learning systems.
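To make the idea of a commitment measure concrete, here is a minimal, hypothetical sketch: it scores one game by the normalized Shannon entropy of the strategy-cluster labels the agent visits over the game's time steps. This is an illustrative formulation only, not the paper's actual definition, and the cluster labels are assumed inputs (e.g., from clustering per-step activation embeddings).

```python
import math
from collections import Counter

def commitment(cluster_labels, n_clusters):
    """Commitment of one game: 1 minus the normalized Shannon entropy of the
    distribution of strategy-cluster labels visited over the game.
    1.0 means the agent stayed in a single strategy cluster the whole game;
    values near 0.0 mean it switched strategies almost uniformly."""
    counts = Counter(cluster_labels)
    total = len(cluster_labels)
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    max_entropy = math.log(n_clusters)  # entropy of a uniform distribution
    return 1.0 - entropy / max_entropy

# A fully committed game vs. an indecisive one (3 hypothetical strategy clusters).
steady = commitment([0, 0, 0, 0, 0, 0], n_clusters=3)
wavering = commitment([0, 1, 2, 0, 1, 2], n_clusters=3)
assert steady == 1.0 and abs(wavering) < 1e-9
```

Under this formulation, a commitment near 1 would correspond to the stable, sustained activation patterns the paper associates with successful strategies, which could then be correlated against game outcomes.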


Data availability

The authors confirm that the data supporting the findings of this research are available within the article and its supplementary materials.


Acknowledgements

This material is based upon work supported by the Defense Advanced Research Projects Agency (DARPA) under Contract No. HR001119S0030. Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of DARPA. The authors are grateful to BAE Systems FAST Labs™ for supporting this publication.

Funding

This material is based upon work supported by the Defense Advanced Research Projects Agency (DARPA) under Contract No. HR001119S0030.

Author information


Contributions

JA conceptualized and designed the methods. JA and SS implemented methods and created visualizations. JA analyzed results and wrote the main manuscript text. JA performed project administration and secured funding. SG served as advisor. SG and SS made contributions to text via review and revision. All authors reviewed the manuscript.

Corresponding author

Correspondence to Julia Filiberti Allen.

Ethics declarations

Conflict of Interest

The authors declare no competing interests.

Ethical Approval

All principles of ethical and professional conduct have been followed. No human and/or animal research was conducted as part of this submission.

Consent for Publication

BAE Systems has approved this research for public release, unlimited distribution. Not export controlled per ES-FL-051121-0060.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material. 

42979_2023_1747_MOESM1_ESM.zip

Supplementary file 1: Data and the trained DDPG models (Actor_ddpg, Critic_ddpg, TargetActor_ddpg, and TargetCritic_ddpg) are available in a separate .zip file (“Supplementary Data.zip”). The data needed to plot the paths for each game are contained in experimentpositions.csv, where each row records the time step, pursuer position and velocity, target position and velocity, and target maximum speed. The game outcomes and conditions are available in experimentgamestats.csv, where each row records the initial distance, initial angle to the target, maximum target speed, and the game outcome (−1 for a loss by exceeding the maximum distance, 0 for a loss by exceeding the maximum time, and 1 for a successful capture, i.e., a win). The cluster assignments for each game are provided in k_means_clusters.csv. Additionally, pickled files of the actions (all_actions.p); conditions: initial angles to target (all_angles.p), initial distances to target (all_dists.p), and target maximum speeds (all_speeds.p); behaviors (all_embeddings.p); and game outcomes (all_results.p) are provided for convenience. (ZIP 903,383 KB)
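The experimentgamestats.csv schema described above can be sketched as a small loader. The column headers and sample values below are hypothetical stand-ins (the actual file may use different headers), but the outcome coding (−1/0/1) follows the description.

```python
import csv
import io

# Hypothetical in-memory sample mirroring the described experimentgamestats.csv
# layout: initial distance, initial angle to target, maximum target speed, and
# game outcome. Real column headers in the supplementary file may differ.
sample = io.StringIO(
    "initial_distance,initial_angle,max_target_speed,outcome\n"
    "120.5,45.0,0.8,1\n"
    "300.0,170.0,1.2,-1\n"
    "210.0,90.0,1.0,0\n"
)

# Outcome codes as described in the supplementary-material text.
OUTCOME_LABELS = {
    -1: "loss (maximum distance exceeded)",
    0: "loss (maximum time exceeded)",
    1: "win (successful capture)",
}

games = []
for row in csv.DictReader(sample):
    games.append({
        "initial_distance": float(row["initial_distance"]),
        "initial_angle": float(row["initial_angle"]),
        "max_target_speed": float(row["max_target_speed"]),
        "outcome": OUTCOME_LABELS[int(row["outcome"])],
    })

win_rate = sum(g["outcome"].startswith("win") for g in games) / len(games)
```

Parsing the outcome code into a label up front avoids scattering the −1/0/1 convention through downstream analysis code.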

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Allen, J.F., Schmidt, S. & Gabriel, S.A. Uncovering Strategies and Commitment Through Machine Learning System Introspection. SN COMPUT. SCI. 4, 322 (2023). https://doi.org/10.1007/s42979-023-01747-8
