DOI: 10.1145/3503161.3548260
research-article

MMDV: Interpreting DNNs via Building Evaluation Metrics, Manual Manipulation and Decision Visualization

Published: 10 October 2022

ABSTRACT

The unexplainability and untrustworthiness of deep neural networks hinder their application in high-risk fields. Existing methods lack solid evaluation metrics, interpretable models, and controllable manual manipulation. This paper presents Manual Manipulation and Decision Visualization (MMDV), which improves the interpretability of deep neural networks by putting a human in the loop. MMDV offers three unique benefits: 1) Expert-drawn CAM (Draw CAM) manipulates the key feature maps and updates the convolutional-layer parameters: a mask of the input image is built from a CAM drawn by an expert, making the model focus on and learn the important parts. 2) A hierarchical learning structure with sequential decision trees provides a decision path and gives strong interpretability to the fully connected layers of DNNs. 3) A novel metric, the Data-Model-Result interpretable evaluation (DMR metric), assesses the interpretability of the data, the model, and the results. Comprehensive experiments are conducted on pre-trained models and public datasets. The DMR metric yields scores of 0.4943, 0.5280, 0.5445, and 0.5108; these quantify the interpretability of the model and its results. The attention force ratio is about 6.5% higher than that of state-of-the-art methods; the Average Drop rate reaches 26.2% and the Average Increase rate reaches 36.6%. Under the positioning evaluation, MMDV outperforms other explainable methods on the attention force ratio. Furthermore, manual-manipulation disturbance experiments show that MMDV correctly locates the most responsive region of the target item and explains the model's internal decision-making basis. MMDV not only achieves easily understandable interpretability but also makes it possible to keep people in the loop.
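The Average Drop and Average Increase figures quoted above follow the standard CAM-evaluation protocol introduced with Grad-CAM++ (Chattopadhay et al.): run the model on the original image and on the image masked by the explanation map, then compare class confidences. The sketch below is a minimal, hypothetical illustration of that protocol; `full_scores` and `masked_scores` are stand-ins for real model outputs, not part of the paper's code.

```python
import numpy as np

def average_drop(full_scores: np.ndarray, masked_scores: np.ndarray) -> float:
    """Mean relative confidence drop (in %) when only the explained region
    is kept. Lower is better for an explanation method."""
    drop = np.maximum(0.0, full_scores - masked_scores) / full_scores
    return float(drop.mean() * 100.0)

def average_increase(full_scores: np.ndarray, masked_scores: np.ndarray) -> float:
    """Share of images (in %) whose confidence rises under the explanation
    mask. Higher is better."""
    return float((masked_scores > full_scores).mean() * 100.0)

# Toy example with made-up confidences for five images.
full = np.array([0.9, 0.8, 0.7, 0.6, 0.5])
masked = np.array([0.95, 0.4, 0.7, 0.7, 0.2])
print(average_drop(full, masked))      # 22.0
print(average_increase(full, masked))  # 40.0
```

In a real evaluation the masked image is usually the element-wise product of the input with the (normalized) CAM, and the scores are the softmax confidences for the ground-truth class.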


Supplemental Material

MM22-ff2171.mp4 (mp4, 20.8 MB)


Published in

MM '22: Proceedings of the 30th ACM International Conference on Multimedia
October 2022, 7537 pages
ISBN: 9781450392037
DOI: 10.1145/3503161
Copyright © 2022 ACM


Publisher: Association for Computing Machinery, New York, NY, United States



Acceptance Rates

Overall acceptance rate: 995 of 4,171 submissions, 24%

