ABSTRACT
The unexplainability and untrustworthiness of deep neural networks hinder their application in high-risk fields. Existing methods lack solid evaluation metrics, interpretable models, and controllable manual manipulation. This paper presents Manual Manipulation and Decision Visualization (MMDV), which places a human in the loop to improve the interpretability of deep neural networks. MMDV offers three benefits: 1) Expert-drawn CAM (Draw CAM) manipulates the key feature maps and updates the convolutional-layer parameters; by building a mask of the input image from the CAM drawn by an expert, it makes the model focus on and learn the important parts. 2) A hierarchical learning structure with sequential decision trees provides a decision path and strong interpretability for the fully connected layers of a DNN. 3) A novel metric, the Data-Model-Result interpretability evaluation (DMR metric), assesses the interpretability of the data, the model, and the results. Comprehensive experiments are conducted on pre-trained models and public datasets. The DMR metric yields scores of 0.4943, 0.5280, 0.5445, and 0.5108, quantifying the interpretability of the model and its results. The attention force ratio is about 6.5% higher than that of state-of-the-art methods, the Average Drop reaches 26.2%, and the Average Increase reaches 36.6%. Under the positioning evaluation, MMDV outperforms other explainable methods in attention force ratio. Furthermore, manual-manipulation disturbance experiments show that MMDV correctly locates the most responsive region of the target object and explains the model's internal decision-making basis. MMDV not only achieves easily understandable interpretability but also makes it possible to keep people in the loop.
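The Average Drop and Average Increase figures above can be reproduced with the definitions commonly used in the CAM-evaluation literature: mask the input with the (here, expert-drawn) CAM, re-score it, and compare confidences. The following is a minimal sketch under that assumption; the paper's exact protocol may differ, and `cam_mask`, `average_drop_increase`, and all inputs are illustrative names, not the authors' code.

```python
import numpy as np

def cam_mask(image, cam):
    """Weight each pixel of an HxWxC image by the min-max-normalized CAM."""
    cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)
    return image * cam[..., None]

def average_drop_increase(orig_conf, masked_conf):
    """Average Drop / Average Increase as commonly defined for CAM methods.

    orig_conf:   model confidences for the target class on the full images
    masked_conf: confidences on the CAM-masked images
    """
    orig = np.asarray(orig_conf, dtype=float)
    masked = np.asarray(masked_conf, dtype=float)
    # Average Drop: mean relative confidence loss, clipped at zero (in %).
    avg_drop = float(np.mean(np.maximum(0.0, orig - masked) / orig) * 100)
    # Average Increase: share of samples whose confidence rises under the mask (in %).
    avg_inc = float(np.mean(masked > orig) * 100)
    return avg_drop, avg_inc
```

Under these definitions, a good explanation keeps (or raises) the model's confidence when everything outside its highlighted region is suppressed, so a lower Average Drop and a higher Average Increase are better.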
Index Terms
- MMDV: Interpreting DNNs via Building Evaluation Metrics, Manual Manipulation and Decision Visualization