DOI: 10.1145/3447548.3467310

Mutual Information Preserving Back-propagation: Learn to Invert for Faithful Attribution

Published: 14 August 2021

Abstract

Back-propagation-based visualizations have been proposed to interpret deep neural networks (DNNs), and some of them produce interpretations with good visual quality. However, there are doubts about whether these intuitive visualizations actually reflect the network's decisions. Recent studies have confirmed this suspicion by showing that almost all of these modified back-propagation visualizations are unfaithful to the model's decision-making process. Moreover, these visualizations yield vague "relative importance scores", in which low values are not guaranteed to be independent of the final prediction. It is therefore highly desirable to develop a back-propagation method that guarantees theoretical faithfulness and produces quantitative attribution scores with a clear meaning. To achieve this goal, we resort to mutual information theory and study how much information about the output is encoded in each input neuron. The basic idea is to learn a source signal by back-propagation such that the mutual information between the input and the output is preserved as much as possible in the mutual information between the input and the source signal. To this end, we propose a Mutual Information Preserving Inverse Network, termed MIP-IN, in which the parameters of each layer are recursively trained to learn how to invert. During the inversion, the forward ReLU operation is adopted to adapt the general interpretations to the specific input. We empirically demonstrate that the inverted source signal satisfies the completeness and minimality properties, which are crucial for a faithful interpretation. Furthermore, the empirical study validates the effectiveness of the interpretations generated by MIP-IN.
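
The abstract describes an invert-and-gate scheme rather than code; the PyTorch sketch below is one way to read it for a plain feed-forward Linear+ReLU classifier: each layer gets a learned linear inverse trained to reconstruct the layer's input from its output, and the forward ReLU pattern of the specific input gates the signal during inversion. All names here (InverseLayer, fit_inverses, invert_to_input) are illustrative assumptions, not the authors' MIP-IN implementation, and the simple reconstruction loss stands in for the paper's mutual-information-preserving objective (informally, learn a source signal s such that I(x; s) stays close to I(x; y)).

```python
# Illustrative sketch only: a learned layer-wise inverse for a frozen
# Linear+ReLU MLP, gated by the forward ReLU pattern of the explained input.
import torch
import torch.nn as nn

class InverseLayer(nn.Module):
    """Learned linear inverse of one Linear+ReLU layer (assumed form)."""
    def __init__(self, out_dim: int, in_dim: int):
        super().__init__()
        self.inv = nn.Linear(out_dim, in_dim)

    def forward(self, signal: torch.Tensor, relu_mask: torch.Tensor) -> torch.Tensor:
        # Gate the back-propagated signal with the forward ReLU pattern so the
        # inversion adapts to the specific input being explained.
        return self.inv(signal * relu_mask)

def forward_with_masks(layers, x):
    """Frozen forward pass recording every activation and ReLU pattern."""
    acts, masks = [x], []
    for layer in layers:
        pre = layer(acts[-1])
        masks.append((pre > 0).float())
        acts.append(torch.relu(pre))
    return acts, masks

def fit_inverses(layers, loader, epochs=3, lr=1e-3):
    """Train one InverseLayer per forward layer with a reconstruction loss."""
    invs = nn.ModuleList([InverseLayer(l.out_features, l.in_features) for l in layers])
    opt = torch.optim.Adam(invs.parameters(), lr=lr)
    for _ in range(epochs):
        for x, _ in loader:
            with torch.no_grad():  # the classifier itself stays frozen
                acts, masks = forward_with_masks(layers, x)
            # Each inverse reconstructs its layer's input from the layer's output.
            loss = sum(((inv(acts[i + 1], masks[i]) - acts[i]) ** 2).mean()
                       for i, inv in enumerate(invs))
            opt.zero_grad(); loss.backward(); opt.step()
    return invs

def invert_to_input(invs, masks, output_signal):
    """Propagate a source signal from the output back to the input space."""
    s = output_signal
    for i in range(len(invs) - 1, -1, -1):
        s = invs[i](s, masks[i])
    return s  # attribution-like signal with the shape of the input
```

In this reading, applying invert_to_input to a class-specific output signal yields an input-space map whose entries can be inspected as attribution scores; the actual objective, architecture, and training procedure of MIP-IN should be taken from the paper itself.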

Supplementary Material

MP4 File (KDD_video_finalversion.mp4)
Presentation video for "Mutual Information Preserving Back-propagation: Learn to Invert for Faithful Attribution"


Cited By

  • (2024) Unifying Fourteen Post-Hoc Attribution Methods With Taylor Interactions. IEEE Transactions on Pattern Analysis and Machine Intelligence 46(7), 4625-4640. DOI: 10.1109/TPAMI.2024.3358410
  • (2024) An attribution graph-based interpretable method for CNNs. Neural Networks 179(C). DOI: 10.1016/j.neunet.2024.106597
  • (2023) HarsanyiNet. In Proceedings of the 40th International Conference on Machine Learning, 4804-4825. DOI: 10.5555/3618408.3618597
  • (2023) A Factor Marginal Effect Analysis Approach and Its Application in E-Commerce Search System. International Journal of Intelligent Systems 2023, 1-15. DOI: 10.1155/2023/6968854


    Published In

    KDD '21: Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining
    August 2021
    4259 pages
    ISBN: 9781450383325
    DOI: 10.1145/3447548
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery, New York, NY, United States

    Publication History

    Published: 14 August 2021


    Author Tags

    1. back-propagation techniques
    2. faithfulness
    3. model interpretation
    4. mutual information preserving

    Qualifiers

    • Research-article

    Funding Sources

    • National Natural Science Foundation of China

    Conference

    KDD '21

    Acceptance Rates

    Overall Acceptance Rate 1,133 of 8,635 submissions, 13%



    Article Metrics

    • Downloads (last 12 months): 28
    • Downloads (last 6 weeks): 3

    Reflects downloads up to 20 Feb 2025

