
Enhancing Performance of Occlusion-Based Explanation Methods by a Hierarchical Search Method on Input Images

  • Conference paper
  • In: Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD 2021)

Abstract

In this work, we address drawbacks of back-propagation-based and perturbation-based visualization methods by proposing an explanation method called Fast Multi-resolution Occlusion (FMO). Unlike back-propagation-based methods, which cannot be applied to all types of Convolutional Neural Networks (CNNs), FMO highlights important input features independently of the architecture. FMO also introduces a novel, fast occlusion strategy called multi-resolution occlusion, which not only efficiently addresses the high computational cost of the traditional Occlusion Test method but also outperforms well-known perturbation-based methods. We assess the methods on the CNNs DenseNet121, InceptionV3, InceptionResNetV2, MobileNet, and ResNet50 using three datasets: ILSVRC2012, PASCAL VOC07, and COCO14.



Author information

Correspondence to Hamed Behzadi-Khormouji.

Appendices

Appendix A: More Details of the Equations

The index j in Eq. (3) indicates the index of elements in the probability matrix \(P_{n_i \times n_i}^{R_i}\). Z is the probability that the original, unoccluded image I belongs to class index Y, and \(\hat{Z}\) is the probability assigned to class index Y when the occluded image is passed through the model. Therefore, unlike the Occlusion Test method, which records only the output probability of the occluded image, we record the normalized change in probability. As a result, each cell \([h_i^j, w_i^j]\) in the probability matrix \(P_{n_i \times n_i}^{R_i}\) holds the normalized change in probability pertaining to a region of the original image; the value in this cell expresses the importance of that region.
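As a concrete illustration, the following minimal sketch computes such a probability matrix for one resolution. It assumes the normalized change is \((Z - \hat{Z})/Z\) and that `model` maps a batch of images to class probabilities; all names (`model`, `patch`, `probability_matrix`) are illustrative assumptions, not the authors' code.

```python
import numpy as np

def probability_matrix(model, image, class_y, n, patch):
    """Occlude an n-by-n grid of regions; record the normalized probability drop per region."""
    z = model(image[None])[0, class_y]            # probability Z on the unoccluded image
    h, w = image.shape[:2]
    p = np.zeros((n, n), dtype=np.float32)
    for i in range(n):
        for j in range(n):
            occluded = image.copy()
            y0, x0 = i * h // n, j * w // n
            occluded[y0:y0 + patch, x0:x0 + patch] = 0   # zero-valued occluder patch
            z_hat = model(occluded[None])[0, class_y]    # probability Z_hat on the occluded image
            p[i, j] = (z - z_hat) / z                    # normalized change for cell [h_i^j, w_i^j]
    return p
```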

\(\gamma^{R_i}\) in Eq. (4) denotes the weight of the probability matrix at resolution \(R_i\). To view the heatmap of a single resolution \(R_i\), its weight is set to 1 and the weights of all other resolutions are set to 0. In this equation, before the weighted sum is computed, all probability matrices \(P_{n_i \times n_i}^{R_i}\) are resized to the shape of the original image.
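A minimal sketch of this aggregation follows, assuming bilinear upsampling for the resize step; `mats` and `gammas` are placeholder names for the per-resolution probability matrices and their weights. Setting one weight to 1 and the rest to 0 recovers the single-resolution heatmap described above.

```python
import numpy as np
from scipy.ndimage import zoom

def aggregate_heatmap(mats, gammas, out_shape):
    """Resize each resolution's probability matrix to the image shape, then weighted-sum."""
    heat = np.zeros(out_shape, dtype=np.float32)
    for p, g in zip(mats, gammas):
        scale = (out_shape[0] / p.shape[0], out_shape[1] / p.shape[1])
        heat += g * zoom(p, scale, order=1)   # order=1: linear interpolation to image size
    return heat
```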

Appendix B: Details of Time Consumption

Table 1 shows the average time consumption of the FMO, RISE, LIME, Extremal Perturbation, and Meaningful Perturbation methods on the models DenseNet121, InceptionResNetV2, InceptionV3, MobileNet, and ResNet50. As can be seen, the proposed method, FMO, has the lowest time consumption on all models, whereas the Occlusion Test method has the highest. For example, FMO takes 1.90 s, 4.86 s, 2.71 s, 0.59 s, and 2.70 s on DenseNet121, InceptionResNetV2, InceptionV3, MobileNet, and ResNet50, respectively, which is far less than the Occlusion Test, RISE, LIME, Extremal Perturbation, and Meaningful Perturbation methods on all five models.

Table 1. Average time consumption of the FMO, RISE, LIME, Extremal Perturbation and Meaningful Perturbation methods
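For readers who want to reproduce this kind of measurement, a minimal timing harness might look as follows; `explain` stands in for any of the compared saliency methods and `images` for the evaluation set. This is an illustrative assumption, not the authors' benchmarking code.

```python
import time

def average_runtime(explain, model, images, class_ids):
    """Average wall-clock seconds per image for one explanation method."""
    start = time.perf_counter()
    for img, y in zip(images, class_ids):
        explain(model, img, y)               # produce one saliency map
    return (time.perf_counter() - start) / len(images)
```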

Appendix C: Details of Visual Accuracy

Table 2 shows the localization accuracy of the methods on DenseNet121 and ResNet50. As can be seen, FMO outperforms the other methods in terms of both localization accuracy and time consumption on the VOC07 and COCO14 datasets.

Table 2. Localization accuracy of each method on two challenging datasets.
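One common way such localization accuracy is measured is the pointing game: a hit is scored when the heatmap's maximum falls inside the ground-truth bounding box. The sketch below implements that metric as an assumed stand-in; it is not necessarily the paper's exact evaluation protocol.

```python
import numpy as np

def pointing_game_accuracy(heatmaps, boxes):
    """Fraction of heatmaps whose peak lies inside the (x0, y0, x1, y1) ground-truth box."""
    hits = 0
    for heat, (x0, y0, x1, y1) in zip(heatmaps, boxes):
        r, c = np.unravel_index(np.argmax(heat), heat.shape)  # peak location (row, col)
        hits += int(y0 <= r <= y1 and x0 <= c <= x1)
    return hits / len(heatmaps)
```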

Figures 2 and 3 illustrate the visualization results on the VOC07 and COCO14 datasets. According to these figures, FMO and Meaningful Perturbation highlight the regions of interest more accurately than the other methods.

Fig. 2. The visualization output of the methods on the PASCAL VOC dataset with the DenseNet121 model.

Fig. 3. The visualization output of the methods on the COCO dataset with the ResNet50 model.


Copyright information

© 2021 Springer Nature Switzerland AG

About this paper


Cite this paper

Behzadi-Khormouji, H., Rostami, H. (2021). Enhancing Performance of Occlusion-Based Explanation Methods by a Hierarchical Search Method on Input Images. In: Kamp, M., et al. Machine Learning and Principles and Practice of Knowledge Discovery in Databases. ECML PKDD 2021. Communications in Computer and Information Science, vol 1524. Springer, Cham. https://doi.org/10.1007/978-3-030-93736-2_9


  • DOI: https://doi.org/10.1007/978-3-030-93736-2_9


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-93735-5

  • Online ISBN: 978-3-030-93736-2

  • eBook Packages: Computer Science, Computer Science (R0)
