Trustworthiness and efficiency have recently become crucial aspects of applied AI. The intersection of interpretability and model compression, however, still poses significant challenges: as models are compressed for efficiency, their explainability must remain a priority. In this paper, we propose a novel metric that evaluates both aspects simultaneously and helps practitioners navigate this trade-off. In particular, we study the effect that knowledge distillation, quantization, and pruning have on the Infidelity explainability metric. Our goal is for the \( Xpression \) metric to guide the optimization of compression while the model retains its infidelity robustness. Experimental results across several neural network architectures show that the proposed metric effectively combines efficiency with interpretability relative to the original model. This work advances the understanding of compression techniques and provides a framework for evaluating their implications for model interpretability.
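To make the Infidelity metric mentioned above concrete, the following is a minimal sketch and not the authors' implementation. It assumes the definition of Yeh et al. (2019), in which the infidelity of an attribution \( \Phi \) for a model \( f \) at input \( x \) is the expected squared gap between the change the attribution predicts, \( I^\top \Phi \), and the change the model actually shows, \( f(x) - f(x - I) \), under a random perturbation \( I \). The Gaussian perturbation, the toy linear model, and all names (`infidelity`, `linear_model`, `weights`) are illustrative assumptions; the \( Xpression \) formula itself is not reproduced here.

```python
import random


def infidelity(model, explanation, x, n_samples=1000, noise_std=0.1, seed=0):
    """Monte Carlo estimate of E_I[(I . explanation - (model(x) - model(x - I)))^2]."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        # Random perturbation I; a Gaussian is an assumption made for this sketch.
        pert = [rng.gauss(0.0, noise_std) for _ in x]
        # Output change that the attribution predicts for this perturbation.
        predicted_change = sum(p * e for p, e in zip(pert, explanation))
        # Output change that the model actually exhibits.
        actual_change = model(x) - model([xi - p for xi, p in zip(x, pert)])
        total += (predicted_change - actual_change) ** 2
    return total / n_samples


# Hypothetical toy model: a linear scorer, whose gradient is an exact explanation.
weights = [0.5, -1.0, 2.0]


def linear_model(x):
    return sum(w * xi for w, xi in zip(weights, x))


x = [1.0, 2.0, 3.0]
faithful = infidelity(linear_model, weights, x)            # gradient attribution
unfaithful = infidelity(linear_model, [0.0, 0.0, 0.0], x)  # uninformative attribution
print(f"faithful: {faithful:.3e}, unfaithful: {unfaithful:.3e}")
```

For the linear toy model the gradient attribution predicts every output change exactly, so its infidelity is zero up to floating-point error, while the all-zero attribution scores clearly worse. Comparing such scores before and after compression is one plausible way to read "infidelity robustness", though the paper's exact protocol is not stated in the abstract.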
The authors thank the European Commission for funding under the Horizon Europe programme MANOLO, Grant Agreement No. 101135782.
Ethics declarations
Disclosure of Interests
The authors have no competing interests to declare that are relevant to the content of this article.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Arazo, E., Stoev, H., Bosch, C., Suárez-Cetrulo, A.L., Simón-Carbajo, R. (2024). \( Xpression \): A Unifying Metric to Optimize Compression and Explainability Robustness of AI Models. In: Longo, L., Lapuschkin, S., Seifert, C. (eds) Explainable Artificial Intelligence. xAI 2024. Communications in Computer and Information Science, vol 2153. Springer, Cham. https://doi.org/10.1007/978-3-031-63787-2_19
DOI: https://doi.org/10.1007/978-3-031-63787-2_19
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-63786-5
Online ISBN: 978-3-031-63787-2
eBook Packages: Computer Science, Computer Science (R0)