Abstract
Artificial intelligence has developed rapidly, and there is growing concern about the internal workings of machine learning models. Starting from the definition of interpretability and the historical development of interpretable models, this paper surveys and analyzes existing interpretability methods along two dimensions, model type and the stage at which the explanation is produced, in light of the objectives and categories of interpretable modeling. Building on these existing methods, the paper examines their value to society and analyzes the reasons why their adoption is hindered. It then analyzes concrete applications in industry, including model debugging, feature engineering, and data collection. The paper also summarizes the shortcomings of existing interpretable models and offers suggestions to address them. Starting from the nature of interpretability, it analyzes the weaknesses of existing model evaluation indices and proposes quantitative evaluation indices derived from the definition of interpretability. Finally, the paper summarizes these findings and outlines future directions for interpretable models.
Copyright information
© 2021 IFIP International Federation for Information Processing
About this paper
Cite this paper
Lin, KY., Liu, Y., Li, L., Dou, R. (2021). A Review of Explainable Artificial Intelligence. In: Dolgui, A., Bernard, A., Lemoine, D., von Cieminski, G., Romero, D. (eds) Advances in Production Management Systems. Artificial Intelligence for Sustainable and Resilient Production Systems. APMS 2021. IFIP Advances in Information and Communication Technology, vol 633. Springer, Cham. https://doi.org/10.1007/978-3-030-85910-7_61
DOI: https://doi.org/10.1007/978-3-030-85910-7_61
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-85909-1
Online ISBN: 978-3-030-85910-7
eBook Packages: Computer Science (R0)