Abstract
The field of Artificial Intelligence is growing at a rapid pace. The application of larger and more complex algorithms has become commonplace, which makes them harder to understand. Explainability of the algorithms and models used in practice has become a necessity, because these models are widely adopted to make significant and consequential decisions. This makes it all the more important to keep our understanding of the decisions and results of AI up to date. Explainable AI methods currently address interpretability, explainability, and fairness in supervised learning methods. Far less attention has been paid to explaining the results of unsupervised learning methods. This paper proposes an extension of supervised explainability methods to handle unsupervised methods as well. We have researched and experimented with widely used clustering models to show the applicability of the proposed solution to the most commonly practiced unsupervised problems. We have also thoroughly investigated methods for validating the results of both the supervised and unsupervised explainability modules.
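The abstract does not spell out how a supervised explainability method is carried over to clustering. One plausible reading, offered here only as an illustrative sketch and not as the authors' method, is to treat cluster assignments as pseudo-labels and apply a model-agnostic supervised technique to them. The example below uses permutation importance: each feature is shuffled in turn, and the fraction of points whose cluster assignment changes is taken as that feature's importance. The function names (`kmeans`, `farthest_first`, `permutation_importance`), the deterministic farthest-first initialization, and the synthetic two-blob data are all assumptions made for illustration; only the Python standard library is used.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(point, centroids):
    """Index of the centroid nearest to the point."""
    dists = [sq_dist(point, c) for c in centroids]
    return dists.index(min(dists))


def farthest_first(points, k):
    """Deterministic initialization: start at the first point, then
    repeatedly add the point farthest from the centroids chosen so far."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        nxt = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(nxt))
    return centroids


def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means; returns (centroids, labels)."""
    centroids = farthest_first(points, k)
    for _ in range(iters):
        labels = [assign(p, centroids) for p in points]
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # leave an empty cluster's centroid unchanged
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    # Recompute labels so they match the final centroids.
    labels = [assign(p, centroids) for p in points]
    return centroids, labels


def permutation_importance(points, labels, centroids, feature, repeats=10, seed=0):
    """Mean fraction of points whose cluster assignment changes
    when the given feature column is shuffled across points."""
    rng = random.Random(seed)
    drops = []
    for _ in range(repeats):
        column = [p[feature] for p in points]
        rng.shuffle(column)
        shuffled = [p[:feature] + [v] + p[feature + 1:]
                    for p, v in zip(points, column)]
        agree = sum(assign(p, centroids) == l
                    for p, l in zip(shuffled, labels)) / len(points)
        drops.append(1.0 - agree)
    return sum(drops) / repeats


# Synthetic data (hypothetical): two blobs separated along feature 0 only;
# feature 1 is noise shared by both blobs.
rng = random.Random(42)
data = ([[rng.gauss(0, 0.5), rng.gauss(0, 0.5)] for _ in range(50)]
        + [[rng.gauss(5, 0.5), rng.gauss(0, 0.5)] for _ in range(50)])

centroids, labels = kmeans(data, k=2)
importances = [permutation_importance(data, labels, centroids, f)
               for f in range(2)]
print("importance per feature:", importances)
```

In this setup, the feature that actually separates the blobs should receive a clearly higher score than the noise feature. In practice the same idea could be applied with a trained surrogate classifier and an established explainer such as LIME or SHAP in place of the hand-written pieces used here.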
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Arora, M., Chopra, A. (2023). Explainability for Clustering Models. In: Yusoff, M., Hai, T., Kassim, M., Mohamed, A., Kita, E. (eds) Soft Computing in Data Science. SCDS 2023. Communications in Computer and Information Science, vol 1771. Springer, Singapore. https://doi.org/10.1007/978-981-99-0405-1_1
DOI: https://doi.org/10.1007/978-981-99-0405-1_1
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-0404-4
Online ISBN: 978-981-99-0405-1
eBook Packages: Computer Science (R0)