Abstract
In recent years, several methods have been developed for understanding the outputs of machine learning models; SHAP and LIME are two well-known examples. They provide individual explanations, based on feature importance, for each instance. While these local explanations achieve remarkable fidelity, understanding a model's decisions globally remains a complex task. Methods such as LIME have been extended to address this challenge by aggregating individual explanations: the problem is expressed as a submodular optimization, a bottom-up approach that picks a set of individual explanations illustrating the global behavior of the model while avoiding redundancy. In this paper, we propose CoSP (Co-Selection Pick), a framework that provides global explainability for any black-box model by selecting individual explanations through a similarity-preserving approach. Unlike submodular optimization, our method casts the problem as a co-selection task, jointly selecting instances and features over the explanations produced by any explainer. The proposed framework is more generic, since the co-selection can be performed in supervised or unsupervised scenarios and over explanations provided by any local explainer. Preliminary experimental results validate our proposal.
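To make the pipeline described above concrete, the following is a minimal, hypothetical Python sketch of the general idea: build a matrix of local explanations (instances × features) with LIME, then co-select representative instances and salient features from that matrix. This is not the authors' implementation; the scoring heuristics below (total attribution mass for features, a greedy coverage rule for instances) are illustrative stand-ins for the paper's similarity-preserving co-selection, and the model, dataset, and sample sizes are assumptions.

```python
# Hypothetical sketch of the explain-then-co-select pipeline (not the
# paper's CoSP algorithm): local explanations are stacked into a matrix W,
# then instances and features are selected jointly from W.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from lime.lime_tabular import LimeTabularExplainer

data = load_breast_cancer()
X, y = data.data, data.target
model = RandomForestClassifier(random_state=0).fit(X, y)

explainer = LimeTabularExplainer(X, feature_names=list(data.feature_names),
                                 discretize_continuous=False)

n_samples, n_features = 20, X.shape[1]
W = np.zeros((n_samples, n_features))          # local explanation matrix
for i in range(n_samples):
    exp = explainer.explain_instance(X[i], model.predict_proba,
                                     num_features=n_features)
    for j, weight in exp.as_map()[1]:          # (feature index, weight) pairs
        W[i, j] = weight

# Feature side: rank features by total attribution mass across instances
# (an illustrative criterion, not the similarity-preserving objective).
feature_scores = np.abs(W).sum(axis=0)
top_features = np.argsort(feature_scores)[::-1][:5]

# Instance side: greedily pick explanations that cover not-yet-covered
# high-weight features, a coverage heuristic in the spirit of submodular pick.
picked, covered = [], np.zeros(n_features, dtype=bool)
for _ in range(5):
    gains = [np.abs(W[i])[~covered].sum() if i not in picked else -1.0
             for i in range(n_samples)]
    best = int(np.argmax(gains))
    picked.append(best)
    covered |= np.abs(W[best]) > np.abs(W[best]).mean()

print("selected features:", [data.feature_names[j] for j in top_features])
print("selected instances:", picked)
```

In this sketch the two selection steps are independent; the point of a co-selection approach is to couple them, so that the chosen instances and features jointly preserve the similarity structure of the explanation matrix.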
Availability of data and materials
Not applicable.
Funding
Not applicable.
Author information
Contributions
1- Dou El Kefel Mansouri: Methodology, Software, Validation, Formal analysis, Investigation, Writing - Original draft preparation.
2- Seif-Eddine Benkabou: Conceptualization of this study, Methodology, Software, Validation, Formal analysis, Investigation, Writing - Original draft preparation.
3- Khaoula Meddahi: Conceptualization of this study, Methodology, Software, Validation, Formal analysis, Investigation, Data curation, Writing - Original draft preparation.
4- Allel Hadjali: Methodology, Investigation, Visualization.
5- Amin Mesmoudi: Software, Methodology, Investigation, Visualization.
6- Khalid Benabdeslem: Methodology, Investigation, Visualization.
7- Souleyman Chaib: Methodology, Investigation, Visualization.
All authors reviewed the manuscript.
Ethics declarations
Conflicts of interest
The authors declare that they have no conflict of interest.
Ethical Approval
Not applicable.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
This article belongs to the Topical Collection: Special Issue on Web Information Systems Engineering 2022
Guest Editors: Richard Chbeir, Helen Huang, Yannis Manolopoulos and Fabrizio Silvestri.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Mansouri, D.E.K., Benkabou, SE., Meddahi, K. et al. CoSP: co-selection pick for a global explainability of black box machine learning models. World Wide Web 26, 3965–3981 (2023). https://doi.org/10.1007/s11280-023-01213-8