CoSP: co-selection pick for a global explainability of black box machine learning models

Published in: World Wide Web

Abstract

Recently, several methods for understanding the outputs of machine learning models have been developed; SHAP and LIME are two well-known examples. These methods provide individual explanations, based on feature importance, for each instance. While individual explanations have achieved remarkable results, understanding a model's decisions globally remains a complex task. Methods like LIME have been extended to address this complexity by building on individual explanations: the problem is expressed as a submodular optimization, and a bottom-up algorithm aims to provide a global explanation by picking a group of individual explanations that illustrate the global behavior of the model while avoiding redundancy. In this paper, we propose CoSP (Co-Selection Pick), a framework that enables global explainability of any black-box model by selecting individual explanations based on a similarity-preserving approach. Unlike submodular optimization, our method treats the problem as a co-selection task: it performs a joint selection of instances and features over the explanations provided by any explainer. The proposed framework is more generic in that the co-selection can be performed in both supervised and unsupervised scenarios, and over explanations produced by any local explainer. Preliminary experimental results validate our proposal.
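To make the two strategies contrasted in the abstract concrete, the sketch below assumes a matrix `W` of absolute explanation weights (one row per instance, one column per feature) produced by any local explainer such as LIME or SHAP. `submodular_pick` is a greedy, SP-LIME-style baseline that picks a small, non-redundant set of explanations; `co_select` is a deliberately simplified, hypothetical stand-in for the co-selection idea (the actual CoSP formulation uses a similarity-preserving objective not reproduced here). The function names and scoring heuristics are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def submodular_pick(W, budget):
    """Greedily pick `budget` instances whose explanations jointly
    cover the globally important features (SP-LIME-style baseline).

    W : (n_instances, n_features) array of explanation weights.
    """
    # Global importance of each feature, aggregated over all explanations.
    importance = np.sqrt(np.abs(W).sum(axis=0))
    covered = np.zeros(W.shape[1], dtype=bool)
    picked = []
    for _ in range(budget):
        best, best_gain = None, -1.0
        for i in range(W.shape[0]):
            if i in picked:
                continue
            # Marginal coverage gain: importance of features this
            # explanation touches that are not yet covered.
            gain = importance[(~covered) & (np.abs(W[i]) > 0)].sum()
            if gain > best_gain:
                best, best_gain = i, gain
        picked.append(best)
        covered |= np.abs(W[best]) > 0
    return picked

def co_select(W, n_inst, n_feat):
    """Toy joint instance/feature selection over an explanation matrix:
    score features by column norm, then score instances by their
    explanation mass on the selected features."""
    feat_idx = np.argsort(-np.linalg.norm(W, axis=0))[:n_feat]
    inst_idx = np.argsort(-np.abs(W[:, feat_idx]).sum(axis=1))[:n_inst]
    return inst_idx, feat_idx
```

In both cases the input is the same stack of local explanations; the difference the paper emphasizes is that the co-selection view returns instances and features together, rather than only a set of representative explanations.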



Availability of data and materials

Not applicable.

Notes

  1. https://www.cs.jhu.edu/~mdredze/datasets/sentiment/


Funding

Not applicable.

Author information

Authors and Affiliations

Authors

Contributions

1- Dou El Kefel Mansouri: Methodology, Software, Validation, Formal analysis, Investigation, Writing - original draft preparation.

2- Seif-Eddine Benkabou: Conceptualization of this study, Methodology, Software, Validation, Formal analysis, Investigation, Writing - original draft preparation.

3- Khaoula Meddahi: Conceptualization of this study, Methodology, Software, Validation, Formal analysis, Investigation, Data curation, Writing - original draft preparation.

4- Allel Hadjali: Methodology, Investigation, Visualization.

5- Amin Mesmoudi: Software, Methodology, Investigation, Visualization.

6- Khalid Benabdeslem: Methodology, Investigation, Visualization.

7- Souleyman Chaib: Methodology, Investigation, Visualization.

All authors reviewed the manuscript.

Corresponding author

Correspondence to Dou El Kefel Mansouri.

Ethics declarations

Conflicts of interest

The authors declare that they have no conflict of interest.

Ethical Approval

Not applicable.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article belongs to the Topical Collection: Special Issue on Web Information Systems Engineering 2022

Guest Editors: Richard Chbeir, Helen Huang, Yannis Manolopoulos and Fabrizio Silvestri.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Mansouri, D.E.K., Benkabou, SE., Meddahi, K. et al. CoSP: co-selection pick for a global explainability of black box machine learning models. World Wide Web 26, 3965–3981 (2023). https://doi.org/10.1007/s11280-023-01213-8
