OnML: an ontology-based approach for interpretable machine learning

Published in: Journal of Combinatorial Optimization

Abstract

In this paper, we introduce a novel interpretation framework that learns an interpretable model, based on an ontology-based sampling technique, to explain the predictions of black-box models in a model-agnostic way. Unlike existing approaches, our algorithm considers contextual correlations among words, described in domain-knowledge ontologies, to generate semantic explanations. To narrow down the search space for explanations, which is exponentially large for long and complicated text data, we design a learnable anchor algorithm that better extracts local, domain-knowledge-oriented explanations. A set of rules is further introduced, combining the learned interpretable representations with anchors and information extraction, to generate comprehensible semantic explanations. To carry out an extensive experimental evaluation, we first develop a drug abuse ontology (DAO) for a drug abuse dataset drawn from the Twittersphere and a consumer complaint ontology (ConsO) for a consumer complaint dataset, both built specifically for interpretable ML. Our experimental results show that our approach generates more precise and more insightful explanations than a variety of baseline approaches.
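As an illustration of the ontology-guided, anchor-style sampling the abstract describes, the sketch below shows the general idea in Python. This is a minimal sketch, not the authors' implementation (which is linked under Code availability): the toy ontology, the `black_box_predict` stand-in, and every other name here are hypothetical, and a simple keyword classifier is assumed in place of a real black-box model.

```python
# Illustrative sketch of ontology-guided anchor scoring.
# NOT the authors' implementation (see Code availability); the ontology,
# the black-box model, and all names below are hypothetical stand-ins.
import itertools
import random

# Toy domain ontology: surface words -> domain concepts.
ONTOLOGY = {
    "overdose": "DrugAbuse",
    "oxycodone": "OpioidDrug",
    "charged": "BillingIssue",
    "refund": "BillingIssue",
}

def black_box_predict(tokens):
    """Hypothetical black-box classifier: flags drug-related text."""
    return int(any(ONTOLOGY.get(t) in ("DrugAbuse", "OpioidDrug") for t in tokens))

def perturb(tokens, anchor_idx, n_samples=100, mask="<UNK>"):
    """Sample perturbed texts that keep the anchor tokens fixed and
    randomly mask the remaining positions (anchor-style sampling)."""
    return [
        [t if i in anchor_idx or random.random() < 0.5 else mask
         for i, t in enumerate(tokens)]
        for _ in range(n_samples)
    ]

def anchor_precision(tokens, anchor_idx, label):
    """Fraction of perturbed samples on which the model keeps its label."""
    samples = perturb(tokens, anchor_idx)
    return sum(black_box_predict(s) == label for s in samples) / len(samples)

def best_ontology_anchor(text, max_len=2):
    """Search anchors only among ontology-covered tokens, narrowing the
    exponentially large search space to domain-relevant candidates."""
    tokens = text.lower().split()
    label = black_box_predict(tokens)
    candidates = [i for i, t in enumerate(tokens) if t in ONTOLOGY]
    best, best_prec = None, -1.0
    for k in range(1, max_len + 1):
        for combo in itertools.combinations(candidates, k):
            prec = anchor_precision(tokens, set(combo), label)
            if prec > best_prec:
                best, best_prec = combo, prec
    return ([tokens[i] for i in best] if best else [], best_prec)

if __name__ == "__main__":
    anchor, prec = best_ontology_anchor("he survived an oxycodone overdose last night")
    print(f"anchor={anchor}, precision={prec:.2f}")
```

Restricting candidate anchors to ontology-covered tokens is what keeps the search tractable: rather than enumerating all subsets of a long text, only combinations of domain-relevant words are scored against the black-box model.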

Data availability

Data is available at https://www.consumerfinance.gov/data-research/consumer-complaints/ (see also Note 2).

Notes

  1. https://www.consumerfinance.gov/data-research/consumer-complaints/

  2. https://github.com/PhungLai728/OnML

Acknowledgements

The authors gratefully acknowledge the support of National Science Foundation (NSF) Grants CNS-1747798 and CNS-1850094 and of the NSF Center for Big Learning (Oregon), as well as financial support from Wells Fargo. We also thank Nisansa de Silva (University of Oregon, USA) for valuable discussions and Anuja Badeti (New Jersey Institute of Technology, USA) for help with data processing. Pelin Ayranci and Phung Lai are co-first authors.

Author information

Authors and Affiliations

Authors

Contributions

PA: designed and implemented the algorithms; wrote the paper. PL: designed and implemented the algorithms; wrote the paper. NP: ensured that the descriptions are accurate and agreed upon by all authors; provided detailed supervision of this work. DN: provided detailed supervision of this work. AK: provided detailed supervision of this work. DD: provided detailed supervision of this work. HH: data curation, processing, and analysis.

Corresponding authors

Correspondence to Pelin Ayranci or Phung Lai.

Ethics declarations

Conflict of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Code availability

Code is available at https://github.com/PhungLai728/OnML

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

An earlier version of this paper was presented at the IJCNN Conference.

About this article

Cite this article

Ayranci, P., Lai, P., Phan, N. et al. OnML: an ontology-based approach for interpretable machine learning. J Comb Optim 44, 770–793 (2022). https://doi.org/10.1007/s10878-022-00856-z
