Incorporation of Data-Mined Knowledge into Black-Box SVM for Interpretability

Authors:

Ping Zhang

ACM Transactions on Intelligent Systems and Technology, Volume 14, Issue 1

Article No.: 6, Pages 1 - 22

https://doi.org/10.1145/3548775

Published: 09 November 2022

Abstract

A lack of interpretability often makes black-box models difficult to apply in practical domains. This work therefore proposes incorporating data-mined knowledge into the black-box soft-margin SVM, working from the input side of the model, to improve both accuracy and interpretability. The concept of data-mined knowledge and a mechanism for incorporating it are developed in turn. On this basis, a partially interpretable soft-margin SVM (pTsm-SVM) optimization model is designed and then solved by reformulating it as a standard quadratic program. An algorithm for mining linear positive-class (negative-class) knowledge from general data sets is also proposed. It generates a linear two-dimensional discriminative rule with specificity (sensitivity) equal to 1 and the highest possible sensitivity (specificity) among all two-dimensional feature spaces. The knowledge-integrated pTsm-SVM works by trading off “large margin”, “high specificity”, and “high sensitivity”. Experimental results on eight UCI datasets show that the proposed pTsm-SVM outperforms the standard soft-margin SVM in both accuracy and interpretability.
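To make the sensitivity and specificity terms concrete, the following is a minimal sketch of how a linear two-dimensional rule could be scored on labeled data. It is not the paper's mining algorithm. The function name `rule_metrics`, the toy data, and the rule coefficients are all illustrative assumptions. A positive-class rule with specificity equal to 1 never fires on a negative sample, so every sample it flags is truly positive.

```python
def rule_metrics(X, y, i, j, w, b):
    """Score the rule "w[0]*x_i + w[1]*x_j + b > 0 implies positive class".

    X is a list of feature rows, y holds labels (+1 positive, -1 negative),
    and i, j select the two features the rule uses. All names are hypothetical.
    """
    fires = [w[0] * row[i] + w[1] * row[j] + b > 0 for row in X]
    # True positives: the rule fires on a positive sample.
    tp = sum(f and label == 1 for f, label in zip(fires, y))
    # True negatives: the rule stays silent on a negative sample.
    tn = sum((not f) and label != 1 for f, label in zip(fires, y))
    n_pos = sum(label == 1 for label in y)
    n_neg = len(y) - n_pos
    sensitivity = tp / n_pos  # share of positives the rule captures
    specificity = tn / n_neg  # share of negatives the rule leaves alone
    return sensitivity, specificity


# Toy data (made up): three positive and three negative samples, three features.
X = [
    [3.0, 3.0, 0.1],
    [2.5, 2.0, 0.9],
    [0.5, 1.0, 0.4],
    [1.0, 0.5, 0.7],
    [0.2, 0.2, 0.3],
    [1.5, 1.5, 0.2],
]
y = [1, 1, 1, -1, -1, -1]

# Candidate rule on features 0 and 1: x_0 + x_1 - 3.5 > 0 implies positive.
sens, spec = rule_metrics(X, y, 0, 1, (1.0, 1.0), -3.5)
print(f"sensitivity={sens:.3f}, specificity={spec:.3f}")
```

On this toy data the rule fires on two of the three positive samples and on none of the negative ones, so its specificity is 1 while its sensitivity is 2/3. A search in the spirit of the abstract would look, over feature pairs and rule coefficients, for the rule that keeps specificity at 1 and pushes sensitivity as high as possible.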

Index Terms

Theory of computation → Design and analysis of algorithms → Mathematical optimization → Continuous optimization → Convex optimization

Published In

ACM Transactions on Intelligent Systems and Technology, Volume 14, Issue 1

February 2023

487 pages

ISSN: 2157-6904

EISSN: 2157-6912

DOI: 10.1145/3570136

Editor: Huan Liu, Arizona State University, USA

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 09 November 2022

Online AM: 14 July 2022

Accepted: 05 July 2022

Revised: 09 March 2022

Received: 06 August 2021

Published in TIST Volume 14, Issue 1

Qualifiers

Research-article
Refereed

Funding Sources

National Natural Science Foundation of China
Zhejiang Provincial Natural Science Foundation of China

Cited By

Sayeed M, Rahman A, Rahman A, Rois R. (2024). On the interpretability of the SVM model for predicting infant mortality in Bangladesh. Journal of Health, Population and Nutrition 43:1. Online publication date: 27-Oct-2024.
https://doi.org/10.1186/s41043-024-00646-9
Madhusudhanan S, Jose A, Sahoo J, Malekian R. (2024). PRIMϵ: Novel Privacy-Preservation Model With Pattern Mining and Genetic Algorithm. IEEE Transactions on Information Forensics and Security 19, 571-585. Online publication date: 1-Jan-2024.
https://dl.acm.org/doi/10.1109/TIFS.2023.3324769
Al-Akayleh F, Al-Remawi M, Ali Agha A. (2024). AI-Driven Physical Rehabilitation Strategies in Post-Cancer Care. 2024 2nd International Conference on Cyber Resilience (ICCR), 1-6. Online publication date: 26-Feb-2024.
https://doi.org/10.1109/ICCR61006.2024.10532883
Wang N, Lin X, Liu X, Zhang Z, Yang Y. (2024). Research on Intelligent Semantic Retrieval System for Large-Scale Biomedical Texts. 2024 IEEE 2nd International Conference on Control, Electronics and Computer Technology (ICCECT), 427-430. Online publication date: 26-Apr-2024.
https://doi.org/10.1109/ICCECT60629.2024.10546060
Rahimzadeh G, Pławiak P, Mohamed S, Lacy K, Nahavandi D, Tadeusiewicz R, Asadi H. (2024). Significance of Physiological Signal Thresholds in the Early Diagnosis of Simulator Sickness. IEEE Access 12, 141685-141704. Online publication date: 2024.
https://doi.org/10.1109/ACCESS.2024.3467920
Yang L, Chen Y, Mei L, Hao Y, Li L, Huang H, Zhang Y, Zhang L, Yang L. (2024). Prediction method for response characteristics parameters of isolated-span overhead lines after ice-shedding based on finite element simulation and machine learning. Electric Power Systems Research 229, 110141. Online publication date: Apr-2024.
https://doi.org/10.1016/j.epsr.2024.110141
Cao G, Yang L, Ni P. (2022). Electroencephalogram Emotion Recognition Based on Individual Frontal Asymmetry Hypothesis. 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), 1994-2001. Online publication date: 6-Dec-2022.
https://doi.org/10.1109/BIBM55620.2022.9995216
Han S, Jin Z, Deji D, Han T, Zhang Y, Feng M, Hasi W. (2022). Study on the classification and identification of various carbonate and sulfate mineral medicines based on Raman spectroscopy combined with PCA-SVM algorithm. Analytical Sciences 39:2, 241-248. Online publication date: 16-Dec-2022.
https://doi.org/10.1007/s44211-022-00224-1