
Incorporation of Data-Mined Knowledge into Black-Box SVM for Interpretability

Published: 09 November 2022

Abstract

A lack of interpretability often makes black-box models difficult to apply in many practical domains. For this reason, the current work proposes to incorporate data-mined knowledge into the black-box soft-margin SVM model, at the input side of the model, to enhance both accuracy and interpretability. The concept of data-mined knowledge and the mechanism for incorporating it are developed in turn; on this basis, a partially interpretable soft-margin SVM (pTsm-SVM) optimization model is designed and then solved by reformulating the optimization problem as a standard quadratic program. An algorithm for mining linear positive (negative) class knowledge from general data sets is also proposed; it generates a linear two-dimensional discriminative rule with specificity (sensitivity) equal to 1 and the highest possible sensitivity (specificity) among all two-dimensional feature spaces. The knowledge-integrated pTsm-SVM works by achieving a good trade-off among “large margin”, “high specificity”, and “high sensitivity”. Experimental results on eight UCI datasets show that the proposed pTsm-SVM outperforms the standard soft-margin SVM in terms of both accuracy and interpretability.
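
The kind of rule described in the abstract (a linear rule on two features that no negative sample satisfies, chosen to cover as many positive samples as possible) can be illustrated with a minimal sketch. This is not the authors' mining algorithm: it is a brute-force search over feature pairs and a fixed grid of directions, and the names `mine_positive_rule`, `rule_fires`, and `n_angles` are hypothetical, introduced only for illustration. Labels are assumed to be +1 and -1.

```python
import itertools
import math


def mine_positive_rule(X, y, n_angles=36):
    """Search feature pairs (i, j) and directions (w1, w2) for a rule
    w1*x[i] + w2*x[j] > t that no negative training sample satisfies
    (specificity 1) and that covers the most positive samples.
    Illustrative brute force only; assumes labels are +1 / -1."""
    pos = [x for x, label in zip(X, y) if label == 1]
    neg = [x for x, label in zip(X, y) if label == -1]
    if not pos or not neg:
        raise ValueError("need at least one sample of each class")

    best = None
    for i, j in itertools.combinations(range(len(X[0])), 2):
        for k in range(n_angles):
            theta = 2.0 * math.pi * k / n_angles
            w1, w2 = math.cos(theta), math.sin(theta)
            # Threshold at the largest negative score, so no negative exceeds it.
            t = max(w1 * x[i] + w2 * x[j] for x in neg)
            covered = sum(1 for x in pos if w1 * x[i] + w2 * x[j] > t)
            sensitivity = covered / len(pos)
            # Keep the first rule found with the highest sensitivity.
            if best is None or sensitivity > best["sensitivity"]:
                best = {
                    "features": (i, j),
                    "w": (w1, w2),
                    "threshold": t,
                    "sensitivity": sensitivity,
                }
    return best


def rule_fires(rule, x):
    """Return True if the mined rule labels sample x as positive."""
    i, j = rule["features"]
    w1, w2 = rule["w"]
    return w1 * x[i] + w2 * x[j] > rule["threshold"]


# Toy data: three negatives followed by three positives, three features each.
X = [
    [0.0, 0.0, 5.0],
    [1.0, 0.0, 1.0],
    [0.0, 1.0, 3.0],
    [3.0, 3.0, 2.0],
    [4.0, 2.0, 4.0],
    [2.0, 4.0, 0.0],
]
y = [-1, -1, -1, 1, 1, 1]

rule = mine_positive_rule(X, y)
print(rule["features"], rule["sensitivity"])
```

Because the threshold is set to the largest score among the negative samples, specificity equals 1 on the training data by construction; the search then only has to compare sensitivities. A negative-class rule follows by swapping the roles of the two classes.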

    Published In

    ACM Transactions on Intelligent Systems and Technology, Volume 14, Issue 1
    February 2023
    487 pages
    ISSN: 2157-6904
    EISSN: 2157-6912
    DOI: 10.1145/3570136
    Editor: Huan Liu

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 09 November 2022
    Online AM: 14 July 2022
    Accepted: 05 July 2022
    Revised: 09 March 2022
    Received: 06 August 2021
    Published in TIST Volume 14, Issue 1

    Author Tags

    1. Knowledge
    2. data-mined
    3. black-box
    4. interpretability
    5. soft-margin SVM

    Qualifiers

    • Research-article
    • Refereed

    Funding Sources

    • National Natural Science Foundation of China
    • Zhejiang Provincial Natural Science Foundation of China
