Encoding and decoding the knowledge of association rules over SVM classification trees

Pang, Shaoning; Kasabov, Nikola

doi:10.1007/s10115-008-0147-1

Regular Paper
Published: 24 June 2008

Volume 19, pages 79–105, (2009)

Knowledge and Information Systems

Shaoning Pang¹ &
Nikola Kasabov¹


Abstract

This paper presents a constructive method for association rule extraction in which the knowledge in the data is encoded into an SVM classification tree (SVMT), and linguistic association rules are extracted by decoding the trained SVMT. The method of rule extraction over the SVMT (SVMT-rule), in the spirit of decision-tree rule extraction, extracts rules not only from the SVMs but also over the decision-tree structure of the SVMT. The rules obtained from SVMT-rule therefore have the better comprehensibility of decision-tree rules while retaining the good classification accuracy of SVMs. Moreover, because the SVMT aggregates a group of SVMs and so generalizes strongly, SVMT-rule can perform very robust classification on datasets with seriously, even overwhelmingly, class-imbalanced data distributions. Experiments with a synthetic Gaussian dataset, seven benchmark cancer diagnosis datasets, and one cell-phone fraud detection application highlight the utility of SVMT and SVMT-rule for comprehensible and effective knowledge discovery, as well as the superior properties of SVMT-rule compared to purely support-vector-based rule extraction. (A version of the SVMT Matlab software is available online at http://kcir.kedri.info)
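
The abstract describes decoding a trained tree into linguistic rules "in the spirit of decision-tree rule extraction". The sketch below illustrates only that generic idea: walk every root-to-leaf path and emit one IF-THEN rule per leaf. It is not the authors' SVMT-rule algorithm. The `Leaf` and `Node` classes, the feature names, and the single-feature threshold tests (used here as a stand-in for an SVM decision function at each node) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass
class Leaf:
    label: str  # class assigned to samples that reach this leaf


@dataclass
class Node:
    feature: str      # hypothetical feature name
    threshold: float  # assumed stand-in for an SVM decision boundary
    low: "Tree"       # subtree for feature <= threshold
    high: "Tree"      # subtree for feature > threshold


Tree = Union[Leaf, Node]


def decode_rules(tree: Tree, path: Tuple[str, ...] = ()) -> List[str]:
    """Return one IF-THEN rule per leaf by collecting the tests along its path."""
    if isinstance(tree, Leaf):
        condition = " AND ".join(path) if path else "TRUE"
        return [f"IF {condition} THEN class = {tree.label}"]
    low_test = f"{tree.feature} <= {tree.threshold}"
    high_test = f"{tree.feature} > {tree.threshold}"
    return (decode_rules(tree.low, path + (low_test,))
            + decode_rules(tree.high, path + (high_test,)))


# Toy tree with made-up features and thresholds, for illustration only.
toy_tree = Node("x1", 0.5, Leaf("A"), Node("x2", 1.0, Leaf("A"), Leaf("B")))
rules = decode_rules(toy_tree)
for rule in rules:
    print(rule)
```

In a tree of SVMs, each node's test would presumably come from the node's trained classifier rather than a fixed threshold, so the extracted conditions would need a separate step to become readable. That step is what the paper addresses and is not reproduced here.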

Author information

Authors and Affiliations

Knowledge Engineering and Discovery Research Institute, Auckland University of Technology, Private Bag 92006, Auckland, 1020, New Zealand
Shaoning Pang & Nikola Kasabov

Corresponding author

Correspondence to Shaoning Pang.

About this article

Cite this article

Pang, S., Kasabov, N. Encoding and decoding the knowledge of association rules over SVM classification trees. Knowl Inf Syst 19, 79–105 (2009). https://doi.org/10.1007/s10115-008-0147-1

Received: 15 May 2007
Revised: 16 March 2008
Accepted: 25 April 2008
Published: 24 June 2008
Issue Date: April 2009
DOI: https://doi.org/10.1007/s10115-008-0147-1
