
Artificial bee colony-based support vector machines with feature selection and parameter optimization for rule extraction

  • Regular Paper
  • Published in: Knowledge and Information Systems

Abstract

The support vector machine (SVM) is a state-of-the-art classification tool that achieves good accuracy thanks to its ability to build nonlinear models. However, the nonlinear models it generates are typically regarded as incomprehensible black boxes, and this lack of explanatory ability is a serious problem for practical SVM applications that require comprehensibility. Therefore, this study applies a C5 decision tree (DT) to extract rules from the SVM results. In addition, a metaheuristic algorithm is employed for feature selection, since both the SVM and the C5 DT are computationally expensive and applying the two algorithms together on high-dimensional data further increases the computational cost. Specifically, the artificial bee colony (ABC) algorithm is used to select the important features and to optimize the SVM parameters before the SVM–DT stage, and the resulting ABC–SVM–DT algorithm extracts comprehensible rules from SVMs. The proposed algorithm is evaluated on eight datasets. The results show that, compared with the genetic algorithm and particle swarm optimization, the proposed ABC–SVM–DT algorithm can simultaneously improve classification accuracy and reduce the complexity of the final decision tree.
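To make the workflow concrete, the sketch below shows one plausible reading of the ABC–SVM–DT pipeline in Python. It is a minimal illustration, not the authors' implementation: it assumes scikit-learn, substitutes CART (DecisionTreeClassifier) for the C5 decision tree, reduces ABC to a simplified employed-bee search without onlooker or scout phases, and uses the breast cancer dataset as a stand-in for the paper's benchmarks. The solution encoding (a feature mask plus log-scaled C and gamma) is likewise an assumption made for illustration.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_breast_cancer(return_X_y=True)
rng = np.random.default_rng(0)
n_features = X.shape[1]
dim = n_features + 2                                  # feature mask + log10(C) + log10(gamma)

def fitness(sol):
    """Score a candidate: 3-fold CV accuracy of an SVM on the selected features."""
    mask = sol[:n_features] > 0.5
    if not mask.any():
        return 0.0
    C, gamma = 10 ** sol[-2], 10 ** sol[-1]
    svm = SVC(C=C, gamma=gamma, kernel="rbf")
    return cross_val_score(svm, X[:, mask], y, cv=3).mean()

# Initialize a small population of food sources (candidate solutions).
foods = np.c_[rng.random((10, n_features)), rng.uniform(-2, 2, (10, 2))]
scores = np.array([fitness(f) for f in foods])

# Simplified employed-bee phase: perturb one dimension toward a random
# neighbour and keep the candidate only if its fitness improves (greedy selection).
for _ in range(20):
    for i in range(len(foods)):
        k = rng.integers(len(foods))
        j = rng.integers(dim)
        cand = foods[i].copy()
        cand[j] += rng.uniform(-1, 1) * (foods[i][j] - foods[k][j])
        s = fitness(cand)
        if s > scores[i]:
            foods[i], scores[i] = cand, s

best = foods[scores.argmax()]
mask = best[:n_features] > 0.5

# Rule extraction: train the tuned SVM on the selected features, then fit a
# decision tree to the SVM's *predictions* so the tree mimics the SVM.
svm = SVC(C=10 ** best[-2], gamma=10 ** best[-1], kernel="rbf").fit(X[:, mask], y)
dt = DecisionTreeClassifier(max_depth=4).fit(X[:, mask], svm.predict(X[:, mask]))
print(export_text(dt))                                # human-readable if-then rules
```

Each candidate encodes which features the SVM sees together with its RBF parameters; the fitness is cross-validated accuracy. The final decision tree is trained on the SVM's predicted labels rather than the true labels, which is the step that turns the black-box SVM into readable rules.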




Author information


Correspondence to R. J. Kuo.


About this article


Cite this article

Kuo, R.J., Huang, S.B.L., Zulvia, F.E. et al. Artificial bee colony-based support vector machines with feature selection and parameter optimization for rule extraction. Knowl Inf Syst 55, 253–274 (2018). https://doi.org/10.1007/s10115-017-1083-8

