Abstract
In this paper, we propose novel methods for finding the most relevant feature subset using fuzzy rough set-based attribute subset selection, with the search driven by biologically inspired algorithms such as ant colony optimization and particle swarm optimization and by the principles of an evolutionary process. We then propose a hybrid fuzzy rough K-nearest neighbor (K-NN) classifier (FRNN) to classify the patterns in the reduced datasets obtained from the fuzzy rough bio-inspired search. To explore other possible hybrid evolutionary processes, we also conduct experiments using (i) the same feature selection algorithm with support vector machine (SVM) and random forest (RF) classifiers; and (ii) instance-based selection using the synthetic minority over-sampling technique (SMOTE) with fuzzy rough K-NN, SVM and RF classifiers. The proposed hybrid is validated on real-life datasets obtained from the University of California, Irvine (UCI) machine learning repository. Simulation results demonstrate that the proposed hybrid produces good classification accuracy. Finally, parametric and nonparametric statistical tests of significance are carried out to assess the consistency of the classifiers.
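
As an illustration of the kind of subset score that a fuzzy rough search can maximize, the following minimal sketch computes a fuzzy-rough dependency degree for a candidate feature subset. It is an assumption-laden sketch, not the authors' implementation: the abstract does not state which formulation is used, so the range-scaled similarity, the minimum t-norm, the Łukasiewicz implicator and the function name `fuzzy_rough_dependency` are all choices made here for illustration.

```python
from typing import List, Sequence


def fuzzy_rough_dependency(
    X: List[List[float]], y: Sequence[int], subset: Sequence[int]
) -> float:
    """Fuzzy-rough dependency of the class labels y on the features in subset.

    Assumed formulation (not taken from the paper): similarity per feature is
    1 - |difference| / range, features are combined with min, and the
    Lukasiewicz implicator is used for the lower approximation.
    """
    n = len(X)
    if n == 0 or not subset:
        return 0.0

    # Range of each selected feature, used to scale differences into [0, 1].
    ranges = {a: max(r[a] for r in X) - min(r[a] for r in X) for a in subset}

    def similarity(i: int, j: int) -> float:
        sims = []
        for a in subset:
            if ranges[a] == 0:
                # A constant feature cannot tell any two objects apart.
                sims.append(1.0)
            else:
                sims.append(1.0 - abs(X[i][a] - X[j][a]) / ranges[a])
        return min(sims)

    total = 0.0
    for i in range(n):
        # With crisp classes, the positive-region membership of object i is
        # the minimum of (1 - similarity) over objects of a different class.
        other = [1.0 - similarity(i, j) for j in range(n) if y[j] != y[i]]
        total += min(other) if other else 1.0
    return total / n
```

A search procedure such as particle swarm or ant colony optimization would evaluate candidate subsets with a score of this kind and would typically prefer the smaller of two subsets that score equally; that search loop is omitted here.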





Cite this article
Panda, M., Abraham, A. Hybrid evolutionary algorithms for classification data mining. Neural Comput & Applic 26, 507–523 (2015). https://doi.org/10.1007/s00521-014-1673-2
DOI: https://doi.org/10.1007/s00521-014-1673-2