
A novel hybrid algorithm for feature selection

  • Original Article
  • Published in Personal and Ubiquitous Computing

Abstract

Feature selection is an important preprocessing step in data analysis, pattern classification, data mining, and related tasks; it reduces the number of features by removing irrelevant and redundant data. In this paper, we propose a hybrid filter–wrapper feature subset selection algorithm called maximum Spearman minimum covariance cuckoo search (MSMCCS). First, a filter criterion based on Spearman correlation and covariance, called maximum Spearman minimum covariance (MSMC), is proposed. Second, three parameters are introduced in MSMC to adjust the weights of correlation and redundancy, improving the relevance of candidate feature subsets while reducing their redundancy. Third, in the improved cuckoo search algorithm, a weighted combination strategy is used to select candidate feature subsets, a crossover-mutation scheme is used to adjust them, and the filtered features are then refined into an optimal feature subset. MSMCCS therefore combines the efficiency of filters with the higher accuracy of wrappers. Experimental results on eight common data sets from the University of California at Irvine Machine Learning Repository show that MSMCCS achieves better classification accuracy than seven wrapper methods, one filter method, and two hybrid methods. Furthermore, the proposed algorithm performs favorably on the Wilcoxon signed-rank test and the sensitivity–specificity test.
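The abstract describes the MSMC filter stage only at a high level. The sketch below illustrates one plausible reading of a "maximum Spearman, minimum covariance" score in Python; it is not the authors' implementation, and the function name msmc_scores, the simple alpha/beta weighting (in place of the paper's three parameters), and the greedy forward-selection loop are illustrative assumptions.

```python
# Illustrative sketch of an MSMC-style filter score (assumed form, not the paper's exact formula):
# reward Spearman rank correlation with the class label, penalize covariance with already-selected features.
import numpy as np
from scipy.stats import spearmanr

def msmc_scores(X, y, selected, alpha=1.0, beta=1.0):
    """Score each unselected feature of X against label y."""
    n_features = X.shape[1]
    scores = np.full(n_features, -np.inf)
    for j in range(n_features):
        if j in selected:
            continue
        rho, _ = spearmanr(X[:, j], y)          # rank correlation with the class label
        relevance = abs(rho)
        if selected:
            # mean absolute covariance with the features chosen so far
            redundancy = np.mean([abs(np.cov(X[:, j], X[:, s])[0, 1]) for s in selected])
        else:
            redundancy = 0.0
        scores[j] = alpha * relevance - beta * redundancy
    return scores

# Toy usage: greedy forward selection driven by the filter score on synthetic data.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))
y = (X[:, 2] + 0.5 * X[:, 5] > 0).astype(int)
selected = []
for _ in range(3):
    selected.append(int(np.argmax(msmc_scores(X, y, selected))))
print(selected)   # indices of the three highest-scoring features
```

In the full method, a score of this kind would only pre-filter features; the reduced candidate set would then be searched by the improved cuckoo search wrapper described in the abstract.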



Acknowledgments

This research was supported by the National Natural Science Foundation of China (NSFC) under grant no. 61602206.

Author information

Correspondence to Gang Wang.


Cite this article

Zheng, Y., Li, Y., Wang, G. et al. A novel hybrid algorithm for feature selection. Pers Ubiquit Comput 22, 971–985 (2018). https://doi.org/10.1007/s00779-018-1156-z
