A hybrid genetic algorithm for feature subset selection in rough set theory

  • Methodologies and Application
Soft Computing

Abstract

Rough set theory has proven to be an effective tool for feature subset selection. Existing studies usually employ hill-climbing as the search strategy for selecting a feature subset. However, hill-climbing is inadequate for finding the optimal feature subset, since no heuristic can guarantee optimality. For this reason, many researchers have turned to stochastic methods. Because previous combinations of genetic algorithms (GAs) and rough set theory have not shown competitive performance compared with some other stochastic methods, this paper proposes a hybrid genetic algorithm for feature subset selection, called HGARSTAR. Unlike previous work, HGARSTAR embeds a novel local search operation based on rough set theory to fine-tune the search, which aims to enhance the GA’s intensification ability. Moreover, all candidates (i.e., feature subsets) generated during the evolutionary process are forced to include the core features, which accelerates convergence. To verify the proposed algorithm, experiments are performed on several standard UCI datasets. The experimental results demonstrate the efficiency of the algorithm.
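The rough set notions the abstract relies on can be illustrated with a minimal sketch. It assumes the standard Pawlak definitions: the dependency degree of a feature subset is the fraction of objects in its positive region, and a feature belongs to the core if removing it from the full feature set lowers that dependency. The function names, the 0/1 chromosome encoding, and the toy decision table are hypothetical and are not taken from the paper. The GA operators and the paper's local search are not reproduced here, since the abstract does not specify them.

```python
from collections import defaultdict


def partition(rows, attrs):
    """Group row indices by their values on the given attribute indices."""
    blocks = defaultdict(list)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].append(i)
    return list(blocks.values())


def dependency(rows, labels, attrs):
    """Dependency degree: share of rows lying in the positive region."""
    positive = 0
    for block in partition(rows, attrs):
        # A block counts only if all of its rows carry the same decision.
        if len({labels[i] for i in block}) == 1:
            positive += len(block)
    return positive / len(rows)


def core(rows, labels):
    """Attributes whose removal lowers the dependency of the full set."""
    all_attrs = list(range(len(rows[0])))
    full = dependency(rows, labels, all_attrs)
    return {
        a
        for a in all_attrs
        if dependency(rows, labels, [b for b in all_attrs if b != a]) < full
    }


def enforce_core(mask, core_attrs):
    """Repair a 0/1 chromosome so that every core attribute is selected."""
    return [1 if i in core_attrs else bit for i, bit in enumerate(mask)]


# Hypothetical toy decision table (assumption, not data from the paper):
# the decision is the XOR of attributes 0 and 1; attribute 2 copies attribute 1.
ROWS = [(0, 0, 0), (0, 1, 1), (1, 0, 0), (1, 1, 1)]
LABELS = ["n", "y", "y", "n"]

if __name__ == "__main__":
    core_attrs = core(ROWS, LABELS)
    print("core:", core_attrs)
    print("dependency of {0, 1}:", dependency(ROWS, LABELS, [0, 1]))
    print("repaired candidate:", enforce_core([0, 1, 0], core_attrs))
```

In this toy table only attribute 0 is in the core, because attributes 1 and 2 are interchangeable. A candidate such as `[0, 1, 0]` is therefore repaired to `[1, 1, 0]` before evaluation, which is one simple way to keep core features in every candidate as the abstract describes.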

Author information

Correspondence to Si-Yuan Jing.

Additional information

Communicated by V. Loia.

Appendix: Evolution process on different datasets

See Figs. 2, 3, 4, 5, 6 and 7.

Fig. 5 Evolution process on lymphography

Fig. 6 Evolution process on soybean-small

Fig. 7 Evolution process on lung-cancer

Cite this article

Jing, SY. A hybrid genetic algorithm for feature subset selection in rough set theory. Soft Comput 18, 1373–1382 (2014). https://doi.org/10.1007/s00500-013-1150-3

  • DOI: https://doi.org/10.1007/s00500-013-1150-3
