Abstract
Rough set theory has proven to be an effective tool for feature subset selection. Current research usually employs hill-climbing as the search strategy for selecting a feature subset. However, hill-climbing is inadequate for finding the optimal feature subset, since no heuristic can guarantee optimality. For this reason, many researchers have turned to stochastic methods. Because previous combinations of genetic algorithms and rough set theory have not shown competitive performance compared with some other stochastic methods, we propose in this paper a hybrid genetic algorithm for feature subset selection, called HGARSTAR. Unlike previous work, HGARSTAR embeds a novel local search operation based on rough set theory to fine-tune the search, with the aim of enhancing the GA’s intensification ability. Moreover, all candidates (i.e. feature subsets) generated during the evolutionary process are forced to include the core features in order to accelerate convergence. To verify the proposed algorithm, experiments are performed on several standard UCI datasets. Experimental results demonstrate the efficiency of our algorithm.
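The idea of forcing every candidate to contain the core features can be illustrated with a small sketch. It is not the HGARSTAR implementation: it assumes the standard rough-set dependency degree (the fraction of objects in the positive region) as the quality measure and takes the core to be the features whose removal lowers that degree. The function names `dependency`, `core_features` and `enforce_core`, and the toy decision table `ROWS`, are hypothetical and introduced only for illustration.

```python
from collections import defaultdict

# Toy decision table (hypothetical data): condition features a, b, c; decision d.
ROWS = [
    {"a": 0, "b": 0, "c": 0, "d": 0},
    {"a": 0, "b": 1, "c": 0, "d": 1},
    {"a": 1, "b": 0, "c": 1, "d": 1},
    {"a": 1, "b": 1, "c": 1, "d": 1},
]

def dependency(rows, features, decision):
    """Dependency degree: share of rows whose equivalence class
    (under `features`) maps to a single decision value."""
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[f] for f in sorted(features))
        groups[key].append(row[decision])
    consistent = sum(len(ds) for ds in groups.values() if len(set(ds)) == 1)
    return consistent / len(rows)

def core_features(rows, all_features, decision):
    """Features whose removal reduces the dependency of the full feature set."""
    full = dependency(rows, all_features, decision)
    return {f for f in all_features
            if dependency(rows, all_features - {f}, decision) < full}

def enforce_core(candidate, core):
    """Repair a candidate subset so that it always contains the core."""
    return set(candidate) | core

core = core_features(ROWS, {"a", "b", "c"}, "d")
repaired = enforce_core({"a"}, core)
print(core, repaired, dependency(ROWS, repaired, "d"))
```

In a genetic algorithm, a repair step like `enforce_core` would be applied to each offspring after crossover and mutation, so the search never spends evaluations on subsets that cannot preserve the dependency of the full feature set.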
Additional information
Communicated by V. Loia.
About this article
Cite this article
Jing, SY. A hybrid genetic algorithm for feature subset selection in rough set theory. Soft Comput 18, 1373–1382 (2014). https://doi.org/10.1007/s00500-013-1150-3
DOI: https://doi.org/10.1007/s00500-013-1150-3