ABSTRACT
Modern massive data sets often comprise millions of records and thousands of features, and their efficient processing poses an increasing challenge for traditional methods. Feature selection methods form a family of traditional instruments for data dimensionality reduction. They aim at selecting subsets of data features so that the loss of information contained in the full data set is minimized. Evolutionary feature selection methods have shown a good ability to identify feature subsets in very-high-dimensional data sets. Their efficiency depends, among other factors, on the particular optimization algorithm, the feature subset representation, and the definition of the objective function. In this paper, two evolutionary methods for fixed-length subset selection are employed to find feature subsets on the basis of their entropy, estimated by a fast data compression algorithm. The soundness of the fitness criterion, the ability of the investigated methods to find good feature subsets, and the usefulness of the selected feature subsets for practical data mining are evaluated on two well-known data sets with several widely used classification algorithms.
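To make the fitness criterion concrete, the sketch below is a minimal, hypothetical illustration rather than the paper's actual method: zlib stands in for the fast data compressor (the abstract does not name the specific compressor used), the compression ratio of the selected columns serves as the entropy estimate, and a simple elitist loop with swap mutation stands in for the two fixed-length subset selection methods the paper compares.

```python
# Sketch: compression-based entropy as a fitness criterion for
# fixed-length feature subset selection. All names and parameters
# here are illustrative assumptions, not the paper's implementation.
import random
import zlib

import numpy as np


def compression_entropy(data: np.ndarray) -> float:
    """Estimate the entropy of a feature subset via its compression ratio.

    A smaller compressed size relative to the raw size indicates more
    redundancy (lower entropy); the ratio is a fast, rough proxy in the
    spirit of Kolmogorov-complexity-based estimates.
    """
    raw = data.tobytes()
    return len(zlib.compress(raw)) / len(raw)


def evolve_subset(X: np.ndarray, k: int, pop_size: int = 20,
                  generations: int = 50, seed: int = 0) -> list:
    """Evolve a fixed-length subset of k feature indices.

    The abstract does not state the optimization direction; maximizing
    the estimated entropy here simply prefers less-redundant subsets.
    """
    rng = random.Random(seed)
    n_features = X.shape[1]

    def fitness(subset):
        return compression_entropy(X[:, sorted(subset)])

    # Population of fixed-length index sets.
    pop = [rng.sample(range(n_features), k) for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(pop, key=fitness, reverse=True)
        elite = scored[: pop_size // 2]
        children = []
        for parent in elite:
            child = list(parent)
            # Swap mutation: replace one selected feature with an
            # unselected one, preserving the fixed subset length.
            out_idx = rng.randrange(k)
            candidates = [f for f in range(n_features) if f not in child]
            child[out_idx] = rng.choice(candidates)
            children.append(child)
        pop = elite + children
    return max(pop, key=fitness)


if __name__ == "__main__":
    X = np.random.default_rng(0).normal(size=(500, 40)).astype(np.float32)
    best = evolve_subset(X, k=10)
    print("selected features:", sorted(best))
```

In a full pipeline, the subset returned by such a loop would then be handed to standard classifiers to judge its practical usefulness, as the paper does with several widely used classification algorithms.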