Abstract
Ever more data is being collected thanks to constant improvements in storage hardware and data collection techniques. The volume of incoming data is now so large that data mining techniques cannot keep up. The collected data often contains redundant or irrelevant features and instances that limit classification performance. Feature selection and instance selection help to reduce this problem by eliminating useless data. This paper develops a set of algorithms that use Differential Evolution to perform feature selection, instance selection, and combined feature and instance selection. The data reduction, classification accuracy, and training time achieved are compared with those obtained on the original data and with existing algorithms. Experiments on ten datasets of varying difficulty show that the newly developed algorithms can successfully reduce the size of the data while maintaining or increasing classification performance in most cases. The computational time is also substantially reduced. This work is the first to systematically investigate a series of algorithms for feature and/or instance selection in classification. The findings show that instance selection is a much harder task than feature selection but that, with effective methods, it can significantly reduce the size of the data and provide great benefit.
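The abstract does not state how candidate solutions are encoded or evaluated, so the following is only an illustrative sketch of Differential Evolution applied to feature selection, not the algorithm developed in the paper. It assumes a real-valued vector per individual in which a feature is kept when its value exceeds 0.5, a fitness that mixes leave-one-out 1-nearest-neighbour accuracy with the fraction of features removed, and the common DE/rand/1/bin scheme. All function names, the weighting `alpha`, the threshold, and the toy data are hypothetical choices made for this example.

```python
import random

def loo_1nn_accuracy(X, y, feats):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier on `feats`."""
    if not feats:
        return 0.0  # no features selected: treat as the worst case
    correct = 0
    for i in range(len(X)):
        best_d, best_j = float("inf"), -1
        for j in range(len(X)):
            if i == j:
                continue
            d = sum((X[i][f] - X[j][f]) ** 2 for f in feats)
            if d < best_d:
                best_d, best_j = d, j
        correct += y[best_j] == y[i]
    return correct / len(X)

def decode(vec, threshold=0.5):
    """Assumed encoding: keep feature k when vec[k] exceeds the threshold."""
    return [k for k, v in enumerate(vec) if v > threshold]

def fitness(vec, X, y, alpha=0.9):
    """Assumed fitness: weighted sum of accuracy and feature reduction."""
    feats = decode(vec)
    acc = loo_1nn_accuracy(X, y, feats)
    reduction = 1.0 - len(feats) / len(vec)
    return alpha * acc + (1 - alpha) * reduction

def de_feature_selection(X, y, pop_size=20, gens=30, F=0.5, CR=0.9, seed=0):
    """DE/rand/1/bin over vectors in [0, 1]^d; returns (features, fitness)."""
    rng = random.Random(seed)
    dim = len(X[0])
    pop = [[rng.random() for _ in range(dim)] for _ in range(pop_size)]
    fit = [fitness(p, X, y) for p in pop]
    for _ in range(gens):
        for i in range(pop_size):
            # Pick three distinct individuals other than the target i.
            a, b, c = rng.sample([k for k in range(pop_size) if k != i], 3)
            j_rand = rng.randrange(dim)  # guarantees one mutated position
            trial = []
            for j in range(dim):
                if rng.random() < CR or j == j_rand:
                    v = pop[a][j] + F * (pop[b][j] - pop[c][j])
                    trial.append(min(1.0, max(0.0, v)))  # clip to [0, 1]
                else:
                    trial.append(pop[i][j])
            trial_fit = fitness(trial, X, y)
            if trial_fit >= fit[i]:  # greedy one-to-one replacement
                pop[i], fit[i] = trial, trial_fit
    best = max(range(pop_size), key=fit.__getitem__)
    return decode(pop[best]), fit[best]

def make_toy_data(n=40, seed=1):
    """Hypothetical data: feature 0 is informative, features 1-4 are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        X.append([label * 2.0 + rng.gauss(0, 0.3)]
                 + [rng.gauss(0, 1.0) for _ in range(4)])
        y.append(label)
    return X, y

if __name__ == "__main__":
    X, y = make_toy_data()
    selected, score = de_feature_selection(X, y)
    print("selected features:", selected, "fitness:", round(score, 3))
```

Under the same assumptions, instance selection could reuse this loop by letting each vector position stand for a training instance instead of a feature, and the combined task by concatenating both parts into one vector. Whether the paper uses this representation is not stated in the abstract.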
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Wang, J., Xue, B., Gao, X., Zhang, M. (2016). A Differential Evolution Approach to Feature Selection and Instance Selection. In: Booth, R., Zhang, ML. (eds) PRICAI 2016: Trends in Artificial Intelligence. PRICAI 2016. Lecture Notes in Computer Science, vol 9810. Springer, Cham. https://doi.org/10.1007/978-3-319-42911-3_49
DOI: https://doi.org/10.1007/978-3-319-42911-3_49
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-42910-6
Online ISBN: 978-3-319-42911-3
eBook Packages: Computer Science (R0)