Definition
Feature selection is the study of algorithms for reducing the dimensionality of data to improve machine learning performance. For a dataset with N instances and M dimensions (also called features or attributes), feature selection aims to reduce M to M′, where M′ ≤ M. It is an important and widely used approach to dimensionality reduction. Another effective approach is feature extraction. One of the key distinctions between the two approaches lies in their outcomes. Assuming we have four features F1, F2, F3, F4, if both approaches result in 2 features, the 2 selected features are a subset of the 4 original features (say, F1, F3), but the 2 extracted features are some combination of the 4 original features (e.g., F1′ = ∑ a_i F_i and F2′ = ∑ b_i F_i, where a_i, b_i are some constants). Feature selection is commonly used in applications where original features need to be...
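The distinction between the two outcomes can be illustrated with a minimal sketch. It assumes NumPy and a small made-up data matrix; the helper names select_features and extract_features, the data values, and the constants a_i, b_i are hypothetical and chosen only for illustration, not taken from any particular algorithm.

```python
import numpy as np


def select_features(X, indices):
    """Feature selection: keep a subset of the original columns unchanged."""
    return X[:, indices]


def extract_features(X, weights):
    """Feature extraction: each new feature is a linear combination
    of all original features (one row of weights per new feature)."""
    return X @ weights.T


# Toy data: N = 3 instances, M = 4 features (F1..F4); values are made up.
X = np.array([[1.0, 2.0, 3.0, 4.0],
              [5.0, 6.0, 7.0, 8.0],
              [9.0, 10.0, 11.0, 12.0]])

# Selection keeps F1 and F3 as they are (M' = 2).
selected = select_features(X, [0, 2])

# Extraction builds F1' = sum_i a_i F_i and F2' = sum_i b_i F_i.
# Row 0 holds the hypothetical constants a_i, row 1 holds b_i.
weights = np.array([[0.5, 0.5, 0.0, 0.0],
                    [0.0, 0.0, 0.5, 0.5]])
extracted = extract_features(X, weights)

print(selected)   # columns are original features, values untouched
print(extracted)  # columns are new features mixing the originals
```

Both results have M′ = 2 columns, but only the selected columns can be read directly as original features, which is why selection tends to be preferred when the original feature meanings must be preserved.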
Copyright information
© 2011 Springer Science+Business Media, LLC
About this entry
Cite this entry
Liu, H. (2011). Feature Selection. In: Sammut, C., Webb, G.I. (eds) Encyclopedia of Machine Learning. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-30164-8_306
DOI: https://doi.org/10.1007/978-0-387-30164-8_306
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-30768-8
Online ISBN: 978-0-387-30164-8
eBook Packages: Computer Science; Reference Module Computer Science and Engineering