Definition (or Synopsis)
Feature selection, as a dimensionality reduction technique, aims to choose a small subset of the relevant features from the original ones by removing irrelevant, redundant, or noisy features. Feature selection usually leads to better learning performance, i.e., higher learning accuracy, lower computational cost, and better model interpretability.
Generally speaking, irrelevant features are features that cannot help discriminate samples from different classes (supervised) or clusters (unsupervised). Removing irrelevant features will not affect learning performance. In fact, the removal of irrelevant features may help learn a better model, as irrelevant features may confuse the learning system and waste memory and computation. For example, in Fig. 1a, f1 is a relevant feature because f1 can discriminate between class1 and class2. In Fig. 1b, f2 is a redundant feature because f2 cannot...
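The distinction between relevant, redundant, and irrelevant features can be illustrated with a small synthetic sketch. The data, the feature names, and the choice of the Fisher score as the relevance criterion are all assumptions made for illustration; they are not taken from the entry or its figures. Redundancy is modeled here, again as an assumption, as a feature that is almost a rescaled copy of a relevant one.

```python
import numpy as np

def fisher_score(X, y):
    """Per-feature Fisher score: between-class scatter / within-class scatter."""
    overall_mean = X.mean(axis=0)
    between = np.zeros(X.shape[1])
    within = np.zeros(X.shape[1])
    for label in np.unique(y):
        Xc = X[y == label]
        between += len(Xc) * (Xc.mean(axis=0) - overall_mean) ** 2
        within += len(Xc) * Xc.var(axis=0)
    return between / within

# Hypothetical two-class data set (not the data behind Fig. 1).
rng = np.random.default_rng(0)
n = 500                                  # samples per class
y = np.repeat([0, 1], n)

# f1: relevant, its mean depends on the class label.
f1 = rng.normal(loc=np.where(y == 0, -2.0, 2.0), scale=1.0)
# f2: redundant (assumed meaning), nearly a rescaled copy of f1.
f2 = 3.0 * f1 + rng.normal(scale=0.1, size=2 * n)
# f3: irrelevant, pure noise that is independent of the label.
f3 = rng.normal(size=2 * n)

X = np.column_stack([f1, f2, f3])
scores = fisher_score(X, y)
corr_f1_f2 = np.corrcoef(f1, f2)[0, 1]

print("Fisher scores (f1, f2, f3):", scores)
print("correlation between f1 and f2:", corr_f1_f2)
```

In this sketch the noise feature f3 receives a score near zero, while f1 and f2 both score high. A relevance score computed one feature at a time therefore cannot flag f2 as redundant; the high correlation between f1 and f2 is what suggests that keeping both adds little.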
Copyright information
© 2017 Springer Science+Business Media New York
About this entry
Cite this entry
Wang, S., Tang, J., Liu, H. (2017). Feature Selection. In: Sammut, C., Webb, G.I. (eds) Encyclopedia of Machine Learning and Data Mining. Springer, Boston, MA. https://doi.org/10.1007/978-1-4899-7687-1_101
DOI: https://doi.org/10.1007/978-1-4899-7687-1_101
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4899-7685-7
Online ISBN: 978-1-4899-7687-1
eBook Packages: Computer Science, Reference Module Computer Science and Engineering