Feature Selection in Text Mining

  • Reference work entry
Encyclopedia of Machine Learning

Synonyms

Dimensionality reduction on text via feature selection

Definition

In machine learning, the term feature selection denotes the process of selecting a subset of the features (dimensions) used to represent the data (see Feature Selection and Dimensionality Reduction). Feature selection can be seen as a part of data pre-processing, potentially followed by or coupled with feature construction (see Feature Construction in Text Mining), but it can also be coupled with the learning phase when it is embedded in the learning algorithm. Feature selection assumes that an original feature space has been defined that can be used to represent the data; the goal is to reduce its dimensionality by selecting a subset of the original features. The original feature space of the data is then mapped onto a new feature space. Feature selection in text mining is addressed here separately because textual data has specific properties compared with the data commonly addressed in machine learning.
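
As an illustration of the definition above, the following minimal sketch selects a subset of word features and maps documents onto the reduced feature space. It rests on assumptions that are not part of this entry: a bag-of-words representation, a naive whitespace tokenizer, and document frequency as the scoring measure. The function names (tokenize, select_features, project) are hypothetical and chosen only for this example; other scoring measures could be substituted in the same structure.

```python
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def select_features(documents, k):
    """Return the k words that occur in the most documents.

    Document frequency is an assumed scoring measure for this sketch.
    Ties are broken alphabetically so the result is deterministic.
    """
    doc_freq = Counter()
    for doc in documents:
        # Count each word at most once per document.
        doc_freq.update(set(tokenize(doc)))
    ranked = sorted(doc_freq, key=lambda w: (-doc_freq[w], w))
    return ranked[:k]


def project(document, selected):
    """Represent a document as word counts over the selected features only."""
    counts = Counter(tokenize(document))
    return [counts[w] for w in selected]


docs = [
    "text mining uses machine learning",
    "feature selection reduces text dimensionality",
    "machine learning on text data",
]

# Original feature space: every distinct word. New feature space: k words.
selected = select_features(docs, k=3)
vectors = [project(d, selected) for d in docs]
```

Here the original feature space contains eleven distinct words, and each document ends up represented by a three-dimensional vector over the selected words. Because the scoring uses no class labels, this particular sketch is an unsupervised variant.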

Motivation and Background

Copyright information

© 2011 Springer Science+Business Media, LLC

Cite this entry

Mladenić, D. (2011). Feature Selection in Text Mining. In: Sammut, C., Webb, G.I. (eds) Encyclopedia of Machine Learning. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-30164-8_307
