Synonyms
Definition
Stemming is a process by which word endings or other affixes are removed or modified in order that word forms which differ in non-relevant ways may be merged and treated as equivalent. A computer program which performs such a transformation is referred to as a stemmer or stemming algorithm. The output of a stemming algorithm is known as a stem.
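As a rough illustration of the definition above, the following minimal sketch removes one suffix from a word so that related forms share a stem. The suffix list, the minimum stem length, and the function name `stem` are assumptions made for this example; they are not taken from any published stemming algorithm, which would use far larger rule sets with additional conditions.

```python
# Hypothetical, deliberately tiny suffix list (illustration only).
SUFFIXES = ("ions", "ives", "ion", "ive", "ing", "es", "ed", "s")

# Assumed safeguard: never strip a suffix if fewer than this many
# characters would remain, so short words such as "is" are left alone.
MIN_STEM_LENGTH = 3


def stem(word: str) -> str:
    """Return the word with its longest matching suffix removed."""
    word = word.lower()
    # Try longer suffixes first so that "ions" is preferred over "s".
    for suffix in sorted(SUFFIXES, key=len, reverse=True):
        remaining = len(word) - len(suffix)
        if word.endswith(suffix) and remaining >= MIN_STEM_LENGTH:
            return word[:remaining]
    return word


if __name__ == "__main__":
    for w in ("explosion", "explosives", "connected", "connections", "is"):
        print(w, "->", stem(w))
```

With these assumed rules, "explosion" and "explosives" both reduce to the stem "explos", so the two forms can be treated as equivalent even though the stem is not itself a dictionary word.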
Historical Background
The need for stemming first arose in the field of information retrieval (IR), where queries containing search terms need to be matched against document surrogates containing index terms. With the development of computer-based systems for IR, the problem immediately arose that a small difference in form between a search term and an index term could result in a failure to retrieve some relevant documents. Thus, if a query used the term “explosion” and a document was indexed by the term “explosives,” there would be no match on this term (whether or...
Copyright information
© 2009 Springer Science+Business Media, LLC
About this entry
Cite this entry
Paice, C.D. (2009). Stemming. In: LIU, L., ÖZSU, M.T. (eds) Encyclopedia of Database Systems. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-39940-9_942
DOI: https://doi.org/10.1007/978-0-387-39940-9_942
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-35544-3
Online ISBN: 978-0-387-39940-9
eBook Packages: Computer Science, Reference Module Computer Science and Engineering