Word sense disambiguation in queries

Published: 31 October 2005
DOI: 10.1145/1099554.1099696

ABSTRACT

This paper presents a new approach to determining the senses of words in queries by using WordNet. In our approach, noun phrases in a query are determined first. For each word in the query, information associated with it, including its synonyms, hyponyms, hypernyms, definitions of its synonyms and hyponyms, and its domains, can be used for word sense disambiguation. By comparing these pieces of information for the words that form a phrase, it may be possible to assign senses to these words. If this disambiguation fails, then the other query words, if any exist, are used by going through exactly the same process. If the sense of a query word still cannot be determined in this manner, then a guess of the sense is made, provided the guess has at least a 50% chance of being correct. If no sense of the word has a 50% or higher chance of being used, then we apply a Web search to assist in the disambiguation process. Experimental results show that our approach has 100% applicability and 90% accuracy on the 250 queries of the most recent robust track of the TREC collection. We integrate this disambiguation algorithm into our retrieval system to examine the effect of word sense disambiguation on text retrieval. Experimental results show that the disambiguation algorithm, together with the other components of our retrieval system, yields a result that is 13.7% above that produced by the same system without the disambiguation, and 9.2% above that produced by using Lesk's algorithm. Our retrieval effectiveness is 7% better than the best reported result in the literature.
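The fallback order described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the toy sense inventory, the overlap scoring, the frequency counts, and the names `Sense`, `overlap_choice`, `frequency_guess`, `disambiguate`, and `stub_web_lookup` are all assumptions made for the example. A real system would draw the related terms (synonyms, hyponyms, hypernyms, definition words, domains) from WordNet and would replace the stub with an actual Web search step.

```python
from dataclasses import dataclass


@dataclass
class Sense:
    name: str        # hypothetical sense label, e.g. "bank#finance"
    related: set     # stand-in for synonyms, hyponyms, hypernyms, gloss words, domains
    frequency: int   # hypothetical usage count for this sense


# Toy inventory (assumed data, not taken from WordNet).
INVENTORY = {
    "bank": [
        Sense("bank#finance", {"money", "deposit", "loan", "institution"}, 40),
        Sense("bank#river", {"slope", "river", "water"}, 40),
    ],
    "loan": [Sense("loan#credit", {"money", "lend", "debt"}, 10)],
    "river": [Sense("river#stream", {"water", "stream"}, 10)],
    "plant": [
        Sense("plant#factory", {"factory", "industry"}, 30),
        Sense("plant#flora", {"leaf", "organism"}, 70),
    ],
    "pitch": [
        Sense("pitch#sound", {"tone", "sound"}, 30),
        Sense("pitch#throw", {"throw", "ball"}, 30),
        Sense("pitch#tar", {"tar", "resin"}, 40),
    ],
}


def overlap_choice(word, context_words, inventory):
    """Pick the sense of `word` whose related terms best match the context.

    Returns None when there is no overlap at all or the top two senses tie.
    """
    scores = []
    for sense in inventory.get(word, []):
        score = 0
        for other in context_words:
            if other in sense.related:
                score += 1
            for other_sense in inventory.get(other, []):
                score += len(sense.related & other_sense.related)
        scores.append((sense.name, score))
    scores.sort(key=lambda pair: pair[1], reverse=True)
    if not scores or scores[0][1] == 0:
        return None
    if len(scores) > 1 and scores[0][1] == scores[1][1]:
        return None
    return scores[0][0]


def frequency_guess(word, inventory, threshold=0.5):
    """Guess the most frequent sense only if it covers at least `threshold` of uses."""
    senses = inventory.get(word, [])
    total = sum(s.frequency for s in senses)
    if total == 0:
        return None
    best = max(senses, key=lambda s: s.frequency)
    return best.name if best.frequency / total >= threshold else None


def stub_web_lookup(word, query_words):
    """Placeholder for the Web search step: simply returns the first listed sense."""
    return INVENTORY[word][0].name


def disambiguate(word, phrase_words, query_words, inventory, web_lookup):
    """Try phrase context, then other query words, then a frequency guess, then the Web."""
    phrase_context = [w for w in phrase_words if w != word]
    sense = overlap_choice(word, phrase_context, inventory)
    if sense is not None:
        return sense, "phrase"
    other_context = [w for w in query_words if w != word and w not in phrase_words]
    sense = overlap_choice(word, other_context, inventory)
    if sense is not None:
        return sense, "query"
    sense = frequency_guess(word, inventory)
    if sense is not None:
        return sense, "frequency"
    return web_lookup(word, query_words), "web"


if __name__ == "__main__":
    print(disambiguate("bank", ["bank", "loan"], ["bank", "loan"], INVENTORY, stub_web_lookup))
    print(disambiguate("pitch", ["pitch"], ["pitch"], INVENTORY, stub_web_lookup))
```

Because the last step always returns some sense, every query word receives an assignment, which is one way to read the 100% applicability figure; the second element of each returned pair records which stage made the decision.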

References

  1. Ricardo Baeza-Yates, Berthier Ribeiro-Neto: Modern Information Retrieval, Addison-Wesley, 1999.
  2. Daniel M. Bikel, Scott Miller, Richard L. Schwartz, Ralph M. Weischedel: Nymble: a High-Performance Learning Name-finder. ANLP 1997: 194--201
  3. Brill Tagger: http://www.cs.jhu.edu/~brill/
  4. James P. Callan, Teruko Mitamura: Knowledge-based extraction of named entities. CIKM 2002: 532--537
  5. Nancy Chinchor: "Overview of MUC-7", MUC-7, 1998.
  6. Julio Gonzalo, Felisa Verdejo, Irina Chugur, Juan M. Cigarran: Indexing with WordNet synsets can improve Text Retrieval. CoRR cmp-lg/9808002, 1998.
  7. Daniel Jurafsky, James H. Martin: Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition, Prentice-Hall, 2000.
  8. Sang-Bum Kim, Hee-Cheol Seo, Hae-Chang Rim: Information retrieval using word senses: root sense tagging approach. SIGIR 2004: 258--265
  9. K. Kwok, L. Grunfeld, N. Dinstl, P. Deng: TREC 2003 Robust, HARD, and QA Track Experiments using PIRCS, TREC12, 2003.
  10. K.L. Kwok, L. Grunfeld, H.L. Sun, P. Deng: TREC 2004 Robust Track Experiments Using PIRCS, TREC13, 2004.
  11. Shuang Liu, Fang Liu, Clement Yu, Weiyi Meng: An effective approach to document retrieval via utilizing WordNet and recognizing phrases. SIGIR 2004: 266--272
  12. Shuang Liu, Chaojing Sun, Clement Yu: UIC at TREC 2004: Robust Track. TREC13, 2004.
  13. Michael Lesk: Automatic Sense Disambiguation Using Machine Readable Dictionaries: how to tell a pine cone from an ice cream cone. ACM SIGDOC, 1986.
  14. Christopher D. Manning, Hinrich Schütze: Foundations of Statistical Natural Language Processing, MIT Press, Cambridge, MA, May 1999.
  15. Rada Mihalcea, Paul Tarau, Elizabeth Figa: PageRank on Semantic Networks, with application to Word Sense Disambiguation, COLING 2004, Switzerland, Geneva, 2004.
  16. Rada Mihalcea: Word Sense Disambiguation Using Pattern Learning and Automatic Feature Selection. Journal of Natural Language and Engineering, 2002.
  17. George A. Miller: Special Issue. WordNet: An On-line Lexical Database, International Journal of Lexicography, 1990.
  18. George A. Miller, Claudia Leacock, Randee I. Tengi, R. Bunker: A Semantic Concordance. 3 DARPA Workshop on Human Language Technology, p303--308, 1993.
  19. Siddharth Patwardhan, Satanjeev Banerjee, Ted Pedersen: Using Measures of Semantic Relatedness for Word Sense Disambiguation. CICLing 2003: 241--257
  20. R. Richardson, A. Smeaton: Using WordNet in a knowledge-based approach to information retrieval. BCS-IRSG Colloquium on Information Retrieval, 1995.
  21. Mark Sanderson: Word Sense Disambiguation and Information Retrieval, ACM SIGIR, 1994.
  22. Hinrich Schütze, Jan O. Pedersen: Information retrieval based on word senses. In Proceedings of the 4th Annual Symposium on Document Analysis and Information Retrieval, pages 161--175, Las Vegas, NV, 1995.
  23. Hinrich Schütze: Automatic Word Sense Discrimination. Computational Linguistics 24(1): 97--123 (1998)
  24. C. Sun, S. Liu, F. Liu, C. Yu, W. Meng: Recognition and Classification of Noun Phrases in Queries for Effective Retrieval, Technical Report, UIC, 2005.
  25. Christopher Stokoe, Michael P. Oakes, John Tait: Word sense disambiguation in information retrieval revisited. SIGIR 2003: 159--166
  26. Xiang Tong, ChengXiang Zhai, Natasa Milic-Frayling, David A. Evans: Evaluation of Syntactic Phrase Indexing -- CLARIT NLP Track Report. TREC 1996.
  27. Ellen M. Voorhees: Using WordNet to Disambiguate Word Senses for Text Retrieval. SIGIR 1993: 171--180
  28. Ellen M. Voorhees: Query Expansion Using Lexical-Semantic Relations. SIGIR 1994: 61--69
  29. Ellen M. Voorhees: Overview of the TREC 2004 Robust Retrieval Track, TREC13, 2004.
  30. David Yarowsky: Unsupervised Word Sense Disambiguation Rivaling Supervised Methods. ACL 1995: 189--196
  31. D.L. Yeung, C.L.A. Clarke, G.V. Cormack, T.R. Lynam, E.L. Terra: Task-Specific Query Expansion (MultiText Experiments for TREC 2003), TREC12, 2003.
  32. Clement Yu, Weiyi Meng: Principles of database query processing for advanced applications. San Francisco, Morgan Kaufmann, 1998.

Published in

CIKM '05: Proceedings of the 14th ACM international conference on Information and knowledge management
October 2005
854 pages
ISBN: 1595931406
DOI: 10.1145/1099554

Copyright © 2005 ACM

Publisher

Association for Computing Machinery, New York, NY, United States


Acceptance Rates

CIKM '05 Paper Acceptance Rate: 77 of 425 submissions, 18%
Overall Acceptance Rate: 1,861 of 8,427 submissions, 22%
