ABSTRACT
This paper presents a new approach for determining the senses of words in queries by using WordNet. In our approach, noun phrases in a query are determined first. For each word in the query, the information associated with it, including its synonyms, hyponyms, hypernyms, the definitions of its synonyms and hyponyms, and its domains, can be used for word sense disambiguation. By comparing these pieces of information across the words that form a phrase, it may be possible to assign senses to these words. If this disambiguation fails, the other query words, if any, are used in exactly the same process. If the sense of a query word still cannot be determined in this manner, a guess of the sense is made, provided the guess has at least a 50% chance of being correct. If no sense of the word has a 50% or higher chance of being used, we apply a Web search to assist in the word sense disambiguation process. Experimental results show that our approach has 100% applicability and 90% accuracy on the collection of 250 queries from the most recent TREC robust track. We integrate this disambiguation algorithm into our retrieval system to examine the effect of word sense disambiguation on text retrieval. Experimental results show that the disambiguation algorithm, together with the other components of our retrieval system, yields a result that is 13.7% above that produced by the same system without the disambiguation, and 9.2% above that produced by using Lesk's algorithm. Our retrieval effectiveness is 7% better than the best reported result in the literature.
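The fallback order described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy `SENSES` inventory stands in for WordNet, set overlap stands in for the paper's comparison of synonyms, hyponyms, hypernyms, definitions, and domains, and the names `best_by_overlap`, `disambiguate`, and `web_lookup` are hypothetical. Only the ordering of the four steps and the 50% threshold come from the abstract.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Toy sense inventory standing in for WordNet (hypothetical data).
# "info" is a bag of associated words; "freq" is an assumed usage share.
SENSES: Dict[str, List[dict]] = {
    "bank": [
        {"id": "bank#finance", "info": {"money", "deposit", "institution"}, "freq": 0.45},
        {"id": "bank#river", "info": {"river", "slope", "water"}, "freq": 0.40},
        {"id": "bank#tilt", "info": {"aircraft", "turn", "tilt"}, "freq": 0.15},
    ],
    "loan": [
        {"id": "loan#finance", "info": {"money", "debt", "institution"}, "freq": 0.90},
        {"id": "loan#word", "info": {"word", "language", "borrow"}, "freq": 0.10},
    ],
    "crane": [
        {"id": "crane#machine", "info": {"lift", "machine", "construction"}, "freq": 0.60},
        {"id": "crane#bird", "info": {"bird", "wading", "neck"}, "freq": 0.40},
    ],
}


def best_by_overlap(word: str, context: List[str]) -> Optional[str]:
    """Pick the sense of `word` whose info overlaps most with any sense of a
    context word. Returns None when there is no overlap or the best is tied."""
    best_id, best_score, tied = None, 0, False
    for sense in SENSES.get(word, []):
        score = max(
            (len(sense["info"] & other["info"])
             for ctx in context if ctx != word
             for other in SENSES.get(ctx, [])),
            default=0,
        )
        if score > best_score:
            best_id, best_score, tied = sense["id"], score, False
        elif score == best_score and score > 0:
            tied = True
    return None if tied else best_id


def disambiguate(
    word: str,
    phrase_words: List[str],
    other_query_words: List[str],
    web_lookup: Callable[[str], Optional[str]],
) -> Tuple[Optional[str], str]:
    """Return (sense id, name of the step that decided it)."""
    # Step 1: compare against the other words of the same noun phrase.
    sense = best_by_overlap(word, phrase_words)
    if sense is not None:
        return sense, "phrase"
    # Step 2: repeat the same comparison with the remaining query words.
    sense = best_by_overlap(word, other_query_words)
    if sense is not None:
        return sense, "query"
    # Step 3: guess the most frequent sense if its share is at least 50%.
    senses = SENSES.get(word, [])
    if senses:
        top = max(senses, key=lambda s: s["freq"])
        if top["freq"] >= 0.5:
            return top["id"], "frequency"
    # Step 4: fall back to an external (Web search) resolver.
    return web_lookup(word), "web"
```

In this toy inventory, `disambiguate("bank", ["loan"], [], web_lookup)` is resolved at the phrase step, while a lone `"bank"` falls through to `web_lookup` because no sense reaches the assumed 50% share.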
- Ricardo Baeza-Yates, Berthier Ribeiro-Neto: Modern Information Retrieval, Addison-Wesley, 1999.
- Daniel M. Bikel, Scott Miller, Richard L. Schwartz, Ralph M. Weischedel: Nymble: a High-Performance Learning Name-finder. ANLP 1997: 194--201
- Brill Tagger: http://www.cs.jhu.edu/~brill/
- James P. Callan, Teruko Mitamura: Knowledge-based extraction of named entities. CIKM 2002: 532--537
- Nancy Chinchor: "Overview of MUC-7", MUC-7, (1998)
- Julio Gonzalo, Felisa Verdejo, Irina Chugur, Juan M. Cigarran: Indexing with WordNet synsets can improve Text Retrieval CoRR cmp-lg/9808002: (1998)
- Daniel Jurafsky, James H. Martin: Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition, Prentice-Hall, 2000
- Sang-Bum Kim, Hee-Cheol Seo, Hae-Chang Rim: Information retrieval using word senses: root sense tagging approach. SIGIR 2004: 258--265
- K. Kwok, L. Grunfeld, N. Dinstl, P. Deng, TREC 2003 Robust, HARD, and QA Track Experiments using PIRCS, TREC12, 2003.
- K.L. Kwok, L. Grunfeld, H.L. Sun, P. Deng, TREC 2004 Robust Track Experiments Using PIRCS, TREC13, 2004
- Shuang Liu, Fang Liu, Clement Yu, Weiyi Meng: An effective approach to document retrieval via utilizing WordNet and recognizing phrases. SIGIR 2004: 266--272
- Shuang Liu, Chaojing Sun, Clement Yu: UIC at TREC 2004: Robust Track. TREC13, 2004
- Michael Lesk: Automatic Sense Disambiguation Using Machine Readable Dictionaries: how to tell a pine cone from an ice cream cone. ACM SIGDOC, 1986.
- Christopher D. Manning, Hinrich Schütze: Foundations of Statistical Natural Language Processing, MIT Press. Cambridge, MA: May 1999.
- Rada Mihalcea, Paul Tarau, Elizabeth Figa: PageRank on Semantic Networks, with application to Word Sense Disambiguation, COLING 2004, Switzerland, Geneva, 2004
- Rada Mihalcea: Word Sense Disambiguation Using Pattern Learning and Automatic Feature Selection. Journal of Natural Language and Engineering, 2002.
- George A. Miller. Special Issue. WordNet: An On-line Lexical Database, International Journal of Lexicography, 1990.
- George A. Miller, Claudia Leacock, Randee I. Tengi, R. Bunker: A Semantic Concordance. 3 DARPA Workshop on Human Language Technology, p303--308, 1993.
- Siddharth Patwardhan, Satanjeev Banerjee, Ted Pedersen: Using Measures of Semantic Relatedness for Word Sense Disambiguation. CICLing 2003: 241--257
- R. Richardson, A. Smeaton: Using WordNet in a knowledge-based approach to information retrieval. BCS-IRSG Colloquium on Information Retrieval, 1995
- Mark Sanderson: Word Sense Disambiguation and Information Retrieval, ACM SIGIR, 1994
- Hinrich Schütze, Jan O. Pedersen: Information retrieval based on word senses. In Proceedings of the 4th Annual Symposium on Document Analysis and Information Retrieval, pages 161--175, Las Vegas, NV, 1995
- Hinrich Schütze: Automatic Word Sense Discrimination. Computational Linguistics 24(1): 97--123 (1998)
- C. Sun, S. Liu, F. Liu, C. Yu, W. Meng, Recognition and Classification of Noun Phrases in Queries for Effective Retrieval, Technique Report, UIC, 2005
- Christopher Stokoe, Michael P. Oakes, John Tait: Word sense disambiguation in information retrieval revisited. SIGIR 2003: 159--166
- Xiang Tong, ChengXiang Zhai, Natasa Milic-Frayling, David A. Evans: Evaluation of Syntactic Phrase Indexing -- CLARIT NLP Track Report. TREC 1996
- Ellen M. Voorhees: Using WordNet to Disambiguate Word Senses for Text Retrieval. SIGIR 1993: 171--180
- Ellen M. Voorhees: Query Expansion Using Lexical-Semantic Relations. SIGIR 1994: 61--69
- Ellen M. Voorhees: Overview of the TREC 2004 Robust Retrieval Track, TREC13, 2004.
- David Yarowsky: Unsupervised Word Sense Disambiguation Rivaling Supervised Methods. ACL 1995: 189--196
- D.L. Yeung, C.L.A. Clarke, G.V. Cormack, T.R. Lynam, E.L. Terra, Task-Specific Query Expansion (MultiText Experiments for TREC 2003), TREC12, 2003.
- Clement Yu, Weiyi Meng: Principles of database query processing for advanced applications. San Francisco, Morgan Kaufmann, 1998.
Index Terms
- Word sense disambiguation in queries