Article

Word sense disambiguation in queries

Authors:
Shuang Liu, University of Illinois at Chicago, Chicago, IL
Clement Yu, University of Illinois at Chicago, Chicago, IL
Weiyi Meng, Binghamton University, Binghamton, NY

CIKM '05: Proceedings of the 14th ACM international conference on Information and knowledge management, October 2005, Pages 525–532, https://doi.org/10.1145/1099554.1099696

Published: 31 October 2005

ABSTRACT

This paper presents a new approach for determining the senses of words in queries by using WordNet. In our approach, noun phrases in a query are determined first. For each word in the query, the information associated with it, including its synonyms, hyponyms, hypernyms, the definitions of its synonyms and hyponyms, and its domains, can be used for word sense disambiguation. By comparing these pieces of information across the words that form a phrase, it may be possible to assign senses to these words. If this disambiguation fails, then the other query words, if any exist, are used by going through exactly the same process. If the sense of a query word still cannot be determined in this manner, then the sense is guessed, provided that the guess has at least a 50% chance of being correct. If no sense of the word has a 50% or higher chance of being used, then we apply a Web search to assist in the word sense disambiguation process. Experimental results show that our approach has 100% applicability and 90% accuracy on the 250 queries of the most recent TREC robust track collection. We integrate this disambiguation algorithm into our retrieval system to examine the effect of word sense disambiguation on text retrieval. Experimental results show that the disambiguation algorithm, together with the other components of our retrieval system, yields a result that is 13.7% above that produced by the same system without the disambiguation, and 9.2% above that produced by using Lesk's algorithm. Our retrieval effectiveness is 7% better than the best reported result in the literature.
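
The four-stage fallback described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the sense inventory `TOY_INVENTORY`, its frequency values, the helper names `sense_by_overlap` and `disambiguate`, and the `web_lookup` callback are all hypothetical, and a simple term-overlap count stands in for the paper's comparison of WordNet information. Only the order of the stages and the 50% threshold are taken from the abstract.

```python
from typing import Callable, Optional

# Hypothetical toy sense inventory standing in for WordNet. Each sense lists
# related terms (synonyms, hypernyms, hyponyms, definition words, domains)
# and an assumed relative frequency of use.
TOY_INVENTORY = {
    "bank": [
        {"id": "bank#finance", "related": {"money", "deposit", "loan", "institution"}, "freq": 0.45},
        {"id": "bank#river", "related": {"river", "slope", "water", "shore"}, "freq": 0.40},
        {"id": "bank#tilt", "related": {"aircraft", "turn", "tilt"}, "freq": 0.15},
    ],
    "loan": [
        {"id": "loan#credit", "related": {"money", "lend", "debt", "interest"}, "freq": 1.0},
    ],
    "river": [
        {"id": "river#stream", "related": {"water", "stream", "flow"}, "freq": 1.0},
    ],
    "java": [
        {"id": "java#language", "related": {"programming", "software"}, "freq": 0.7},
        {"id": "java#island", "related": {"indonesia", "island"}, "freq": 0.3},
    ],
}


def sense_by_overlap(word: str, context_words: list, inventory: dict) -> Optional[str]:
    """Return the sense of `word` that shares the most terms with the context.

    Returns None when no sense overlaps the context or when the best score is tied.
    """
    # The context consists of the context words plus the related terms of their senses.
    context_terms = set(context_words)
    for other in context_words:
        for sense in inventory.get(other, []):
            context_terms |= sense["related"]
    scores = {s["id"]: len(s["related"] & context_terms) for s in inventory.get(word, [])}
    if not scores:
        return None
    best = max(scores.values())
    winners = [sid for sid, score in scores.items() if score == best]
    return winners[0] if best > 0 and len(winners) == 1 else None


def disambiguate(word: str, phrase_words: list, other_query_words: list,
                 inventory: dict, web_lookup: Callable[[str, list], str]):
    """Cascade: phrase context, other query words, frequency guess, Web fallback."""
    # Stages 1 and 2: the same comparison, first within the phrase, then the rest of the query.
    for stage, context in (("phrase", phrase_words), ("query", other_query_words)):
        sense = sense_by_overlap(word, context, inventory)
        if sense is not None:
            return sense, stage
    # Stage 3: guess only if some sense has at least a 50% chance of being correct.
    senses = inventory.get(word, [])
    if senses:
        top = max(senses, key=lambda s: s["freq"])
        if top["freq"] >= 0.5:
            return top["id"], "frequency"
    # Stage 4: defer to an external (here user-supplied) Web-based procedure.
    return web_lookup(word, phrase_words + other_query_words), "web"


def stub_web(word, query_words):
    """Placeholder for a Web-search step; always returns a fixed sense."""
    return "bank#finance"


print(disambiguate("bank", ["loan"], [], TOY_INVENTORY, stub_web))          # ('bank#finance', 'phrase')
print(disambiguate("bank", ["erosion"], ["river"], TOY_INVENTORY, stub_web))  # ('bank#river', 'query')
print(disambiguate("java", [], [], TOY_INVENTORY, stub_web))                # ('java#language', 'frequency')
print(disambiguate("bank", [], [], TOY_INVENTORY, stub_web))                # ('bank#finance', 'web')
```

Because some stage always returns a sense, every query word receives one, which is how a cascade of this shape would reach the 100% applicability mentioned in the abstract. The accuracy of each stage depends on the real WordNet data and Web procedure, which this sketch does not model.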

References

Ricardo Baeza-Yates, Berthier Ribeiro-Neto: Modern Information Retrieval, Addison-Wesley, 1999.
Daniel M. Bikel, Scott Miller, Richard L. Schwartz, Ralph M. Weischedel: Nymble: a High-Performance Learning Name-finder. ANLP 1997: 194--201
Brill Tagger: http://www.cs.jhu.edu/~brill/
James P. Callan, Teruko Mitamura: Knowledge-based extraction of named entities. CIKM 2002: 532--537
Nancy Chinchor: "Overview of MUC-7", MUC-7, (1998)
Julio Gonzalo, Felisa Verdejo, Irina Chugur, Juan M. Cigarran: Indexing with WordNet synsets can improve Text Retrieval. CoRR cmp-lg/9808002: (1998)
Daniel Jurafsky, James H. Martin: Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition, Prentice-Hall, 2000
Sang-Bum Kim, Hee-Cheol Seo, Hae-Chang Rim: Information retrieval using word senses: root sense tagging approach. SIGIR 2004: 258--265
K. Kwok, L. Grunfeld, N. Dinstl, P. Deng, TREC 2003 Robust, HARD, and QA Track Experiments using PIRCS, TREC12, 2003.
K.L. Kwok, L. Grunfeld, H.L. Sun, P. Deng, TREC 2004 Robust Track Experiments Using PIRCS, TREC13, 2004
Shuang Liu, Fang Liu, Clement Yu, Weiyi Meng: An effective approach to document retrieval via utilizing WordNet and recognizing phrases. SIGIR 2004: 266--272
Shuang Liu, Chaojing Sun, Clement Yu: UIC at TREC 2004: Robust Track. TREC13, 2004
Michael Lesk: Automatic Sense Disambiguation Using Machine Readable Dictionaries: how to tell a pine cone from an ice cream cone. ACM SIGDOC, 1986.
Christopher D. Manning, Hinrich Schütze: Foundations of Statistical Natural Language Processing, MIT Press. Cambridge, MA: May 1999.
Rada Mihalcea, Paul Tarau, Elizabeth Figa: PageRank on Semantic Networks, with application to Word Sense Disambiguation, COLING 2004, Switzerland, Geneva, 2004
Rada Mihalcea: Word Sense Disambiguation Using Pattern Learning and Automatic Feature Selection. Journal of Natural Language and Engineering, 2002.
George A. Miller. Special Issue. WordNet: An On-line Lexical Database, International Journal of Lexicography, 1990.
George A. Miller, Claudia Leacock, Randee I. Tengi, R. Bunker: A Semantic Concordance. 3 DARPA Workshop on Human Language Technology, p303--308, 1993.
Siddharth Patwardhan, Satanjeev Banerjee, Ted Pedersen: Using Measures of Semantic Relatedness for Word Sense Disambiguation. CICLing 2003: 241--257
R. Richardson, A. Smeaton: Using WordNet in a knowledge-based approach to information retrieval. BCS-IRSG Colloquium on Information Retrieval, 1995
Mark Sanderson: Word Sense Disambiguation and Information Retrieval, ACM SIGIR, 1994
Hinrich Schütze, Jan O. Pedersen: Information retrieval based on word senses. In Proceedings of the 4th Annual Symposium on Document Analysis and Information Retrieval, pages 161--175, Las Vegas, NV, 1995
Hinrich Schütze: Automatic Word Sense Discrimination. Computational Linguistics 24(1): 97--123 (1998)
C. Sun, S. Liu, F. Liu, C. Yu, W. Meng, Recognition and Classification of Noun Phrases in Queries for Effective Retrieval, Technical Report, UIC, 2005.
Christopher Stokoe, Michael P. Oakes, John Tait: Word sense disambiguation in information retrieval revisited. SIGIR 2003: 159--166
Xiang Tong, ChengXiang Zhai, Natasa Milic-Frayling, David A. Evans: Evaluation of Syntactic Phrase Indexing -- CLARIT NLP Track Report. TREC 1996
Ellen M. Voorhees: Using WordNet to Disambiguate Word Senses for Text Retrieval. SIGIR 1993: 171--180
Ellen M. Voorhees: Query Expansion Using Lexical-Semantic Relations. SIGIR 1994: 61--69
Ellen M. Voorhees: Overview of the TREC 2004 Robust Retrieval Track, TREC13, 2004.
David Yarowsky: Unsupervised Word Sense Disambiguation Rivaling Supervised Methods. ACL 1995: 189--196
D.L. Yeung, C.L.A. Clarke, G.V. Cormack, T.R. Lynam, E.L. Terra, Task-Specific Query Expansion (MultiText Experiments for TREC 2003), TREC12, 2003.
Clement Yu, Weiyi Meng: Principles of database query processing for advanced applications. San Francisco, Morgan Kaufmann, 1998.

Index Terms

Word sense disambiguation in queries
1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing
      1. Language resources
2. Information systems
  1. Information retrieval

Published in
CIKM '05: Proceedings of the 14th ACM international conference on Information and knowledge management
October 2005
854 pages
ISBN:1595931406
DOI:10.1145/1099554
General Chair:
Otthein Herzog, University of Bremen, Germany
Program Chairs:
Hans-Jörg Schek, University for Health Sciences, Medical Informatics and Technology, Austria
Norbert Fuhr, University of Duisburg-Essen, Germany
Abdur Chowdhury, America Online, USA
Wilfried Teiken, IBM T.J. Watson Research Center, USA
Copyright © 2005 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
WordNet
information retrieval
word sense disambiguation

Acceptance Rates
CIKM '05 Paper Acceptance Rate: 77 of 425 submissions, 18%
Overall Acceptance Rate: 1,861 of 8,427 submissions, 22%