Article

Information retrieval using word senses: root sense tagging approach

Authors:
Sang-Bum Kim, Korea University, Seoul, Korea
Hee-Cheol Seo, Korea University, Seoul, Korea
Hae-Chang Rim, Korea University, Seoul, Korea

SIGIR '04: Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval, July 2004, Pages 258–265
https://doi.org/10.1145/1008992.1009038

Published: 25 July 2004

ABSTRACT

Information retrieval using word senses is emerging as a promising research challenge in semantic information retrieval. In this paper, we propose a new method for using word senses in information retrieval: the root sense tagging method. This method assigns coarse-grained word senses defined in WordNet to query terms and document terms in an unsupervised way, using automatically constructed co-occurrence information. Our sense tagger is crude, but it performs consistent disambiguation by considering only the single most informative word as evidence to disambiguate the target word. We also allow multiple-sense assignment to alleviate the problem caused by incorrect disambiguation. Experimental results on a large-scale TREC collection show that our approach improves retrieval effectiveness, whereas most previous work failed to improve performance even on small text collections. Our method also shows promising results when combined with pseudo relevance feedback and a state-of-the-art retrieval function such as BM25.
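
The abstract gives no formulas, so the following is only a toy sketch of the general idea it describes: pick senses for a target word from the single strongest co-occurring context word, and optionally keep more than one sense. The association table `ASSOC`, the sense labels, the `margin` parameter, and the function name `tag_root_senses` are all hypothetical and invented for illustration; they are not the authors' data, scoring function, or implementation.

```python
from typing import Dict, List, Tuple

# Hypothetical association strengths between a context word and a coarse sense.
# Real values would be estimated from corpus co-occurrence; these are made up.
ASSOC: Dict[Tuple[str, str], float] = {
    ("river", "object"): 2.1,
    ("water", "object"): 1.4,
    ("money", "group"): 2.7,
    ("loan", "group"): 2.5,
}

# Hypothetical coarse-grained candidate senses per ambiguous word.
ROOT_SENSES: Dict[str, List[str]] = {"bank": ["object", "group"]}


def tag_root_senses(target: str, context: List[str], margin: float = 0.0) -> List[str]:
    """Return the coarse senses supported by the strongest single context word.

    For each candidate sense, only its best-scoring context word counts as
    evidence (scores are never summed over the context). Senses whose best
    score lies within `margin` of the overall best are also returned, which
    imitates multiple-sense assignment.
    """
    best_per_sense: Dict[str, float] = {}
    for sense in ROOT_SENSES.get(target, []):
        scores = [ASSOC.get((word, sense), 0.0) for word in context]
        best_per_sense[sense] = max(scores, default=0.0)

    if not best_per_sense:
        return []  # target word has no known candidate senses
    top = max(best_per_sense.values())
    if top <= 0.0:
        return []  # no context word gives any evidence
    return sorted(s for s, v in best_per_sense.items() if top - v <= margin)


if __name__ == "__main__":
    print(tag_root_senses("bank", ["river", "water"]))
    print(tag_root_senses("bank", ["money", "river"]))
    print(tag_root_senses("bank", ["money", "river"], margin=1.0))
```

With `margin=0.0` the tagger commits to one sense; a larger margin trades precision for robustness against a wrong single-word decision, which is the motivation the abstract gives for multiple-sense assignment.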

Index Terms

1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing
2. Information systems
  1. Information retrieval

Published in
SIGIR '04: Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval
July 2004, 624 pages
ISBN: 1581138814
DOI: 10.1145/1008992
General Chair: Mark Sanderson, University of Sheffield (UK)
Program Chairs: Kalervo Järvelin, University of Tampere (Finland); James Allan, University of Massachusetts (USA); Peter Bruza, Distributed Systems Technology Centre (Australia)
Copyright © 2004 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
WordNet
information retrieval
performance evaluation
word sense disambiguation
Acceptance Rates
Overall acceptance rate: 792 of 3,983 submissions (20%)